Whamcloud - gitweb
fs/lustre-release.git
4 years ago LU-11673 tests: replace obsolete '-o' to '||' 70/33670/8
James Nunez [Thu, 15 Nov 2018 23:28:07 +0000 (16:28 -0700)]
LU-11673 tests: replace obsolete '-o' to '||'

Since the -o and -a operators are marked as obsolete in shell
test ([), we need to switch from using [ expr1 -o expr2 ]
to [ expr1 ] || [ expr2 ].

Make this change for sanity tests.
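
As an illustration only (not part of the patch), the equivalence of the
two forms can be checked by running both under a POSIX sh. The helper
name sh_status is hypothetical, and the sketch assumes an sh binary is
available on PATH.

```python
import subprocess


def sh_status(script):
    """Run a snippet under sh and return its exit status."""
    return subprocess.run(["sh", "-c", script]).returncode


# Old form: a single test invocation joined with the obsolete -o operator.
old_form = sh_status("[ 1 -eq 2 -o 3 -eq 3 ]")

# New form: two test invocations joined by the shell's own || operator.
new_form = sh_status("[ 1 -eq 2 ] || [ 3 -eq 3 ]")

# When both expressions are false, the new form still reports failure.
both_false = sh_status("[ 1 -eq 2 ] || [ 3 -eq 4 ]")
```

Both forms succeed when either expression is true; the second form avoids
relying on how a given shell parses -o inside a single test.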

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id87580d0280a716a6939a1203ae5b370e762d6ec
Reviewed-on: https://review.whamcloud.com/33670
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11355 lustre: enable fstrim on lustre device 31/33131/12
Wang Shilong [Thu, 11 Apr 2019 00:40:23 +0000 (08:40 +0800)]
LU-11355 lustre: enable fstrim on lustre device

Pass the FITRIM ioctl through the OST/MDT
mountpoint to the underlying filesystem, which
allows fstrim to be run directly on the server mount point.

Change-Id: Ia6f9b43e48245ee7907a47f05c3924b3640bc734
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/33131
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11796 lov: Remove unnecessary assert 82/33882/6
Patrick Farrell [Fri, 29 Mar 2019 19:01:01 +0000 (15:01 -0400)]
LU-11796 lov: Remove unnecessary assert

This is asserting on network data from the server, and
additionally, the LU-9846 (overstriping) work shows this
condition is not a problem if it does somehow occur.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7b53eb63914f6e9d31a0747a40d09df9ffedaa91
Reviewed-on: https://review.whamcloud.com/33882
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
4 years ago LU-11691 lov: Limit layout size to max ea size 71/34171/4
Patrick Farrell [Fri, 29 Mar 2019 19:00:53 +0000 (15:00 -0400)]
LU-11691 lov: Limit layout size to max ea size

The layout code does not currently prevent the creation of
layouts which (once instantiated) will exceed the maximum
xattr size.

This patch modifies the code which calculates the maximum
allowed stripe count for a component to also evaluate the
full size of new layouts and report a count of zero if the
new layout is too large.  The server will then return
 -E2BIG to the client asking for such a layout.
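
A minimal sketch of the check described above, for illustration only.
The function names and the per-component and per-stripe byte sizes are
assumptions chosen for the example, not Lustre's actual on-disk layout
sizes; only the 64 KiB XATTR_SIZE_MAX value is the Linux VFS limit.

```python
import errno

XATTR_SIZE_MAX = 65536  # Linux VFS limit on a single xattr value


def layout_bytes(stripe_counts, header_bytes=32, stripe_bytes=24):
    """Size of a layout with one component per entry (illustrative sizes)."""
    return sum(header_bytes + stripe_bytes * n for n in stripe_counts)


def allowed_stripe_count(existing_counts, requested, max_ea=XATTR_SIZE_MAX):
    """Return the requested count if the full layout fits, otherwise 0."""
    total = layout_bytes(list(existing_counts) + [requested])
    return requested if total <= max_ea else 0


def add_component(existing_counts, requested):
    """Return 0 on success or -E2BIG when the new layout is too large."""
    if allowed_stripe_count(existing_counts, requested) == 0:
        return -errno.E2BIG
    return 0
```

With these example sizes a single 2000-stripe component fits, while a
second 2000-stripe component pushes the layout past the limit and is
rejected.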

Unfortunately, it's not practical to test this without
overstriping.  LU-9846 adds tests for this.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2ead7702780b2600cf09485e06393ee9bcfb4a1e
Reviewed-on: https://review.whamcloud.com/34171
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11868 osd: Set max ea size to XATTR_SIZE_MAX 58/34058/16
Patrick Farrell [Fri, 29 Mar 2019 19:00:15 +0000 (15:00 -0400)]
LU-11868 osd: Set max ea size to XATTR_SIZE_MAX

Lustre currently limits EA size to either ~1 MiB (ldiskfs)
or 32K (ZFS).  VFS has its own limit, XATTR_SIZE_MAX,
which we must respect to interoperate correctly with
userspace tools like tar, getfattr, and the getxattr()
syscall.

Set this as the new max EA size for both ldiskfs and ZFS.

(The current 32K on ZFS is too small for
LOV_MAX_STRIPE_COUNT [2000] files, so needs to be raised
regardless.)

In order to use this correctly, we have to use the real ea
size on the client.  The previous code for maximum ea size
on the client (KEY_MAX_EASIZE, llite.max_easize) used a
calculated value based on number of targets.

With one exception, the mdc code already uses the default
ea size rather than the max.  Default ea size adjusts
automatically to the largest size sent by the server.

The exception is the open code, which uses the max so it
never has to resend a layout request.  This patch changes
it to use default, which means that the first time a very
widely striped file is opened, the open will be resent.
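
The resend behaviour can be sketched as follows. This is a simplified
model, not the mdc code: the class and method names are hypothetical, and
the real client tracks the default ea size per import rather than per
object.

```python
class OpenClient:
    """Toy model of a client whose reply buffer uses the default ea size."""

    def __init__(self, default_easize):
        self.default_easize = default_easize

    def open(self, layout_bytes):
        """Return how many times the open request had to be sent."""
        sends = 1
        if layout_bytes > self.default_easize:
            # The reply did not fit: remember the larger size, resend once.
            self.default_easize = layout_bytes
            sends += 1
        return sends


client = OpenClient(default_easize=4096)
first_open = client.open(48032)   # widely striped file, buffer too small
second_open = client.open(48032)  # default has grown, no resend needed
```

Only the first open of a very wide file pays for the resend; later opens
reuse the enlarged default.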

Add limit checks on client & server so the xattr size limit
is honored.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4da62691f30fa276d20959810116cf558cccc515
Reviewed-on: https://review.whamcloud.com/34058
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10777 dom: disable read-on-open with resend 00/34700/4
Mikhail Pershin [Mon, 22 Apr 2019 18:18:01 +0000 (21:18 +0300)]
LU-10777 dom: disable read-on-open with resend

Read-on-open can put more data into the reply buffer than the
client allocated, which causes a buffer re-allocation followed
by a resend. FIO read tests show that such resends perform
worse than a separate READ RPC. For example, an FIO 8k read is
~50% better without the buffer re-allocation and resend.
Since the MDC parameter 'mdc_dom_min_repsize' already controls
the read-on-open inline buffer size, there is no reason to keep
the 'reallocation+resend' option on the MDT. This patch
removes it.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7eb9d64f5551789e93b1f7676f61c0e7a5149f76
Reviewed-on: https://review.whamcloud.com/34700
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11678 quota: make overquota flag for old req 45/34645/4
Hongchao Zhang [Fri, 29 Mar 2019 13:28:06 +0000 (09:28 -0400)]
LU-11678 quota: make overquota flag for old req

For an old request with the over-quota flag set, the flag
should still be marked at the OSC, because the old request could be
processed after the new request at the OST. That way it won't break
quota enforcement at the OST.

Change-Id: Ic34c438fe3f018c3b596b26ad6dc945547c8fada
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34645
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12168 utils: obdfilter fix for SHORT msgs 10/34610/3
Alexander Boyko [Mon, 8 Apr 2019 07:55:24 +0000 (03:55 -0400)]
LU-12168 utils: obdfilter fix for SHORT msgs

Sometimes obdfilter-survey shows SHORT instead of min,max.
This can happen when two signals for a parent process come
during a verbose time. The counters are updated and start_time
is dropped. By default the time period is 1 second.

ost  1 sz 16777216K rsz 2048K obj    4 thr    8
write 3662.99 [4286.00,4528.95] rewrite 3873.87 [4746.85, 4857.48]
read 8088.39      SHORT

The patch fixes this issue and drops counters and time when
statistics are printed or all threads are started.

Obdfilter-survey can still print SHORT after this patch when the
subtest time is too short (1-2 seconds). The detailed log shows this case as

Total: total 8192 threads 4 sec 1.692006 4841.590396/second

Test-Parameters: trivial
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-7110
Change-Id: I9b1521c23e9360216a279ab5c28c39bcaca9974b
Reviewed-on: https://review.whamcloud.com/34610
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12152 lnet: Cleanup lnet_get_rtr_pool_cfg 91/34591/5
Chris Horn [Thu, 4 Apr 2019 02:40:58 +0000 (21:40 -0500)]
LU-12152 lnet: Cleanup lnet_get_rtr_pool_cfg

The cfs_percpt_for_each loop contains an off-by-one error that causes
memory corruption. In addition, the way these loops are nested results
in unnecessary iterations. We only need to iterate through the cpts
until we match the cpt number passed as an argument. At that point we
want to copy the router buffer pools for that cpt.
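
The intended single-loop shape can be sketched like this. It is an
illustration only: the function and parameter names are hypothetical, and
a list of lists stands in for the per-CPT router buffer pools.

```python
def get_rtr_pool_cfg(pools_by_cpt, cpt):
    """Return a copy of the buffer pools for one CPT, or None if not found."""
    for index, pools in enumerate(pools_by_cpt):
        if index == cpt:
            # Stop at the first match and copy only this CPT's pools.
            return list(pools)
    # cpt is out of range: nothing is copied and nothing is overrun.
    return None


pools_by_cpt = [[1, 2, 3], [4, 5, 6]]
```

Because the loop returns as soon as the CPT matches, there is no nested
iteration, and an out-of-range CPT number never indexes past the end.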

Cray-bug-id: LUS-7240
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I8c0dc7bab7ca42dbce04a9e6efa4343da4139239
Reviewed-on: https://review.whamcloud.com/34591
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11845 osd-zfs: Support encrypted ZFS datasets 99/33999/10
Nathaniel Clark [Wed, 9 Jan 2019 20:43:59 +0000 (15:43 -0500)]
LU-11845 osd-zfs: Support encrypted ZFS datasets

Call zfs::dmu_objset_own and zfs::dmu_objset_disown with
decrypt=B_TRUE

This is called the same way as in zfs modules.

Fixes: 0fedb017c1 ("LU-9890 osd-zfs: dmu_objset_own/disown changes")
Test-Parameters: envdefinitions=ZFS_MKFS_OPTS="encryption=on -o keylocation=file:///etc/adjtime -o keyformat=passphrase" testlist=sanity fstype=zfs
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I1d9bc1a579ac26706a9f6cc5a0d52649ce005228
Reviewed-on: https://review.whamcloud.com/33999
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
4 years ago LU-12040 mdc: reset lmm->lmm_stripe_offset in mdc_save_lovea 71/34371/7
Alexey Lyashkov [Mon, 4 Mar 2019 14:46:33 +0000 (17:46 +0300)]
LU-12040 mdc: reset lmm->lmm_stripe_offset in mdc_save_lovea

To prepare for replay, lmm->lmm_stripe_offset (which contains the
layout generation) has to be set to -1 (LOV_OFFSET_DEFAULT) so that
it does not confuse lod_verify_v1v3.

Fixes: f90abfdc96 ("LU-169 lov: add generation number to LOV EA")
Fixes: 89693927f0 ("LU-8998 lod: accomodate to composite layout")
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-7008
Change-Id: I911d3c659b6c11cc8847f0517062dd8e4df89dff
Reviewed-on: https://review.whamcloud.com/34371
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12178 osd: do not rebalance quota under memory pressure 41/34741/2
Alex Zhuravlev [Tue, 23 Apr 2019 14:51:28 +0000 (17:51 +0300)]
LU-12178 osd: do not rebalance quota under memory pressure

this will happen eventually.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ibe4ef9e45deed5ea19169f3affed322351785357
Reviewed-on: https://review.whamcloud.com/34741
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11986 lnet: properly cleanup lnet debugfs files 69/34669/5
James Simmons [Mon, 15 Apr 2019 23:16:27 +0000 (19:16 -0400)]
LU-11986 lnet: properly cleanup lnet debugfs files

The function lnet_router_debugfs_remove() is supposed to clean up
the lnet-specific debugfs files, but that is not happening at all.
Change lnet_remove_debugfs() from doing the final debugfs lnet
and libcfs cleanup to doing specific debugfs file removal. We
can instead have libcfs module unloading finish the entire
libcfs and debugfs tree removal directly. With this change we can
make lnet_router_debugfs_fini() call lnet_remove_debugfs().

Change-Id: I9e314e7efde806073b621166ff2e1b344e550875
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34669
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12037 mdt: add option for cross-MDT rename 10/34410/7
Lai Siyao [Mon, 4 Mar 2019 15:56:16 +0000 (23:56 +0800)]
LU-12037 mdt: add option for cross-MDT rename

Add the option mdt.mdt_remote_rename. If it is not set (it is set by
default), a cross-MDT rename is done as a cp. This is useful for
debugging, or when the user wants the rename to move the inode.

Add sanity test_24z.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia0d122f1716f17078b375f770a193347a6e50708
Reviewed-on: https://review.whamcloud.com/34410
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11931 lnd: bring back concurrent_sends 96/34396/5
Amir Shehata [Thu, 21 Mar 2019 15:53:34 +0000 (11:53 -0400)]
LU-11931 lnd: bring back concurrent_sends

Revert "LU-10291 lnd: remove concurrent_sends tunable"

This reverts commit 8d35d6c9bd85ed3a282aa124b672e50c02322a7d.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Icb63d7383f0d2a3cab82c1565f66670dca1f698d
Reviewed-on: https://review.whamcloud.com/34396
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11986 lnet: Avoid lnet debugfs read/write if ctl_table does not exist 22/34622/2
Sonia Sharma [Mon, 1 Apr 2019 12:40:27 +0000 (05:40 -0700)]
LU-11986 lnet: Avoid lnet debugfs read/write if ctl_table does not exist

Running the command "lctl get_param -n stats" after lnet
is taken down leads to a kernel panic because it
tries to read from a file which doesn't exist
anymore.

In lnet_debugfs_read() and lnet_debugfs_write(),
check if struct ctl_table is valid before trying
to read/write to it.

Change-Id: I2450d2f89c2e8a7db793680a4df581282ee46a16
Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34622
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11779 tests: add version check for sanity-hsm 89/34589/2
James Nunez [Wed, 3 Apr 2019 19:49:16 +0000 (13:49 -0600)]
LU-11779 tests: add version check for sanity-hsm

sanity-hsm test 255 was added to Lustre tag 2.12.0.
sanity-hsm test 260c was modified with Lustre tag 2.12.0.
Thus, we need to check that the server is 2.12.0
or later before running these tests.

Fixes: e7d5c1681c07 (LU-11653 hsm: copytool registration wakes the coordinator)
Fixes: b84bc6d895a0 (LU-11572 tests: make sanity-hsm test_260c reliable)
Test-Parameters: trivial serverjob=lustre-b2_10 serverbuildno=168 testlist=sanity-hsm
Test-Parameters: testlist=sanity-hsm
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1a5369ec864432a241a875c3430baa5a064b0dfe
Reviewed-on: https://review.whamcloud.com/34589
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-947 ptlrpc: allow stopping threads above threads_max 00/34400/6
Andreas Dilger [Tue, 12 Mar 2019 08:12:03 +0000 (02:12 -0600)]
LU-947 ptlrpc: allow stopping threads above threads_max

If a service "threads_max" parameter is set below the number of
running threads, stop each highest-numbered running thread until
the running thread count is below threads_max.  Stopping only the
last thread ensures the thread t_id numbers are always contiguous
rather than having gaps.  If the threads are started again they
will again be assigned contiguous t_id values.

Each thread is stopped only after it has finished processing an
incoming request, so running threads may not immediately stop
when the tunable is changed.
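
The stopping order can be sketched as below. This is an illustration
under simplifying assumptions, not the ptlrpc code: the function name is
hypothetical, thread ids are plain integers, and the sketch stops threads
immediately instead of waiting for each to finish its current request.

```python
def stop_excess_threads(running_ids, threads_max):
    """Stop highest-numbered threads until at most threads_max remain.

    Returns (still_running, stopped_in_order).
    """
    running = sorted(running_ids)
    stopped = []
    while len(running) > threads_max:
        # Always remove the last id so the remaining ids stay contiguous.
        stopped.append(running.pop())
    return running, stopped


remaining, stopped = stop_excess_threads([0, 1, 2, 3, 4], 3)
```

Removing only the highest id at each step is what keeps the surviving
t_id values free of gaps.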

Also fix function declarations in file to match proper coding style.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I106f841e62c26b488ae837564c858a44263ebbe5
Reviewed-on: https://review.whamcloud.com/34400
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-930 doc: man page for l_getsepol 84/34184/5
Sebastien Buisson [Tue, 5 Feb 2019 15:06:39 +0000 (16:06 +0100)]
LU-930 doc: man page for l_getsepol

Man page for l_getsepol.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I338492cebca9a088657ff8bd5122274e7e49a5c7
Reviewed-on: https://review.whamcloud.com/34184
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
4 years ago LU-930 doc: man page for lctl nodemap_set_sepol 84/34084/8
Sebastien Buisson [Mon, 21 Jan 2019 16:07:48 +0000 (01:07 +0900)]
LU-930 doc: man page for lctl nodemap_set_sepol

Man page for lctl nodemap_set_sepol.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9e27aaa7d5653fcd6225a424bdbb920471b01555
Reviewed-on: https://review.whamcloud.com/34084
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years ago LU-11213 uapi: reserve connect flag for plain layout 56/34656/2
Lai Siyao [Fri, 22 Mar 2019 18:45:34 +0000 (02:45 +0800)]
LU-11213 uapi: reserve connect flag for plain layout

Reserve the OBD_CONNECT2_PLAIN_LAYOUT flag, so that a client supporting
plain layout won't enable it if the MDT doesn't support it and,
conversely, an MDT supporting plain layout won't send such a layout
to a client that doesn't support it.
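
A minimal sketch of how such a flag gates a feature in both directions,
for illustration only. The bit position and helper names are assumptions,
and the negotiation is modelled as a plain bitwise AND of what each side
advertises.

```python
OBD_CONNECT2_PLAIN_LAYOUT = 1 << 0  # hypothetical bit position


def negotiate(client_flags2, server_flags2):
    """Features usable on a connection are those both sides advertise."""
    return client_flags2 & server_flags2


def plain_layout_enabled(client_flags2, server_flags2):
    """True only when client and MDT both set the plain layout flag."""
    agreed = negotiate(client_flags2, server_flags2)
    return bool(agreed & OBD_CONNECT2_PLAIN_LAYOUT)
```

Either side leaving the bit clear is enough to keep the feature off for
that connection.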

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia629e17d83b5b48c94518de428e5abd79e5a37f0
Reviewed-on: https://review.whamcloud.com/34656
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10092 pcc: Reserve a new connection flag for PCC 56/34356/3
Qian Yingjin [Fri, 1 Mar 2019 07:16:09 +0000 (15:16 +0800)]
LU-10092 pcc: Reserve a new connection flag for PCC

Reserve OBD_CONNECT2_PCC connection flag that will be set
(in ocd_connect_flags2) if a Lustre server or a client supports
Persistent Client Cache (PCC).

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibe20c668a649be69475dc326ce56dc8708772d32
Reviewed-on: https://review.whamcloud.com/34356
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
4 years ago LU-12021 lsom: Add an OBD_CONNECT2_LSOM connect flag 43/34343/4
Qian Yingjin [Thu, 28 Feb 2019 08:05:29 +0000 (16:05 +0800)]
LU-12021 lsom: Add an OBD_CONNECT2_LSOM connect flag

Add an OBD_CONNECT2_LSOM connect flag so that clients do not send
MDS_ATTR_LSIZE and MDS_ATTR_LBLOCKS flags to the old servers that
do not support them.
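
The client-side effect can be sketched as masking the LSOM attribute
bits out of a request when the server did not advertise the flag. All
bit values and the function name below are hypothetical and chosen only
for the example.

```python
OBD_CONNECT2_LSOM = 1 << 0   # hypothetical bit position
MDS_ATTR_MTIME = 1 << 1      # hypothetical value, stands for any older flag
MDS_ATTR_LSIZE = 1 << 2      # hypothetical value
MDS_ATTR_LBLOCKS = 1 << 3    # hypothetical value


def attr_flags_to_send(wanted, server_flags2):
    """Drop LSOM attribute flags when the server lacks OBD_CONNECT2_LSOM."""
    if not server_flags2 & OBD_CONNECT2_LSOM:
        wanted &= ~(MDS_ATTR_LSIZE | MDS_ATTR_LBLOCKS)
    return wanted


wanted = MDS_ATTR_MTIME | MDS_ATTR_LSIZE | MDS_ATTR_LBLOCKS
```

Flags the old server already understands pass through unchanged; only
the two new ones are filtered.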

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I266c74e56c2cb1462e204d6fd4f1399f10621416
Reviewed-on: https://review.whamcloud.com/34343
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
4 years ago LU-12175 tests: Partial revert of LU-11636 05/34705/3
Patrick Farrell [Thu, 18 Apr 2019 16:42:24 +0000 (12:42 -0400)]
LU-12175 tests: Partial revert of LU-11636

Since landing:
LU-11636/https://review.whamcloud.com/33611/
07b271c0972757772a129e9a6370dbb163f16a06

we have seen high failure rates in several tests:
sanity 208 (LU-12175)
sanity 133g (LU-12171)
recovery-small 107 (LU-12176)
recovery-small 134 (LU-11560)

Testing with a full revert of LU-11636 showed that removing
it stops those failures.  Our best guess is that the
randomization of stripe count and mdt index is the cause,
so this patch reverts just those changes.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id6fa3079e29178555827f4e1f39d51cf8d62cf31
Reviewed-on: https://review.whamcloud.com/34705
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11394 utils: Adjust HOSTID constant 78/34278/3
Nathaniel Clark [Tue, 19 Feb 2019 22:22:41 +0000 (17:22 -0500)]
LU-11394 utils: Adjust HOSTID constant

Use constant defined in spl / post-0.8.0 libspl, for HOSTID file.
Also allows get_system_hostid() to be pulled in from ZFS 0.8.x

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Iba70d4f3b7f237260bdc964b28b601deeee81208
Reviewed-on: https://review.whamcloud.com/34278
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12166 test: fix broken detection on ZFS 09/34609/2
Wang Shilong [Sun, 7 Apr 2019 03:44:51 +0000 (11:44 +0800)]
LU-12166 test: fix broken detection on ZFS

We intend to run the command on the MDS; otherwise
project quota will never be tested.

Test-Parameters: trivial fstype=zfs
Fixes: a046e87 ("LU-7991 quota: project quota against ZFS backend")
Change-Id: I8650a0e1065f0bb465da01556472d3d23b22a530
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/34609
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12160 osd-ldiskfs: use-after-free in osd_object_delete() 96/34596/3
Alex Zhuravlev [Thu, 4 Apr 2019 10:03:28 +0000 (13:03 +0300)]
LU-12160 osd-ldiskfs: use-after-free in osd_object_delete()

store a local copy of projid to avoid use-after-free.

Fixes: 39f63cf54c62 ("LU-4017 quota: add setting/getting project id function")

Change-Id: I60e19de3485cae3df1cc2e8aae6eeed4b5de3a11
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34596
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-930 doc: improve formatting of lfs.1 synopsis 94/34594/2
Andreas Dilger [Thu, 4 Apr 2019 07:35:38 +0000 (01:35 -0600)]
LU-930 doc: improve formatting of lfs.1 synopsis

Add proper command formatting for the lfs.1 SYNOPSIS section.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic70d6ccc7127510fd2df17cf6d70b0af8e3ebbe5
Reviewed-on: https://review.whamcloud.com/34594
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12151 osd-ldiskfs: pass owner down rather than transfer it 81/34581/3
Wang Shilong [Wed, 3 Apr 2019 09:13:16 +0000 (17:13 +0800)]
LU-12151 osd-ldiskfs: pass owner down rather than transfer it

Currently, on object creation, uid/gid are initially set to 0,
and then osd_quota_transfer() is called to correct space accounting
for non-root users. The call chain looks like:

|->osd_create
  |->osd_create_type_f
     |->osd_mkreg
        |->ldiskfs_create_inode
           |->ext4_new_inode() ->owner as NULL, create 0 as uid/gid
      |->osd_attr_init
         |->osd_quota_transfer  ->which will change uid/gid again for above.

This is inefficient since osd_quota_transfer() is a heavier
operation. Instead, we can just pass the owner (uid, gid) down;
the project quota is inherited from the parent automatically
when the inode is created.

ext4 in some distros (rhel6, sles11) still does not support passing
@owner down. For those we just add an extra @owner arg to
ldiskfs_create_inode() to keep the build system happy; similar
support could be added for older kernels if it is really needed.

Command:
 $ salloc -N 32 --ntasks-per-node=24 mpirun -np 768 mdtest -n 2000 -F -u -d <mnt>

Without Patch:
Users Speed
root 175741.938 ops/sec
non-root  108631.673 ops/sec

Patched:
Users Speed
root 184775.286 ops/sec
non-root  185218.466 ops/sec

The patch improves non-root speed by ~70% (108631.673 to
185218.466 ops/sec), so root and non-root users now reach the
same speed.

Change-Id: I57b0d2a6913268448c0ed90cfe76bd9f051b0b40
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/34581
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12141 build: correct the required kernel version for lustre kmod 59/34559/3
Gu Zheng [Mon, 1 Apr 2019 09:19:21 +0000 (17:19 +0800)]
LU-12141 build: correct the required kernel version for lustre kmod

Use %kversion rather than %kver when creating preamble for lustre kmods
in lustre spec, to avoid *Requires kernel version* mismatch.

Test-Parameters: trivial

Change-Id: I9929471abd48b214510bcb499e25793ad120e6d1
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/34559
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-930 doc: man pages for lctl pool_new, pool_add 24/34524/6
Andreas Dilger [Wed, 12 Dec 2018 08:49:00 +0000 (01:49 -0700)]
LU-930 doc: man pages for lctl pool_new, pool_add

Add man pages for lctl pool_new and lctl pool_add.

More pages are needed for other commands, pool_remove,
pool_destroy, and pool_list.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie7cbb80d28610b9f74fe8f58c74c37a72e3ebbe5
Reviewed-on: https://review.whamcloud.com/34524
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12027 utils: add units to "lfs find -amctime" 67/34367/5
Andreas Dilger [Wed, 27 Feb 2019 09:25:23 +0000 (02:25 -0700)]
LU-12027 utils: add units to "lfs find -amctime"

The normal find command can only specify time arguments in terms
of whole days. The MacOS find(1) man page reports the use of unit
suffixes to give better control over the time range, such as "1d4h".

Possible time units are as follows:

    s       second
    m       minute (60 seconds)
    h       hour (60 minutes)
    d       day (24 hours)
    w       week (7 days)
    y       year (365 days)

There is no "month" specifier here or in find(1), since the length
of a month is not fixed, and it would conflict with the 'm' minutes
unit.  Units may be combined in one argument, e.g. "-atime -1h30m".

Matches that are "equal" to the specified time must match
within the smallest unit specified.  For example, "-mtime 24h" would
match anything within 1h of 24h ago, similar to how "-size 100M" will
match anything within 1MB of 100MB.
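
A minimal sketch of parsing such an argument, for illustration only (the
actual lfs implementation is in C and may differ). The function name is
hypothetical, a bare number is assumed to mean days as in find(1), and
the leading +/- comparison sign is assumed to be stripped by the caller.

```python
import re

# Seconds per unit, taken from the table above.
UNIT_SECONDS = {
    "s": 1,
    "m": 60,
    "h": 60 * 60,
    "d": 24 * 60 * 60,
    "w": 7 * 24 * 60 * 60,
    "y": 365 * 24 * 60 * 60,
}

_TOKEN = re.compile(r"(\d+)([smhdwy]?)")


def parse_time_arg(text):
    """Return (total_seconds, smallest_unit_seconds) for e.g. "1d4h"."""
    pos = 0
    total = 0
    smallest = None
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad time argument: %r" % text)
        # A number without a suffix is treated as days (assumption).
        unit = UNIT_SECONDS[match.group(2) or "d"]
        total += int(match.group(1)) * unit
        smallest = unit if smallest is None else min(smallest, unit)
        pos = match.end()
    if smallest is None:
        raise ValueError("empty time argument")
    return total, smallest
```

The second return value is the margin an "equal" match would use: for
"24h" it is one hour, for "1h30m" it is one minute.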

Test-Parameters: trivial fstype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib1eb4e626b712bb75f13b075849f959f003ebbe5
Reviewed-on: https://review.whamcloud.com/34367
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11403 llite: ll_fault fixes 47/34247/7
Patrick Farrell [Tue, 12 Mar 2019 18:32:21 +0000 (14:32 -0400)]
LU-11403 llite: ll_fault fixes

Various error conditions in the fault path can cause us to
not return a page in vm_fault.  Check if it's present
before accessing it.

Additionally, it's not valid to return VM_FAULT_NOPAGE for
page faults.  The correct return when accessing a page that
does not exist is VM_FAULT_SIGBUS.  Correcting this avoids
looping infinitely in the testcase.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I53fc16d91462ac5d4555855dfa067d7fd6716c90
Reviewed-on: https://review.whamcloud.com/34247
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-930 doc: add audit_mode in lctl-nodemap-modify man page 88/34088/4
Sebastien Buisson [Tue, 22 Jan 2019 13:55:15 +0000 (22:55 +0900)]
LU-930 doc: add audit_mode in lctl-nodemap-modify man page

audit_mode is a nodemap property added by patch under LU-9727.
It can be set via 'lctl nodemap_modify' command, so add reference
in corresponding man page.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic600756ada257e3e2cfe92a1c30e9b7342f2e4d1
Reviewed-on: https://review.whamcloud.com/34088
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-11394 build: Remove SPL requirements 43/33743/15
Nathaniel Clark [Wed, 28 Nov 2018 22:31:40 +0000 (17:31 -0500)]
LU-11394 build: Remove SPL requirements

Because ZFS and SPL are version locked, and ZFS has explicit
requirements for SPL, remove lustre's SPL requirements.

lbuild: Make building spl optional when version is changed.

Test-Parameters: trivial
Test-Parameters: mdtfilesystemtype=zfs ostfilesystemtype=zfs ostcount=2
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Iedba8d4047ba1fa852a2f99db2cd1b6caff33326
Reviewed-on: https://review.whamcloud.com/33743
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10717 tests: tests should not start mgs 89/33589/27
Alexander Boyko [Tue, 6 Nov 2018 12:57:15 +0000 (07:57 -0500)]
LU-10717 tests: tests should not start mgs

The conf-sanity prolog runs reformat_and_config, which leaves the
mgs service started if it is not combined.
So, in general, a test should not start the mgs service if it doesn't
stop it, and a test should start the mgs after a reformat.

The client mount requires start of all MDTs, because of
MDT0000-osp-MDT000X synchronization.

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=conf-sanity
Test-Parameters: standalonemgs=true testlist=conf-sanity
Test-Parameters: standalonemgs=true mdscount=2 mdtcount=4 testlist=conf-sanity
Cray-bug-id: LUS-2524
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I226eab5683afc36efe908b200f46b710f6235374
Reviewed-on: https://review.whamcloud.com/33589
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11601 ptlrpc: IR doesn't reconnect after EAGAIN 57/33557/9
Sergey Cheremencev [Tue, 3 Jul 2018 12:45:01 +0000 (15:45 +0300)]
LU-11601 ptlrpc: IR doesn't reconnect after EAGAIN

There is a chance that a client connects to an OST
before recovery starts, when the OST is not yet configured.
In such a case the OST returns EAGAIN (target->obd_no_conn == 1).
There is no problem when pinger_recov is enabled
because ptlrpc_pinger_main will reconnect later.
But it doesn't reconnect when pinger_recov is 0.

Move setting imp_connect_error to ptlrpc_connect_interpret,
so that only connection errors are stored there.

Cray-bug-id: LUS-2034
Change-Id: I35ad57e43825162f4056ad346e22a8dddea0e191
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/153542
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/33557
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12165 quota: fix to use correct fsname array size 08/34608/2
Wang Shilong [Sat, 6 Apr 2019 13:23:23 +0000 (21:23 +0800)]
LU-12165 quota: fix to use correct fsname array size

The fsname may be up to LUSTRE_MAXFSNAME characters long, plus '\0',
so the array size should be LUSTRE_MAXFSNAME + 1.

Otherwise, we will easily hit the following crash.

[864870.292204] [<ffffffff9230e84e>] dump_stack+0x19/0x1b
[864870.293186] [<ffffffff92308b50>] panic+0xe8/0x21f
[864870.294104] [<ffffffffc0f3f805>] ? qsd_enabled_seq_write+0x205/0x210 [lquota]
[864870.295418] [<ffffffff91c91b8b>] __stack_chk_fail+0x1b/0x20
[864870.296437] [<ffffffffc0f3f805>] qsd_enabled_seq_write+0x205/0x210 [lquota]
[864870.297760] [<ffffffff91e1e418>] ? __sb_start_write+0x58/0x110
[864870.298894] [<ffffffff91e91050>] proc_reg_write+0x40/0x80
[864870.299883] [<ffffffff91e1b490>] vfs_write+0xc0/0x1f0
[864870.300765] [<ffffffff91e1c2bf>] SyS_write+0x7f/0xf0
[864870.301711] [<ffffffff92320795>] system_call_fastpath+0x1c/0x21
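
The sizing rule above can be sketched in userspace C. This is only an
illustration: copy_fsname() is a hypothetical helper, not the lquota
code, and LUSTRE_MAXFSNAME is defined locally with an assumed value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value for illustration; the real one comes from the Lustre headers. */
#define LUSTRE_MAXFSNAME 8

/*
 * Hypothetical helper: copy a filesystem name into a buffer sized
 * LUSTRE_MAXFSNAME + 1, so the longest legal name still has room for '\0'.
 * Returns 0 on success, -1 if the name is too long.
 */
static int copy_fsname(char dst[LUSTRE_MAXFSNAME + 1], const char *src)
{
	size_t len = strlen(src);

	if (len > LUSTRE_MAXFSNAME)
		return -1;
	memcpy(dst, src, len + 1);	/* len + 1 copies the terminator too */
	return 0;
}
```

With a buffer of only LUSTRE_MAXFSNAME bytes, the terminator of a
maximum-length name would land one byte past the array, which is the
kind of write the stack protector reports.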

Test-parameters: trivial
Change-Id: I33dd331a83ddac0e0c36a82480e7e90ad0ed2c2a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/34608
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
5 years agoLU-6142 ldlm: Fix style issues for ptlrpcd.c 04/34604/3
Arshad Hussain [Sat, 23 Mar 2019 08:21:35 +0000 (13:51 +0530)]
LU-6142 ldlm: Fix style issues for ptlrpcd.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/ptlrpcd.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Id8186c98b272e0863fc48d63abbe33e1d703c408
Reviewed-on: https://review.whamcloud.com/34604
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
5 years agoLU-6142 ptlrpc: Fix style issues for sec_bulk.c 48/34548/4
Arshad Hussain [Fri, 22 Mar 2019 10:49:26 +0000 (16:19 +0530)]
LU-6142 ptlrpc: Fix style issues for sec_bulk.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/sec_bulk.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I90ee1627561098f391776cda1958c9fd73067c51
Reviewed-on: https://review.whamcloud.com/34548
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
5 years agoLU-6142 ldlm: Fix style issues for ldlm_request.c 47/34547/3
Arshad Hussain [Fri, 22 Mar 2019 05:19:19 +0000 (10:49 +0530)]
LU-6142 ldlm: Fix style issues for ldlm_request.c

This patch fixes issues reported by checkpatch
for file lustre/ldlm/ldlm_request.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ifd675677bae2279cc4a541d81a14d8ffbed64268
Reviewed-on: https://review.whamcloud.com/34547
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
5 years agoLU-6142 ldlm: Fix style issues for ldlm_lockd.c 44/34544/4
Arshad Hussain [Thu, 21 Mar 2019 15:25:29 +0000 (20:55 +0530)]
LU-6142 ldlm: Fix style issues for ldlm_lockd.c

This patch fixes issues reported by checkpatch
for file lustre/ldlm/ldlm_lockd.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ife51cc65388966da8b25bb2e40ed30c3144c2e8e
Reviewed-on: https://review.whamcloud.com/34544
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
5 years agoRevert "LU-11771 ldlm: use hrtimer for recovery to fix timeout messages" 29/34629/2
James Nunez [Wed, 10 Apr 2019 14:55:20 +0000 (14:55 +0000)]
Revert "LU-11771 ldlm: use hrtimer for recovery to fix timeout messages"

This reverts commit 1ba794f6ec9e7ce7ad65fd74f170089fffc31d91.

We've seen several new test failures, or an increase in existing ones:
LU-12175 for sanity test 208 failures
LU-11560 for recovery-small test 134 failures
LU-12176 for recovery-small test 107 failures
LU-12177 for sanity test 160a failures

We are trying to pinpoint what patch landed that is causing these failures.

Test-Parameters: trivial testgroup=review-dne-part-1
Test-Parameters: testgroup=review-dne-zfs-part-1
Test-Parameters: testgroup=review-dne-part-1
Test-Parameters: testgroup=review-dne-zfs-part-1
Test-Parameters: testgroup=review-dne-part-1
Test-Parameters: testgroup=review-dne-zfs-part-1
Test-Parameters: testgroup=review-dne-part-1
Test-Parameters: testgroup=review-dne-zfs-part-1

Change-Id: I2ed33fb14726e29cae8745d671d4e25e276a7a66
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34629
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8955 tests: exercise SELinux policy info 99/33699/19
Sebastien Buisson [Wed, 21 Nov 2018 13:10:22 +0000 (22:10 +0900)]
LU-8955 tests: exercise SELinux policy info

Add new tests 21a and 21b to sanity-selinux.sh. Goal is to test
that SELinux policy info is properly sent by the client, and
checked by the server, in the following cases:
- connection
- create
- open
- unlink
- rename
- getxattr
- setxattr
- setattr
- getattr
- symlink
- hardlink

Test-Parameters: trivial testlist=sanity-selinux clientselinux
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibd9c5added027e12d8126641c56f21fdbc791941
Reviewed-on: https://review.whamcloud.com/33699
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12133 tests: sanityn test_35 syntax error 42/34542/2
Elena Gryaznova [Fri, 29 Mar 2019 14:14:33 +0000 (17:14 +0300)]
LU-12133 tests: sanityn test_35 syntax error

Patch fixes test_35() trivial syntax error.

Test-Parameters: trivial envdefinitions=ONLY=35 testlist=sanityn
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5882
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: Id81b9f071920a2111314c869fe2700e6ddf5981a
Reviewed-on: https://review.whamcloud.com/34542
Tested-by: Jenkins
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11585 quota: no IS_ERR() check in qsd_lqe_read 22/34522/2
Alexander Zarochentsev [Wed, 27 Mar 2019 16:39:03 +0000 (19:39 +0300)]
LU-11585 quota: no IS_ERR() check in qsd_lqe_read

qsd_lqe_read() should check lqe_locate() return value with
IS_ERR() instead of != NULL.
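
A minimal userspace sketch of why the "!= NULL" test is not enough.
ERR_PTR()/IS_ERR()/PTR_ERR() are re-implemented here as stand-ins for
the kernel helpers, and locate()/read_entry() are hypothetical
functions, not the lquota code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct entry { int id; };

/* Hypothetical lookup that reports failure as an encoded errno, never NULL. */
static struct entry *locate(struct entry *table, int n, int id)
{
	for (int i = 0; i < n; i++)
		if (table[i].id == id)
			return &table[i];
	return ERR_PTR(-ENOENT);
}

/* Returns 0 when the entry exists, or the negative errno from locate(). */
static int read_entry(struct entry *table, int n, int id)
{
	struct entry *e = locate(table, n, id);

	/* A "!= NULL" test would wrongly accept the error pointer here. */
	if (IS_ERR(e))
		return (int)PTR_ERR(e);
	return 0;
}
```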

Cray-bug-id: LUS-6636
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Change-Id: I930a16a789ece6ca52ca82ce69626d6678472c9a
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/34522
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12072 lov: remove KEY_CACHE_SET to simplify the code 19/34419/4
Yang Sheng [Thu, 7 Mar 2019 19:35:12 +0000 (11:35 -0800)]
LU-12072 lov: remove KEY_CACHE_SET to simplify the code

We must invoke obd_set_info_async with KEY_CACHE_SET after
obd_connect for an OSC device. In fact, it can be folded
into obd_connect to simplify the code.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I8cf235658a3e4af1685a7454ebfe887e5a28eccc
Reviewed-on: https://review.whamcloud.com/34419
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9625 utils: remove old lfs "cp" and "ls" sub-commands 40/34240/2
Andreas Dilger [Wed, 13 Feb 2019 07:54:42 +0000 (00:54 -0700)]
LU-9625 utils: remove old lfs "cp" and "ls" sub-commands

Remove the obsolete "lfs cp" and "lfs ls" sub-commands for handling
"remote" users in a different namespace.  They have been non-working
since commit v2_8_54_0-73-g9d06de3 and were never in use before that.

Instead we have nodemap to handle UID/GID mapping.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8db90d388f8fa621d61fc65ab677b1589b3ebbe5
Reviewed-on: https://review.whamcloud.com/34240
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11566 utils: improve usage/docs for lctl llog commands 04/34004/11
Andreas Dilger [Thu, 10 Jan 2019 01:06:26 +0000 (18:06 -0700)]
LU-11566 utils: improve usage/docs for lctl llog commands

Improve the usage message and man pages for the llog_print,
llog_info, llog_catlist, and llog_cancel sub-commands.  Move
them out of the obsolete section of the lctl usage message.

Reorder some man pages to be in alphabetical order in Makefile.

Add named options to the various commands to make them
consistent with each other.  The --catalog option is not
required for ease of use, but is kept for compatibility
with the previous llog_cancel interface.

The llog_cancel --log_id option is removed from the usage
message and man page, since the ability to cancel individual
records from MDT recovery logs is currently not implemented
(no IOC_LLOG_CANCEL handler in mdt_iocontrol()).  There is
also no IOC_LLOG_PRINT method for the MDS either, so these
commands are mostly useful only for the MGS configuration
logs at this time.

Add a test case for llog_cancel.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I66c95c289f161370896e6764da4fc2f5803ebbe5
Reviewed-on: https://review.whamcloud.com/34004
Tested-by: Jenkins
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11636 tests: fix test_mkdir() to work with old servers 11/33611/11
Elena Gryaznova [Mon, 11 Feb 2019 13:52:35 +0000 (16:52 +0300)]
LU-11636 tests: fix test_mkdir() to work with old servers

Patch fixes the following test_mkdir() defects:
- test_mkdir() always creates striped dir, this breaks interop
  testing with non DNEII servers. For old servers stripe count
  is supposed to be set to 1. The tests which call
  test_mkdir() -c <value>, where <value> is greater than 1 should
  take care about interop with old servers and be skipped for
  such servers.
- test_mkdir() creates the striped dir with -c2 only, this limits
  the testing on MDSCOUNT > 2 config.
Patch adds the possibilities:
- to specify the exact stripe count if DIRSTRIPE_COUNT set;
  default is random and does not depend on the test number now.
- to specify the exact stripe index if DIRSTRIPE_INDEX set;
  default is not changed (random).
- set stripe count to 1 for servers without DNE2 (< 2.8.0).

Patch moves get_lustre_env call (added by LU-11607) from sanity.sh,
conf-sanity.sh and sanity-quota.sh to init_logging(). This allows
$MDS1_VERSION to be used in test_mkdir() instead of having to add
get_lustre_env() to every test script test_mkdir() is used in.

Patch fixes sanity 120a(), 120f() to operate with a predictable
MDT index. We did not see this failure before the test_mkdir()
improvements because these tests were never run with an mdt_index
other than 0: the randomization
  mdt_index=$((test_num % MDSCOUNT))
  had set mdt_index equal to 0 for MDSCOUNT {1..6} for these tests.
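
The arithmetic behind this can be checked with a small C sketch. It
assumes test_num is the numeric part of the test name (120 for 120a
and 120f); the function names are illustrative only.

```c
#include <assert.h>

/* Old selection rule quoted above: mdt_index = test_num % MDSCOUNT. */
static int old_mdt_index(int test_num, int mdscount)
{
	return test_num % mdscount;
}

/* Count the MDSCOUNT values in [1, max] for which the old rule picks 0. */
static int count_zero_index(int test_num, int max_mdscount)
{
	int zeros = 0;

	for (int n = 1; n <= max_mdscount; n++)
		if (old_mdt_index(test_num, n) == 0)
			zeros++;
	return zeros;
}
```

Since 120 is divisible by every integer from 1 to 6, the old rule
always selected index 0 on those configurations; MDSCOUNT=7 would have
been the first to select a non-zero index.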

Patch fixes sanity-scrub.sh:scrub_prep() to create $nfiles on mds$n
and sanity-scrub.sh:test_11() to create $CREATED files on each mds.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-6434, LUS-5500, LUS-6697
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: I3e6362a0b7c1d3987a289e492c8e9ad090f394d4
Reviewed-on: https://review.whamcloud.com/33611
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11771 ldlm: use hrtimer for recovery to fix timeout messages 83/33883/12
James Simmons [Mon, 25 Mar 2019 15:42:17 +0000 (11:42 -0400)]
LU-11771 ldlm: use hrtimer for recovery to fix timeout messages

Currently the functions target_handle_connect/reconnect show
an incorrect time remaining until the end of recovery:

fs1-OST0000: Recovery already passed deadline 71578:57.
If you do not want to wait more, please abort the recovery by force.
...
fs1-OST0000: Denying connection for new client ...
(1 recovered, 11 in progress, and 1 evicted) to recover in 71578:57

This is due to the assumption that the monotonic clock and jiffies
were initialized at the same time, which is not the case. So a
comparison between ktime_get_seconds() and jiffies converted to
seconds is invalid.

We solve this by replacing the recovery timer with an hrtimer based
one. There are many benefits to using an hrtimer over jiffies, such
as better scaling, a better power profile, and better handling on
tickless systems. This also makes the code clearer by using just the
real wall clock in all cases.

Change-Id: I50442605686382f7afb9a1f49eb336c0ee637cdc
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33883
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12068 test: compare position for ZFS dot entry 25/34525/2
Hongchao Zhang [Wed, 27 Mar 2019 17:08:03 +0000 (13:08 -0400)]
LU-12068 test: compare position for ZFS dot entry

In test_6b of sanity-lfsck.sh, the position will be zero for the
special entries "." and "..", so it should not be used to
determine whether the LFSCK process is moving forward. In this
case, the otable position should be used.

Change-Id: I98aee1ae92fa5ea742a8001b58e092111d646477
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34525
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8760 lfsck: fix bit operations lfsck_assistant_data 02/34502/2
Alexander Boyko [Tue, 26 Mar 2019 11:43:14 +0000 (07:43 -0400)]
LU-8760 lfsck: fix bit operations lfsck_assistant_data

A race takes place between
lfsck_master_engine->lfsck_double_scan_generic() and
lfsck_layout->lfsck_assistant_engine().
Both threads were sleeping and waiting for each other.
lad_to_double_scan was set before sleep, but lfsck_assistant_data had
zeros:
  lad_post_result = 1,
  lad_to_post = 0,
  lad_to_double_scan = 0,
  lad_in_double_scan = 0,
  lad_exit = 0,
  lad_incomplete = 0,

Using
 unsigned int a:1,
           b:1;
  f1() {a = 1;}
  f2() {b = 0;}
is racy for multithreaded execution. This type of logic requires
atomic bit operations.
The race is lad_to_double_scan vs lad_to_post.
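
A userspace sketch of the atomic alternative, using C11 <stdatomic.h>
as a stand-in for the kernel's atomic bit operations. The flag names
echo the fields above, but struct flags and the helpers are
hypothetical, not the lfsck code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits; the names echo the bit-fields listed above. */
enum {
	LAD_TO_POST        = 1u << 0,
	LAD_TO_DOUBLE_SCAN = 1u << 1,
};

struct flags {
	atomic_uint bits;
};

/* Each helper is one atomic read-modify-write, so updates to different
 * bits of the same word cannot overwrite each other. */
static void flag_set(struct flags *f, unsigned int bit)
{
	atomic_fetch_or(&f->bits, bit);
}

static void flag_clear(struct flags *f, unsigned int bit)
{
	atomic_fetch_and(&f->bits, ~bit);
}

static int flag_test(struct flags *f, unsigned int bit)
{
	return (atomic_load(&f->bits) & bit) != 0;
}
```

With plain bit-fields the compiler loads the whole word, changes one
bit and stores the word back, so a concurrent store to a neighbouring
bit can be lost.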

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-7076
Change-Id: I4f971ce2acb244f32ae2e108b96995dc2f27e7a3
Reviewed-on: https://review.whamcloud.com/34502
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-6142 ldlm: Fix style issues for ldlm_pool.c 97/34497/3
Arshad Hussain [Thu, 21 Mar 2019 07:17:30 +0000 (12:47 +0530)]
LU-6142 ldlm: Fix style issues for ldlm_pool.c

This patch fixes issues reported by checkpatch
for file lustre/ldlm/ldlm_pool.c

Change-Id: Iee850badeced8ad4edcb4a75bfd2daca0f508c2a
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/34497
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
5 years agoLU-12069 mdt: add missing argument to enable compilation 20/34420/5
Alex Zhuravlev [Thu, 14 Mar 2019 12:59:09 +0000 (15:59 +0300)]
LU-12069 mdt: add missing argument to enable compilation

with gcc8 which is very strict about missing arguments.

Change-Id: I08cd4a876c4cb49e15ada7458a637c98ea4d83c0
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34420
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10073 tests: stop running smoke test 43/34543/4
James Nunez [Fri, 29 Mar 2019 19:01:54 +0000 (13:01 -0600)]
LU-10073 tests: stop running smoke test

lnet-selftest test smoke is failing at a high rate
when tested with ARM clients and when run with Ubuntu
clients. Stop running this test for ARM and Ubuntu
clients until we find a solution.

Test-Parameters: trivial clientarch=aarch64 testlist=lnet-selftest
Test-Parameters: clientdistro=ubuntu1804 testlist=lnet-selftest
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I5c59b3a5dd42c9b6afcf5e0d1ce17e49efc1b44a
Reviewed-on: https://review.whamcloud.com/34543
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10602 llite: add file heat support 99/34399/10
Li Xi [Mon, 5 Feb 2018 03:57:54 +0000 (22:57 -0500)]
LU-10602 llite: add file heat support

File heat is a special attribute of files/objects which reflects
the access frequency of the files/objects.
File heat is mainly designed for cache management. Caches like
PCC can use file heat to determine which files to remove from
the cache or which files to fetch into the cache.
This patch adds file heat support at the llite level.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I168fc657f0c859311e5114191b60047646909be0
Reviewed-on: https://review.whamcloud.com/34399
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9859 libcfs: remove wi_data from cfs_workitem 66/34466/2
NeilBrown [Tue, 19 Mar 2019 18:12:53 +0000 (14:12 -0400)]
LU-9859 libcfs: remove wi_data from cfs_workitem

In every case, the value passed via wi_data can be determined
from the cfs_workitem pointer using container_of().

So use container_of(), and discard wi_data.
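
A minimal userspace sketch of the pattern. container_of() is
re-implemented with offsetof(), struct workitem is a simplified
stand-in for cfs_workitem, and struct my_job is a hypothetical owner.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace version of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for cfs_workitem: no wi_data back-pointer. */
struct workitem {
	int (*wi_action)(struct workitem *wi);
};

/* Hypothetical owner that embeds the work item. */
struct my_job {
	int		result;
	struct workitem	wi;
};

static int my_job_action(struct workitem *wi)
{
	/* Recover the enclosing object from the embedded member. */
	struct my_job *job = container_of(wi, struct my_job, wi);

	job->result = 42;
	return 0;
}
```

Because the work item is embedded in its owner, the owner's address is
a fixed offset from the work item's address, and a separate data
pointer carries no extra information.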

Linux-commit: 19ae89d32503493315dec77919815d3add851389

Change-Id: Iefc4b6ebf40b48bd60a5820de05eb44746a041c0
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/34466
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
5 years agoLU-12081 mdt: rename shouldn't PDO lock if parent is remote 40/34440/2
Lai Siyao [Tue, 5 Mar 2019 04:30:15 +0000 (12:30 +0800)]
LU-12081 mdt: rename shouldn't PDO lock if parent is remote

In rename parent locking, if the target parent is the same as the
source parent but is remote, rename shouldn't take a PDO lock on it,
because PDO locking applies only to local locks.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib6ee1f70a50ddec3182c04c38a10ebbf2c384ccd
Reviewed-on: https://review.whamcloud.com/34440
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12059 build: Update ZFS version to 0.7.13 93/34393/2
Nathaniel Clark [Mon, 11 Mar 2019 17:46:20 +0000 (13:46 -0400)]
LU-12059 build: Update ZFS version to 0.7.13

ZFS
 * test-runner: python3 support #8096
 * Fix flake 8 style warnings #7925 #7952
 * GCC 9.0: Fix ztest "directive argument is not a nul-terminated
   string" #8330
 * Linux 5.0 compat: Fix bio_set_dev() #8287
 * Linux 5.0 compat: Disable vector instructions on 5.0+ kernels
   #8259
 * Linux 5.0 compat: Fix SUBDIRs #8257
 * Linux 5.0 compat: Convert MS_* macros to SB_* #8264
 * Linux 5.0 compat: Use totalram_pages() #8263
 * Linux 5.0 compat: access_ok() drops 'type' parameter #8261
 * deadlock between mm_sem and tx assign in zfs_write() and page
   fault #7939
 * dkms: Enable debuginfo option to be set with zfs sysconfig file
   #8304
 * Bump commit subject length to 72 characters #8250
 * zfs.8 uses wrong snapshot names in Example 15 #8241
 * Add enclosure_symlinks option to vdev_id #8194
 * vdev_id: new slot type ses #6956
 * vdev_id: extension for new scsi topology #6592
 * Rename macro ZFS_MINOR due to Lustre conflict #8195
 * Add kernel module auto-loading #7287
 * Use autoconf variable for C preprocessor #8180
 * OpenZFS 9577 - remove zfs_dbuf_evict_key tsd #7602
 * Honor --with-mounthelperdir where applicable #6962
 * contrib/initramfs: switch to automake #6761

SPL
 * Linux 4.20 compat: Fix current_kernel_time() #8258
 * Linux 5.0 compat: Fix timespec_sub() tonyhutter/spl@a333b28
 * Linux 5.0 compat: Fix SUBDIRs #8257
 * Linux 5.0 compat: Use totalram_pages() #8263
 * deadlock between mm_sem and tx assign in zfs_write() and page
   fault #7939
 * Linux 4.18 compat: Use ktime_get_coarse_real_ts64() #8258
 * Use autoconf variable for C preprocessor #8180

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I73fa5c2a9ddcf19683229d6fb9e61ff0835639ff
Reviewed-on: https://review.whamcloud.com/34393
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-8066 llite: don't use class_setup_tunables() 92/34292/8
James Simmons [Wed, 13 Mar 2019 14:03:28 +0000 (10:03 -0400)]
LU-8066 llite: don't use class_setup_tunables()

llite is very different from the other types of lustre devices.
Since this is the case, llite should register independently. Doing
this allows us to clean up the debugfs registering in the release
function of struct kobj_type. Also fix the improper handling of
linking it to the lustre_kset. Use kset_get() and kset_put() to
properly keep the reference count for the lustre_kset. Lastly,
this provides flexibility for making changes to
class_setup_tunables().

Change-Id: I676159caa7885485fdd57c908123214f68514227
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34292
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-6202 misc: remove LIBCFS_IOC_DEBUG_MASK ioctl 92/33692/5
Andreas Dilger [Tue, 20 Nov 2018 21:27:54 +0000 (14:27 -0700)]
LU-6202 misc: remove LIBCFS_IOC_DEBUG_MASK ioctl

Remove the LIBCFS_IOC_DEBUG_MASK ioctl, since the debug and subsystem
masks have been modifiable via /proc for a long time, and tools have
not used this ioctl since 2.6.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Idfcf85b2d4317bb4baac7ee9af55158f7b3ebbe5
Reviewed-on: https://review.whamcloud.com/33692
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-6202 misc: delete OBD_IOC_PING_TARGET ioctl 91/33691/5
Andreas Dilger [Tue, 20 Nov 2018 21:24:22 +0000 (14:24 -0700)]
LU-6202 misc: delete OBD_IOC_PING_TARGET ioctl

The OBD_IOC_PING_TARGET ioctl was removed from tool usage in
Lustre v2_5_60_0-27-g122aadd and replaced with a /proc interface.
It is no longer needed and can be removed.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7e4881edc50526e28b1a1aa039a4c986593ebbe5
Reviewed-on: https://review.whamcloud.com/33691
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8130 nrs: convert NRS CRR to rhashtable 85/33485/6
James Simmons [Wed, 27 Feb 2019 04:17:47 +0000 (23:17 -0500)]
LU-8130 nrs: convert NRS CRR to rhashtable

Move away from the cfs hash implementation to rhashtable
for NRS CRR handling. Since rhashtable is lockless it
should also increase performance.

Test-Parameters:trivial testlist=sanityn envdefinitions=ONLY=77b

Change-Id: I817d35c8e36d7cb3397ffe8d00eee225245559b8
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33485
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
5 years agoLU-8066 obd: make health_check sysfs compliant 31/25631/7
James Simmons [Fri, 15 Mar 2019 18:10:42 +0000 (14:10 -0400)]
LU-8066 obd: make health_check sysfs compliant

The patch http://review.whamcloud.com/16721 was
ported to the upstream client but was rejected
since it violated the sysfs one item rule. Change
the reporting of LBUG plus unhealthy to just
reporting LBUG. Move the reporting of which device
is unhealthy to a new debugfs file that mirrors
the sysfs file.

Change-Id: Ie1640399e97902272000313bb7ccdcbd2be6daf6
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25631
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8955 ptlrpc: manage SELinux policy info for metadata ops 24/24424/38
Sebastien Buisson [Tue, 16 Aug 2016 08:17:40 +0000 (17:17 +0900)]
LU-8955 ptlrpc: manage SELinux policy info for metadata ops

Add SELinux policy info for the following metadata operations:
- create
- open
- unlink
- rename
- getxattr
- setxattr
- setattr
- getattr
- symlink
- hardlink

On server side, get SELinux policy info from nodemap and compare
it with the one received from client.

Test-Parameters: serverbuildno=62488 serverjob=lustre-reviews testlist=sanity,sanity-selinux clientselinux
Test-Parameters: clientbuildno=4033 clientjob=lustre-reviews-patchless testlist=sanity,sanity-selinux clientselinux
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I16493d7c5713180fb065623b735d7348fc3f9140
Reviewed-on: https://review.whamcloud.com/24424
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8955 ptlrpc: manage SELinux policy info at connect time 22/24422/34
Sebastien Buisson [Tue, 16 Aug 2016 12:53:03 +0000 (21:53 +0900)]
LU-8955 ptlrpc: manage SELinux policy info at connect time

At connect time, compute SELinux policy info on client side, and
send it over the wire.
On server side, get SELinux policy info from nodemap and compare
it with the one received from client.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9b4a206455f2c0b451f6b3ed7e3a85285592758e
Reviewed-on: https://review.whamcloud.com/24422
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-6142 utils: Fix style issues for l_getidentity.c 37/34437/3
Arshad Hussain [Sat, 9 Mar 2019 16:35:49 +0000 (22:05 +0530)]
LU-6142 utils: Fix style issues for l_getidentity.c

This patch fixes issues reported by checkpatch
for file lustre/utils/l_getidentity.c

Change-Id: If49272725e663c1e3ddb75acd0eadc58b79be35a
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/34437
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-6142 utils: Fix style issues for create_iam.c 36/34436/2
Arshad Hussain [Sat, 9 Mar 2019 14:27:54 +0000 (19:57 +0530)]
LU-6142 utils: Fix style issues for create_iam.c

This patch fixes issues reported by checkpatch for
file lustre/utils/create_iam.c

Test-parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ibb221c3e7fa53d6d3b87027e8604426bf416211c
Reviewed-on: https://review.whamcloud.com/34436
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-11798 grant: prevent overflow of o_undirty 48/33948/10
Alex Zhuravlev [Wed, 2 Jan 2019 18:11:29 +0000 (10:11 -0800)]
LU-11798 grant: prevent overflow of o_undirty

tgt_grant_inflate() returns a u64, and if tgd_blockbits and val are
large enough, can return a value >= 2^32.  tgt_grant_incoming()
assigns oa->o_undirty the returned value.  Since o_undirty is u32, it
can overflow.

This occurs with Lustre clients < 2.10 and a ZFS backend when the zfs
"recordsize" > 128k (the default).

In tgt_grant_inflate(), check the returned value and prevent o_undirty
from being assigned a value greater than 2^30.
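
The truncation and the cap can be sketched in plain C. clamp_undirty()
is a hypothetical helper used for illustration, not the actual
tgt_grant_inflate() code; only the 2^30 limit comes from the text
above.

```c
#include <assert.h>
#include <stdint.h>

/* Cap taken from the description above: never report more than 2^30. */
#define UNDIRTY_CAP ((uint64_t)1 << 30)

/* Hypothetical helper: clamp a 64-bit grant before storing it in a
 * 32-bit field such as o_undirty. */
static uint32_t clamp_undirty(uint64_t inflated)
{
	if (inflated > UNDIRTY_CAP)
		inflated = UNDIRTY_CAP;
	return (uint32_t)inflated;
}
```

Without the clamp, a plain cast of 2^32 to a 32-bit value yields 0, so
the client would be told that no undirty space is available at all.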

Change-Id: I75b9065a524238df2d582e640418fdfa2f1e9a72
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/33948
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11742 test: have libtool execute the test binaries 47/33947/6
James Simmons [Fri, 15 Mar 2019 00:38:31 +0000 (20:38 -0400)]
LU-11742 test: have libtool execute the test binaries

With the move to libtool the ability to run all the Lustre
utilities from the source tree was lost. To work around this
the libtool -no-install flag was used to prevent the creation
of the libtool wrappers. While this restored source tree
sandbox development, new package breakage is showing up.
This is due to the rpath being hard coded into the utilities when
-no-install is used, and some platforms disable fixed rpaths.

A very similar problem exists for people who want to use gdb to
debug their project's application. gdb does not work on libtool
wrappers either, so the recommended approach to this type of
problem is to use the libtool execute command. This command
allows the execution of an external non-project binary, like
gdb, with the project's real binary application. Apply this
approach to the Lustre test suite so commands like kill can
be used to shut down Lustre utilities that are not installed into
the testing environment.

Change-Id: I74112f7250f1c43313d868c0edc7c8815d373002
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33947
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12037 mdt: call mdt_dom_discard_data() after rename unlock 01/34401/2
Lai Siyao [Mon, 4 Mar 2019 12:20:14 +0000 (20:20 +0800)]
LU-12037 mdt: call mdt_dom_discard_data() after rename unlock

mdt_reint_rename() should drop all locks including global rename
lock, and then call mdt_dom_discard_data(), otherwise it may
cause deadlock.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I72cad3ee589c98c54f1e1281c39faa8779e562e8
Reviewed-on: https://review.whamcloud.com/34401
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9193 security: return security context for metadata ops 31/26831/20
Bruno Faccini [Wed, 26 Apr 2017 10:35:28 +0000 (12:35 +0200)]
LU-9193 security: return security context for metadata ops

Security layer needs to fetch security context of files/dirs
upon metadata ops like lookup, getattr, open, truncate, and
layout, for its own purpose and control checks.
Retrieving the security context consists of a getxattr operation
at the file system level. The fact that the requested metadata
operation and the getxattr are not atomic can create a window
for a deadlock situation where, based on some access patterns,
all MDT service threads can become stuck waiting for the lookup lock
to be released and thus unable to serve getxattr for security context.
Another problem is that sending an additional getxattr request for
every metadata op hurts performance.

This patch introduces a way to get atomicity by having
the MDT return security context upon granted lock reply,
sparing the client an additional getxattr request.

Test-Parameters: serverbuildno=62488 serverjob=lustre-reviews testlist=sanity,sanity-selinux clientselinux
Test-Parameters: clientbuildno=4033 clientjob=lustre-reviews-patchless testlist=sanity,sanity-selinux clientselinux
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaaf4d93f8d3bf31b5a2c23e7db36b3cb3feb31ba
Reviewed-on: https://review.whamcloud.com/26831
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11596 tests: skip sanity test_64d for ARM 32/34432/2
Andreas Dilger [Fri, 15 Mar 2019 21:21:58 +0000 (15:21 -0600)]
LU-11596 tests: skip sanity test_64d for ARM

Add sanity.sh test_64d to the ALWAYS_EXCEPT list for this bug for
ARM, since it is also intermittently failing.

Test-Parameters: trivial clientarch=aarch64 testlist=sanity
Test-Parameters: testgroup=review-dne-part-1 testlist=sanity
Test-Parameters: testgroup=review-dne-zfs-part-1 testlist=sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifd52aa33d9bbf27303341ff0314322765b3ebbe5
Reviewed-on: https://review.whamcloud.com/34432
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12068 tests: add debug for sanity-lfsck test_6b 17/34417/4
Andreas Dilger [Thu, 14 Mar 2019 07:54:41 +0000 (01:54 -0600)]
LU-12068 tests: add debug for sanity-lfsck test_6b

Dump the lfsck_namespace stats file on error to see if it provides
any more information about why the test is failing.

Test-Parameters: trivial testgroup=review-dne-zfs-part-2 testlist=sanity-lfsck
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If93202101f33f013b9f5ef56022f76c86b3ebbe5
Reviewed-on: https://review.whamcloud.com/34417
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12065 lnd: increase CQ entries 73/34473/3
Amir Shehata [Wed, 20 Mar 2019 18:10:34 +0000 (11:10 -0700)]
LU-12065 lnd: increase CQ entries

Several sites have reported RDMA timeouts. Most of the timeouts
are occurring for transmits on the active_tx queue. Transmits are
placed on the active_tx queue until a completion is received. If
there aren't enough CQ entries available, it's possible for
completion events to be delayed, causing these timeouts.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9edad734b5860ce20af4977b4c1cdc07f25f078e
Reviewed-on: https://review.whamcloud.com/34473
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-10496 tgt: move FMD handling from OFD to target 90/34190/7
Mikhail Pershin [Fri, 1 Feb 2019 12:13:38 +0000 (15:13 +0300)]
LU-10496 tgt: move FMD handling from OFD to target

- move ofd/ofd_fmd.c to target/tgt_fmd.c with corresponding
  changes
- add FMD calls to the MDT for Data-on-MDT files
- per-target tunable parameters init/fini
- update related tests to be correctly used with DOM
- make sanity.sh test_36 work again
- remove target_handle_ping() along with o_ping method in
  obd operations because it is not used anymore. Ping is
  fully handled in tgt_obd_ping()

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I24280a2a9610d05eb9655c73bb067f94ff251980
Reviewed-on: https://review.whamcloud.com/34190
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
5 years agoLU-10929 tests: skip sanity/315 if IO accounting is disabled 72/32072/7
Alex Zhuravlev [Thu, 19 Apr 2018 13:23:03 +0000 (16:23 +0300)]
LU-10929 tests: skip sanity/315 if IO accounting is disabled

just to avoid false fails

Test-Parameters: trivial

Change-Id: I1f7820f03fd3487aa2bd6187362d06934b12ad8e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/32072
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-12044 ptlrpc: Translate HABEO_ macros 75/34375/5
Ben Evans [Tue, 5 Mar 2019 22:28:11 +0000 (17:28 -0500)]
LU-12044 ptlrpc: Translate HABEO_ macros

HABEO_CORPUS -> HAS_BODY
HABEO_CLAVIS -> HAS_KEY
HABEO_REFERO -> HAS_REPLY
MUTABOR -> IS_MUTABLE

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I341cb899aeebd9cccf6ee8111112016c5e6dee53
Reviewed-on: https://review.whamcloud.com/34375
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12030 tests: Properly detect debug kernel use on rhel7.6 42/34342/4
Oleg Drokin [Thu, 28 Feb 2019 04:40:37 +0000 (23:40 -0500)]
LU-12030 tests: Properly detect debug kernel use on rhel7.6

kmalloc-128 slab seems to be gone, so let's use dma-kmalloc-128
instead.

Change-Id: Ice7f350ba2bc6cc733c0a98b0037e6f0980216c9
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34342
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11926 ldlm: Lost lease lock on migrate error 82/34182/4
Andriy Skulysh [Tue, 4 Dec 2018 13:27:58 +0000 (15:27 +0200)]
LU-11926 ldlm: Lost lease lock on migrate error

All the file operations have the following locking order - parent,
child. If a lock for a child is returned to the client, the following
operations on this file are done by the child fid.

However, the migrate is an exception - it takes the lease lock first and
takes the PW parent lock next during the MDS_REINT.

At the same time, if there is a parallel racing operation (open) which
has taken a lock on parent (conflicting with the next MDS_REINT) and
is trying to take a lock on child - it is blocked until
the lease cancel comes.

The lease cancel is piggy-backed on the MDS_REINT RPC and is handled
at the end of the operation, trying to take the conflicting parent lock
first - thus a deadlock occurs.

At the same time, the lease lock is not supposed to block anything; it
is just an indicator on the server that no other conflicting
operation has occurred during the migration. Thus,
set LDLM_FL_CANCEL_ON_BLOCK on it so the conflicting operation
will not be blocked.

In this case, the MDS_REINT will return -EAGAIN as the lease
is cancelled and the client will retry its migration.

Change-Id: Ib6cdc24ffe4ecb99d314a5466bcbb066a1d04dc1
Cray-bug-id: LUS-6811
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/34182
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11902 build: Remove obsolete lustre-utils files and entries 37/34137/2
Thomas Stibor [Wed, 30 Jan 2019 13:47:46 +0000 (14:47 +0100)]
LU-11902 build: Remove obsolete lustre-utils files and entries

DEB packages can be built for server or client, where the naming
convention for server is:
* lustre-server-modules-*
* lustre-server-utils-*
and for client:
* lustre-client-modules-*
* lustre-client-utils-*
Previously the util package was named lustre-utils which
is now obsolete due to the client and server separation
and thus can be removed.

Test-Parameters: clientdistro=ubuntu1604 trivial
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: Id0a945d5b76e5c35cf858e3ab224efa342cde28d
Reviewed-on: https://review.whamcloud.com/34137
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11898 libcfs: do not calculate debug_mb if it is set 28/34128/2
Vladimir Saveliev [Tue, 29 Jan 2019 03:30:28 +0000 (06:30 +0300)]
LU-11898 libcfs: do not calculate debug_mb if it is set

debug_mb is a libcfs module parameter. It should be possible to set it
via

modprobe libcfs libcfs_debug_mb=800

or via adding

options libcfs libcfs_debug_mb=800

to the modules configuration.
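
The rule in the subject line (keep an explicitly set value, compute one
only otherwise) can be sketched as follows. This is an assumption-laden
illustration: effective_debug_mb() is a made-up helper, and treating 0 as
"not set" is an assumption of the sketch, not something the commit
message states.

```c
#include <assert.h>

/*
 * Hypothetical helper. param_mb is what the administrator passed
 * (0 is assumed to mean "not given"); computed_mb is whatever default
 * the module would otherwise calculate.
 */
static unsigned int effective_debug_mb(unsigned int param_mb,
				       unsigned int computed_mb)
{
	return param_mb != 0 ? param_mb : computed_mb;
}

/* Usage: an explicit 800 wins; an unset parameter falls back. */
static void debug_mb_demo(void)
{
	assert(effective_debug_mb(800, 160) == 800);
	assert(effective_debug_mb(0, 160) == 160);
}
```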

Fixes: 7092309f32 ("LU-8066 libcfs: migrate to debugfs")
Test-Parameters: trivial
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-6936
Change-Id: I9da51e44a938a312e43b8a0781b49efc197f7ca9
Reviewed-on: https://review.whamcloud.com/34128
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
5 years agoLU-11885 test: reset quota upon test failed 08/34108/3
Wang Shilong [Fri, 25 Jan 2019 02:05:41 +0000 (10:05 +0800)]
LU-11885 test: reset quota upon test failed

Currently, if a test fails, EXIT will be trapped and
cleanup_quota_test() will be called to clean up dirs.

However, we didn't reset quota in this case, which might
make the following tests fail.

Also clean up duplicated lines, since resetting is now done
inside cleanup_quota_test().

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ic70f59cdc1181473721e4f87b806b8203857fca8
Reviewed-on: https://review.whamcloud.com/34108
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9010 ptlrpc: Change static defines to use macro for sec_gc.c 37/33937/4
Arshad Hussain [Fri, 28 Dec 2018 04:07:39 +0000 (23:07 -0500)]
LU-9010 ptlrpc: Change static defines to use macro for sec_gc.c

This patch replaces all mutexes, locks, and wait queues
which are defined statically in file lustre/ptlrpc/sec_gc.c
with kernel-provided macros.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ifc978e6d7b83a9d41078a98a0bcf5a2606ea7360
Reviewed-on: https://review.whamcloud.com/33937
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11821 lfsck: do not compare negatives to sizeof() 08/33908/7
Alex Zhuravlev [Fri, 21 Dec 2018 12:03:07 +0000 (15:03 +0300)]
LU-11821 lfsck: do not compare negatives to sizeof()

as sizeof() is unsigned.
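
A minimal C sketch of why such a comparison misbehaves. The struct and
function names are made up for illustration, and the cast in the "buggy"
variant only spells out the conversion the compiler applies implicitly
when an int is compared with a size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 16-byte record; only its size matters here. */
struct record {
	char payload[16];
};

/*
 * Buggy check: rc is converted to size_t, so a negative error code
 * becomes a huge positive number and is never "too short".
 */
static int is_short_buggy(int rc)
{
	return (size_t)rc < sizeof(struct record);
}

/* Fixed check: test the sign before comparing against sizeof(). */
static int is_short_fixed(int rc)
{
	return rc < 0 || (size_t)rc < sizeof(struct record);
}

/* Usage: -5 stands for an error return from a read-like call. */
static void sizeof_compare_demo(void)
{
	assert(is_short_buggy(-5) == 0);	/* error slips through */
	assert(is_short_fixed(-5) == 1);	/* error is caught */
	assert(is_short_fixed(8) == 1);		/* genuinely short */
	assert(is_short_fixed(16) == 0);	/* full record */
}
```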

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib8cf8026a4e18ac9704a0294eaf7c57ecf678e03
Reviewed-on: https://review.whamcloud.com/33908
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
5 years agoLU-11752 osc: pass client page size during reconnect too 47/33847/8
Mikhail Pershin [Thu, 13 Dec 2018 10:11:05 +0000 (13:11 +0300)]
LU-11752 osc: pass client page size during reconnect too

Client page size is reported to the server in ocd_grant_blkbits
and the server returns the device blocksize in the same field. During
reconnect, ocd_grant_blkbits still contains the server device blocksize,
which the server wrongly uses as the client page size.

The patch sets ocd_grant_blkbits to the client page size again during
reconnect so the server gets the expected information.
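
The field reuse can be sketched as below. Everything here is a
simplified, hypothetical model: struct connect_data and its single
member stand in for the real connect data and ocd_grant_blkbits, and the
bit values (12 for a 4 KiB page, 17 for a 128 KiB block) are example
numbers only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the connect data shared by both sides. */
struct connect_data {
	uint8_t grant_blkbits;
};

/* Client side: fill in the page size bits before sending a request. */
static void prepare_connect(struct connect_data *ocd, uint8_t page_bits)
{
	ocd->grant_blkbits = page_bits;
}

/* Reply handling: the server's block size bits overwrite the field. */
static void apply_reply(struct connect_data *ocd, uint8_t block_bits)
{
	ocd->grant_blkbits = block_bits;
}

/* Usage: without a second prepare_connect(), a reconnect would send 17. */
static void reconnect_demo(void)
{
	struct connect_data ocd = { 0 };

	prepare_connect(&ocd, 12);
	apply_reply(&ocd, 17);
	assert(ocd.grant_blkbits == 17);	/* stale for a reconnect */

	prepare_connect(&ocd, 12);		/* what the fix restores */
	assert(ocd.grant_blkbits == 12);
}
```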

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I14bba1d025e4e9fb99fd4bae4002463439ac265c
Reviewed-on: https://review.whamcloud.com/33847
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11678 quota: protect quota flags at OSC 47/33747/3
Hongchao Zhang [Tue, 22 Jan 2019 08:39:21 +0000 (16:39 +0800)]
LU-11678 quota: protect quota flags at OSC

There is no protection for the OSC quota hash that tracks the quota
flags of different qids, so a previous request could modify the
quota flags that were set by the current request, because the replies
could be out of order.

This patch also adds a lock to protect the operations on the quota
hash from different requests.

Test-Parameters: testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: Ia6e5141265beacb9401dd533081fa0b85fd5ea6a
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33747
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10258 lfs: lfs mirror copy command 20/33220/6
Bobi Jam [Sat, 22 Sep 2018 16:42:37 +0000 (00:42 +0800)]
LU-10258 lfs: lfs mirror copy command

Add "lfs mirror copy" command to copy a mirror's content to other
mirror(s) of a mirrored file.

Usage:

lfs mirror copy {--read-mirror|-i <id0>}
{--write-mirror|-o <id1>[,<id2>,...]} <mirrored_file>

Options:

--read-mirror|-i <id0>
  This option specifies, by id0, the mirror whose content is to be
  read. The id0 is the unique numerical identifier of a
  mirror.

--write-mirror|-o <id1>[,<id2>,...]
  This option specifies, by mirror ID, the mirror(s) whose content is
  to be written. The mirror IDs are separated with
  commas.  If the mirror id -1 is used here, it means that all mirrors
  other than the read mirror are to be written.

Note:

Beware that the written mirror(s) will be marked as non-stale
mirror(s). After using this command, you could end up with a
file whose non-stale mirrors contain different contents.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id138368cdb29ec14b7c03a5db3b2dd1e0db5ea37
Reviewed-on: https://review.whamcloud.com/33220
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11409 osc: grant shrink shouldn't account skipped OSC 06/33206/3
Alex Zhuravlev [Thu, 20 Sep 2018 14:15:42 +0000 (17:15 +0300)]
LU-11409 osc: grant shrink shouldn't account skipped OSC

otherwise only the first 100 OSCs are subject to grant shrink procedure.
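
A rough sketch of the accounting issue, under loudly stated assumptions:
struct osc_stub, shrink_pass() and the per-pass budget of 3 are all made
up for the example (the message above implies a budget of 100), and the
real loop walks a list of clients rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-pass budget; kept small so the demo is readable. */
#define SHRINK_BUDGET 3

/* Hypothetical stand-in for an OSC that may or may not need shrinking. */
struct osc_stub {
	int needs_shrink;
	int shrunk;
};

/*
 * Walk the OSCs until the budget is used up and return how many were
 * shrunk. With count_skipped set, skipped OSCs also consume budget,
 * which models the behaviour the commit removes.
 */
static int shrink_pass(struct osc_stub *oscs, size_t n, int count_skipped)
{
	int used = 0, shrunk = 0;

	for (size_t i = 0; i < n && used < SHRINK_BUDGET; i++) {
		if (!oscs[i].needs_shrink) {
			if (count_skipped)
				used++;
			continue;
		}
		oscs[i].shrunk = 1;
		used++;
		shrunk++;
	}
	return shrunk;
}

/* Usage: three idle OSCs in front starve the two that need shrinking. */
static void shrink_demo(void)
{
	struct osc_stub a[5] = { {0, 0}, {0, 0}, {0, 0}, {1, 0}, {1, 0} };
	struct osc_stub b[5] = { {0, 0}, {0, 0}, {0, 0}, {1, 0}, {1, 0} };

	assert(shrink_pass(a, 5, 1) == 0);	/* budget spent on skips */
	assert(shrink_pass(b, 5, 0) == 2);	/* skips are free */
}
```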

Change-Id: I65ed247b91422effb8f278d1991d4a5ba1c24814
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33206
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
5 years agoLU-11408 osc: propagate grant shrink interval immediately 04/33204/2
Alex Zhuravlev [Thu, 20 Sep 2018 07:47:03 +0000 (10:47 +0300)]
LU-11408 osc: propagate grant shrink interval immediately

Currently the new interval (updated with lctl) will be used
only when the next shrink happens. With the default interval it
will take at least 20 minutes. Instead we should refresh it
immediately.

Change-Id: Id22824e48fbc50c1f464316ab5b574d1189bb0c5
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33204
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8384 scripts: Add scripts to systemd for EL7 57/21457/6
Dmitry Eremin [Fri, 8 Jul 2016 21:15:37 +0000 (00:15 +0300)]
LU-8384 scripts: Add scripts to systemd for EL7

When rebooting a Lustre client where a Lustre filesystem is still
mounted, the shutdown hangs. This patch creates a systemd service
that unmounts the Lustre filesystems and unloads the Lustre modules
when the system is shut down.

Test-Parameters: trivial
Change-Id: I1cfe84684e23b8861743241dfbc4d6e320ace4a6
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Gregoire Pichon <gregoire.pichon@atos.net>
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/21457
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8047 llite: optimizations for not granted lock processing 65/19665/13
Andrew Perepechko [Thu, 7 Mar 2019 20:18:45 +0000 (12:18 -0800)]
LU-8047 llite: optimizations for not granted lock processing

This patch removes ll_md_blocking_ast() processing for
not granted locks. The reason is ll_invalidate_negative_children()
can slow down I/O significantly without a reason if there
are thousands or millions of files in the directory
cache.

Change-Id: Ic69c5f02f71c14db4b9609677d102dd2993f4feb
Seagate-bug-id: MRP-3409
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/19665
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-6836 test: re-add test 4a to sanity-quota for ZFS 43/34143/4
Hongchao Zhang [Thu, 24 Jan 2019 19:45:05 +0000 (14:45 -0500)]
LU-6836 test: re-add test 4a to sanity-quota for ZFS

The ZFS sync performance has been improved, so it's time to add test
4a back into sanity-quota for ZFS, and also increase the grace time
a little for ZFS.

Test-Parameters: trivial
Test-Parameters: envdefinitions=ONLY=4a fstype=zfs testlist=sanity-quota,sanity-quota,sanity-quota

Change-Id: I32dd76686cdd289b49e36efff3abd6691e76ef57
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34143
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11835 mdt: return DOM size on open resend 44/34044/5
Mikhail Pershin [Wed, 16 Jan 2019 13:24:58 +0000 (16:24 +0300)]
LU-11835 mdt: return DOM size on open resend

DOM size should always be returned along with the DOM lock, but this
is not the case with open resend.

The patch fixes that issue and adds a test case.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I73d43933f781f192e9aa8c6ee388a043dab5bde9
Reviewed-on: https://review.whamcloud.com/34044
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8260 osd-ldiskfs: osd_fiemap_get() fix address space mismatch 78/33878/6
Arshad Hussain [Tue, 4 Dec 2018 18:20:59 +0000 (23:50 +0530)]
LU-8260 osd-ldiskfs: osd_fiemap_get() fix address space mismatch

There was an address space mismatch in function
osd_fiemap_get() as this uses the "__user" qualifier
on the fiemap_extent buffer. Since this buffer is created
in the kernel and again passed to another call, this
may fail under some configurations.

This patch addresses this issue by modifying the
address space limit using get_fs() and set_fs()
calls, indicating that the pointers are intact and
secure.

Change-Id: I25048faecd3475d5e91e25e6a47e065e49e36b26
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33878
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-6142 obdclass: Fix style issues for obd_config.c 82/33082/9
Arshad Hussain [Sat, 25 Aug 2018 23:59:42 +0000 (05:29 +0530)]
LU-6142 obdclass: Fix style issues for obd_config.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/obd_config.c

Change-Id: If97513fe594ee76c9e153c33d644cd94c48f82c0
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33082
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11999 dne: performance improvement for file creation 91/34291/4
Jinshan Xiong [Sun, 24 Feb 2019 22:32:41 +0000 (14:32 -0800)]
LU-11999 dne: performance improvement for file creation

This removes obsolete code that causes drastic
performance degradation. This code was written before the PERM lock
was introduced, and it requests an UPDATE lock at path walk for a
remote directory, which will be cancelled at later file creation.

Test results before and after this patch is applied:

Test case:
rm -rf /mnt/lustre_purple/testdir
lfs mkdir -i 0 /mnt/lustre_purple/testdir
lfs mkdir -i 2 /mnt/lustre_purple/testdir/dir2
./lustre-release/lustre/tests/createmany -o \
/mnt/lustre_purple/testdir/dir2/f 10000

Before the patch is applied:
total: 10000 open/close in 12.82 seconds: 780.22 ops/second

After the patch is applied:
total: 10000 open/close in 4.89 seconds: 2044.75 ops/second

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: Ib474dc28d6edc7d15801b6821edc0e1d108bb4b6
Reviewed-on: https://review.whamcloud.com/34291
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11964 mdc: prevent glimpse lock count grow 61/34261/8
Mikhail Pershin [Thu, 14 Feb 2019 21:51:00 +0000 (00:51 +0300)]
LU-11964 mdc: prevent glimpse lock count grow

DOM lock matching tries to ignore locks with the
LDLM_FL_KMS_IGNORE flag during ldlm_lock_match() but
checks the flag only after the ldlm_lock_match() call. Therefore, if
there is any lock with such a flag in the queue, all other
locks after it are ignored and a new lock is created, causing
a large number of locks on a single resource in some access
patterns.
The patch extends the lock_matches() function to check flags to
exclude and adds ldlm_lock_match_with_skip() to use that
when needed.
A corresponding test was added in sanity-dom.sh
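
The difference between checking the flag after the match and skipping
flagged locks during the match can be sketched like this. The sketch is
hypothetical throughout: struct lock_stub, FL_IGNORE and both functions
are illustrative stand-ins, not the ldlm API, and the flag value is
invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up flag value standing in for LDLM_FL_KMS_IGNORE. */
#define FL_IGNORE 0x1u

/* Hypothetical, heavily reduced lock: a mode plus flag bits. */
struct lock_stub {
	int mode;
	uint32_t flags;
};

/* Old shape: take the first mode match, then reject it if flagged. */
static const struct lock_stub *match_then_check(const struct lock_stub *q,
						size_t n, int mode)
{
	for (size_t i = 0; i < n; i++) {
		if (q[i].mode != mode)
			continue;
		return (q[i].flags & FL_IGNORE) ? NULL : &q[i];
	}
	return NULL;
}

/* New shape: flagged locks are skipped while scanning the queue. */
static const struct lock_stub *match_with_skip(const struct lock_stub *q,
					       size_t n, int mode,
					       uint32_t skip_flags)
{
	for (size_t i = 0; i < n; i++) {
		if (q[i].mode == mode && !(q[i].flags & skip_flags))
			return &q[i];
	}
	return NULL;
}

/* Usage: a flagged lock at the head hides the usable lock behind it. */
static void match_demo(void)
{
	const struct lock_stub q[2] = { {1, FL_IGNORE}, {1, 0} };

	assert(match_then_check(q, 2, 1) == NULL);
	assert(match_with_skip(q, 2, 1, FL_IGNORE) == &q[1]);
}
```

With the first variant every miss would lead the caller to create a new
lock, which is how the lock count on one resource can keep growing.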

Test-Parameters: testlist=sanity-dom
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic45ca10f0e603e79a3a00e4fde13a5fae15ea5fc
Reviewed-on: https://review.whamcloud.com/34261
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10949 mdt: lost reference on mdt_md_root 81/34181/3
Andriy Skulysh [Wed, 20 Feb 2019 10:48:03 +0000 (12:48 +0200)]
LU-10949 mdt: lost reference on mdt_md_root

mdt_remote_object_lock_try() drops object
reference in case of an error but if the
request was sent to a server it is decreased
again via failed_lock_cleanup()

Add ldlm_created_callback. It is called after
lock creation, so we can safely add a reference
to l_ast_data and drop it only in BL AST handler.

Cray-bug-id: LUS-7013
Change-Id: Iaf98c620804f2de4528689e44e957a9fb0073162
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/34181
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>