Whamcloud - gitweb
fs/lustre-release.git
4 years agoLU-12345 ldiskfs: optimize nodelalloc mode 38/37538/5
Artem Blagodarenko [Tue, 28 May 2019 16:51:21 +0000 (19:51 +0300)]
LU-12345 ldiskfs: optimize nodelalloc mode

We found a performance regression when using bigalloc (1MB cluster
size) with "nodelalloc":

1. mke2fs -C 1048576 -O ^has_journal,bigalloc /dev/sda
2. mount -o nodelalloc /dev/sda /test/
3. time dd if=/dev/zero of=/test/io bs=1048576 count=1024

The "dd" takes about 2 seconds to finish, but if we run mke2fs without
"bigalloc", "dd" takes less than 1 second.

The reason is: when using ext4 with "nodelalloc", it calls
ext4_find_delalloc_cluster() nearly every time it calls
ext4_ext_map_blocks(), and ext4_find_delalloc_range() then scans
all pages in the cluster because no buffer is "delayed".  A cluster has
256 pages (1MB cluster), so it scans 256 * 256k pages when creating
a 1G file. That severely hurts performance.

Therefore, we return immediately from ext4_find_delalloc_range() in
nodelalloc mode, since by definition there can't be any delalloc
pages.
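
The cost described above can be put into numbers with a small
user-space model. This is only a sketch, not the ldiskfs code: it
assumes 4KB pages, one block-mapping call per page written, and a full
cluster scan on every call. All names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096ULL               /* assumed 4KB pages */
#define CLUSTER_BYTES     (1024ULL * 1024ULL)   /* 1MB bigalloc cluster */
#define PAGES_PER_CLUSTER (CLUSTER_BYTES / PAGE_SIZE_BYTES)   /* 256 */

/* Pages examined by one range scan in this model. */
static uint64_t pages_scanned_per_map(bool nodelalloc, bool patched)
{
	/* Patched behaviour: nothing can be delayed, so skip the scan. */
	if (patched && nodelalloc)
		return 0;
	return PAGES_PER_CLUSTER;
}

/* Total pages examined while writing file_bytes, one map call per page. */
static uint64_t pages_scanned_for_file(uint64_t file_bytes, bool nodelalloc,
				       bool patched)
{
	/* The model only handles page-aligned file sizes. */
	assert(file_bytes % PAGE_SIZE_BYTES == 0);

	uint64_t map_calls = file_bytes / PAGE_SIZE_BYTES;

	return map_calls * pages_scanned_per_map(nodelalloc, patched);
}
```

Under these assumptions a 1G file needs 262144 map calls, each
scanning 256 pages, which is about 67 million page lookups without
the early return and none with it.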

The same optimization is also added to the ldiskfs_find_delayed_extent()
function, which improves performance dramatically.

Here are the results of testing on a two-node system.
Without the patch:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00   56.30    0.06    0.00   43.63

Device: rrqm/s wrqm/s  r/s      w/s rMB/s  wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sds       0.00   0.00 0.00  1174.00  0.00   4.59     8.00     0.84  0.71    0.00    0.71  0.01  1.20

With patch:
08/29/2018 01:13:22 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    4.13   30.37    0.00   65.50

Device: rrqm/s wrqm/s  r/s      w/s rMB/s  wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sds       0.00   0.00 0.00 54117.82  0.00 211.43     8.00   152.59  2.82    0.00    2.82  0.02 99.01

Lustre-change: https://review.whamcloud.com/34982
Lustre-commit: af48ae8bff289b2bc083a888efeafa3c48df91e2

Cray-bug-id: LUS-5835
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Change-Id: Ie33410d4481778ee4f76a054ab8cfc11cc19a0ed
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37538
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11114 llite: Update mdc and lite stats on open|creat 58/38158/2
Olaf Faaland [Tue, 26 Nov 2019 23:20:11 +0000 (15:20 -0800)]
LU-11114 llite: Update mdc and lite stats on open|creat

Increment the "create" counter in mdc/<instance>/md_stats, and the
"mknod" counter in llite/<instance>/stats, when an open with
the CREAT flag results in a newly created file.

The mknod counter is chosen for consistency with
patch http://review.whamcloud.com/20246
 "LU-8150 mdt: Track open+create as mknod"
but the mdc counter set does not include mknod.

Lustre-change: https://review.whamcloud.com/36948
Lustre-commit: 4b8518ee4fa542f45fcdaeaec580d858dfcaf137

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: If082b911e415c0bc46248728e47ce0f37b9efa83
Reviewed-on: https://review.whamcloud.com/38158
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
4 years agoLU-12580 lov: fix out of bound usercopy 51/38051/2
Li Dongyang [Fri, 7 Feb 2020 12:16:26 +0000 (23:16 +1100)]
LU-12580 lov: fix out of bound usercopy

When handling the LL_IOC_LOV_GETSTRIPE ioctl, the user
can pass a limited buffer which is still bigger than
lov_comp_md_size(). This crashes the client, because
the usercopy is done with the user-provided buffer
size.

Make sure the copy works. Also, for a PFL file,
only copy the chosen component.
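
The size problem can be sketched in user space as follows. This is an
illustration only, not the lov code: memcpy() stands in for the
usercopy, and clamping to the smaller of the two sizes is an assumed
way of bounding the copy. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a layout of md_size bytes into a caller buffer of user_len bytes.
 * Never read more than md_size bytes from the source, even if the
 * caller's buffer is larger. Returns the number of bytes copied.
 */
static size_t copy_layout_bounded(void *user_buf, size_t user_len,
				  const void *md, size_t md_size)
{
	assert(user_buf != NULL && md != NULL);

	/* Using user_len alone here would read past the end of md. */
	size_t n = user_len < md_size ? user_len : md_size;

	memcpy(user_buf, md, n);
	return n;
}
```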

Lustre-change: https://review.whamcloud.com/37469
Lustre-commit: 2f1beb33144523467b596f4b6fab882b0a839187

Change-Id: I92bcf6d7b7f7a4387a9936a0b58332e50a88e542
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12198 libcfs: always copy ioctl header back to user 20/37720/3
Dominique Martinet [Thu, 13 Feb 2020 13:36:32 +0000 (13:36 +0000)]
LU-12198 libcfs: always copy ioctl header back to user

lnetctl_get_peer_list writes the required size back into the header
if the given buffer was too small. Userspace needs that info copied
back in order to grow the buffer and try again.

Note that we only replace err on failure if err was not previously set.
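
The grow-and-retry pattern that userspace relies on can be sketched
like this. Everything here is a mock: the header struct, the required
size, the -E2BIG return value and the function names are assumptions
made for illustration, not the libcfs or lnetctl interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the ioctl header. */
struct mock_hdr {
	size_t buf_len;
};

#define MOCK_REQUIRED 64	/* size the mock "kernel" side needs */

/* Mock ioctl: on a short buffer, report the required size and fail. */
static int mock_get_peer_list(struct mock_hdr *hdr, char *buf)
{
	if (hdr->buf_len < MOCK_REQUIRED) {
		/* This write-back is what userspace must be able to see. */
		hdr->buf_len = MOCK_REQUIRED;
		return -E2BIG;
	}
	memset(buf, 'p', MOCK_REQUIRED);
	hdr->buf_len = MOCK_REQUIRED;
	return 0;
}

/* Grow the buffer and retry; returns the buffer (caller frees) or NULL. */
static char *fetch_peer_list(size_t initial_len, size_t *out_len,
			     int *attempts)
{
	struct mock_hdr hdr = { .buf_len = initial_len };
	char *buf = NULL;
	int rc;

	assert(out_len != NULL && attempts != NULL);
	*attempts = 0;
	do {
		char *tmp = realloc(buf, hdr.buf_len);

		if (tmp == NULL) {
			free(buf);
			return NULL;
		}
		buf = tmp;
		(*attempts)++;
		rc = mock_get_peer_list(&hdr, buf);
	} while (rc == -E2BIG && *attempts < 4);

	if (rc != 0) {
		free(buf);
		return NULL;
	}
	*out_len = hdr.buf_len;
	return buf;
}
```

If the header were not copied back on failure, hdr.buf_len would keep
its initial value and the loop would retry with the same short buffer.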

Lustre-change: https://review.whamcloud.com/37559
Lustre-commit: 9e02ef474f8caa833d6a1b5e0068d5323a57e8c4

Fixes: fba98579efc4 ("LU-6202 libcfs: replace libcfs_register_ioctl with a blocking notifier_chain")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Change-Id: I2b6e319aceeb00d488572053d27023891afe1928
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37720
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13294 libcfs: incorrect rotor behaviour 49/38049/2
Andrew Perepechko [Fri, 14 Feb 2020 02:20:09 +0000 (05:20 +0300)]
LU-13294 libcfs: incorrect rotor behaviour

The signed int cpt rotor is set to -1 on initialization.
cfs_cpt_spread_node() mishandles this value in its
"if (!rotor--)" check. The condition is never true
with negative rotor values, so for_each_node_mask()
only exits with node = MAX_NUMNODES.

kmalloc_node() then attempts to determine the zonelist based
on the passed node id and maps MAX_NUMNODES to some
random pointer. Crash.

BUG: unable to handle kernel paging request at 0000000100002007
IP: [<ffffffff847c0da7>] __alloc_pages_nodemask+0x97/0x420
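
The rotor behaviour can be reproduced with a small user-space loop.
This is a sketch of the check quoted above, not the libcfs code: the
plain for loop stands in for for_each_node_mask(), and
MOCK_MAX_NUMNODES is a hypothetical small value.

```c
#include <assert.h>

#define MOCK_MAX_NUMNODES 4	/* hypothetical stand-in for MAX_NUMNODES */

/*
 * Return the node selected by the rotor, or MOCK_MAX_NUMNODES if the
 * loop runs off the end without selecting one.
 */
static int pick_node(int rotor)
{
	int node;

	for (node = 0; node < MOCK_MAX_NUMNODES; node++) {
		/* True only when rotor is exactly 0 before the decrement. */
		if (!rotor--)
			return node;
	}
	/* Nothing matched: node is now past the last valid id. */
	assert(node == MOCK_MAX_NUMNODES);
	return node;
}
```

With rotor = -1 the check sees -1, -2, -3, ... and never matches, so
the caller is handed an id that is not a valid node.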

Lustre-change: https://review.whamcloud.com/37709
Lustre-commit: f8aa86dd1622804d81020a7dbb1116f276b340f3

Change-Id: I4df74e394bdfc2a918d66aa12e6852ff0f6738ab
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Cray-bug-id: LUS-8492
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38049
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11986 libcfs: provide QSTR_INIT compat macro 90/38090/2
Andreas Dilger [Fri, 27 Mar 2020 08:08:26 +0000 (02:08 -0600)]
LU-11986 libcfs: provide QSTR_INIT compat macro

Provide a compat macro for QSTR_INIT() for older kernels.
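
A compat macro of this kind usually has the following shape. This is a
user-space sketch under assumptions: struct mock_qstr is a simplified
stand-in for the kernel's struct qstr, and the macro body shown here
is illustrative rather than copied from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct qstr. */
struct mock_qstr {
	unsigned int hash;
	unsigned int len;
	const unsigned char *name;
};

/* Define the macro only if the headers did not already provide it. */
#ifndef QSTR_INIT
#define QSTR_INIT(n, l) { .len = (l), .name = (const unsigned char *)(n) }
#endif

/* Build a qstr-like value for a C string and return its length field. */
static unsigned int qstr_len_of(const char *s)
{
	assert(s != NULL);

	struct mock_qstr q = QSTR_INIT(s, (unsigned int)strlen(s));

	return q.len;
}
```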

Fixes: 9d42660e173e ("LU-11986 lnet: properly cleanup lnet debugfs files")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ice19a4dad8456551ba398034a8d3942068006512
Reviewed-on: https://review.whamcloud.com/38090
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13148 tests: clean up sanity 56ra add debugging 21/38021/9
Andreas Dilger [Sun, 22 Mar 2020 00:17:14 +0000 (18:17 -0600)]
LU-13148 tests: clean up sanity 56ra add debugging

Consolidate duplicate code from sanity.sh test_56ra() into a
helper function to make it easier to see what is being run.

Print out the before and after values for each test.

Skip test_56ra for versions older than 2.12.4, since it was
backported in commit v2_12_3-24-gd55982d842.

Test-Parameters: trivial testlist=sanity env=ONLY=56ra,ONLY_REPEAT=50
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia28c1b556f53ea88643805cbf4ada725a53ebbe5
Reviewed-on: https://review.whamcloud.com/38021
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12580 lov: fix typo in lov_comp_md_size 50/38050/2
Li Dongyang [Mon, 10 Feb 2020 04:32:58 +0000 (15:32 +1100)]
LU-12580 lov: fix typo in lov_comp_md_size

If the component of a PFL file is not initialized,
we should use 0 as the stripe size when calculating
the LOVEA size.

Lustre-change: https://review.whamcloud.com/37493
Lustre-commit: d41716533682ed88b8a77654f9b5b050ef5c672c

Change-Id: I4ff5f4a78bc1d432cc1ac6fa3733461bd6b762e6
Fixes: 62f64a1077 ("LU-9489 lod: keep minimum LOVEA size")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38050
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13260 lov: fix size check when stripe is zero 48/38048/2
Yang Sheng [Wed, 19 Feb 2020 11:08:33 +0000 (19:08 +0800)]
LU-13260 lov: fix size check when stripe is zero

Set the correct max size when stripe is zero.

Lustre-change: https://review.whamcloud.com/37623
Lustre-commit: fe57ce6adf1a00e14269b230d07a4548a58d77c3

Fixes: f3f6515562 (LU-8998 lov: add composite layout unpacking)
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I9b76283fcc65f58e3be6adf49f035236687ac85c
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38048
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11758 osp: remove assertion from statfs 93/37993/2
Sergey Cheremencev [Fri, 6 Jul 2018 19:51:14 +0000 (22:51 +0300)]
LU-11758 osp: remove assertion from statfs

The sequence can't be changed or overflowed
in the case of IDIF. Thus don't trigger a kernel
panic for the case below:
last_created [0x100000001:0x15:0x0], next_fid [0x100000000:0xfffffff6:0x0]
The same assertion, with an exception for IDIFs, exists
in osp_fid_diff.
The patch also adds several optimizations
in osp_precreate_send.

Lustre-commit: bcfd0e040d1536410ba6c301f64d4f8ea6a8797a
Lustre-change: https://review.whamcloud.com/33832

Change-Id: I3966dfc621999d065c9b485d387938085fccb140
Cray-bug-id: LUS-2386
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/37993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12299 libcfs: fix panic for too large cpu partions 32/37332/3
Wang Shilong [Wed, 15 May 2019 01:52:37 +0000 (09:52 +0800)]
LU-12299 libcfs: fix panic for too large cpu partions

If the number of cpu partitions is larger than the number of online
cpus, the following calculation will be 0:

num = num_online_cpus() / ncpt;

And it will trigger the following panic in cfs_cpt_choose_ncpus():

LASSERT(number > 0);

We do not actually support this configuration, so returning a
failure is better than panicking.

Also fix an invalid pointer access when we fail to init @cfs_cpt_table,
as it is converted to ERR_PTR() if an error happens.

Lustre-change: https://review.whamcloud.com/34864
Lustre-commit: 77771ff24c03a59fc96a7f41199a6b73530a418a

Change-Id: I49daadd8f0c7d22aa78d08248d8c085781740768
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37332
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13160 tests: fix sanity-hsm monitor setup 73/37773/3
Li Dongyang [Mon, 17 Feb 2020 02:53:16 +0000 (13:53 +1100)]
LU-13160 tests: fix sanity-hsm monitor setup

On RHEL8, even though we are using pdsh -R ssh,
ssh still waits for the remote cat process
to finish.
Use a subshell to avoid the timeout.

Lustre-change: https://review.whamcloud.com/37595
Lustre-commit: 6724d8ca58e9b8474a180b013a4723cbdd8900d9

Change-Id: Id5b8d492b5ce9a235da73448ade475ade145bbed
Test-Parameters: trivial clientdistro=el8.1 testlist=sanity-hsm
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37773
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12775 test: reorder 'tar' command options 72/37772/3
Lai Siyao [Tue, 3 Dec 2019 11:43:28 +0000 (19:43 +0800)]
LU-12775 test: reorder 'tar' command options

'tar' in RHEL8 is stricter in command option order.

Test-Parameters: trivial \
envdefinitions=ONLY="32c" \
clientdistro=el8.1 serverdistro=el8.1 \
mdscount=2 mdtcount=4 testlist=conf-sanity

Lustre-change: https://review.whamcloud.com/36907
Lustre-commit: f3e101a36310c0c2b9d516c09ec0166eb24524d2

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I814203808efae4a746166abd3ba08f2bc5fce8f7
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37772
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13274 uapi: make lustre UAPI headers C99 compliant 73/37973/2
James Simmons [Sat, 29 Feb 2020 01:49:42 +0000 (20:49 -0500)]
LU-13274 uapi: make lustre UAPI headers C99 compliant

Attempting to compile strict C99 user land applications or
libraries with the Lustre UAPI headers will fail. These same
errors can be seen by enabling CONFIG_UAPI_HEADER_TEST as well.
Update the Lustre UAPI headers to be compilable with -std=c99.
Enhance our current test covering UAPI header handling.

For the OpenSFS branch we can't include <linux/stat.h>, since we
support kernels from before struct statx existed, and its definitions
would collide with the special definitions in lustre_user.h.

Lustre-change: https://review.whamcloud.com/37678
Lustre-commit: 7a7309fa849577ddd5a4f6bb5bfb69e84a7fec89

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Change-Id: Ifb0da33180dc3c7e116d6bf2b7f603ad0528277a
Reviewed-on: https://review.whamcloud.com/37973
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11510 lfs: migrate a composite layout file correctly 14/37814/2
Emoly Liu [Wed, 26 Feb 2020 14:12:59 +0000 (22:12 +0800)]
LU-11510 lfs: migrate a composite layout file correctly

The patch fixes the following issues:
- in function migrate_open_files(), the "layout" pointer should be used
  instead of the "param" pointer to tell whether a comp file should be
  created or not, because the "param" pointer is never NULL, so
  the composite layout file is never created;
- make the --copy and --yaml options work correctly in the lfs_migrate
  tool;
- when a composite layout file is migrated, the "--copy" option is
  added to preserve its layout in both lfs_migrate and "lfs migrate",
  and in that case the pool name is saved as well;
- when a file is restriped with the -R option by lfs_migrate, the file
  gets its parent's striping by default, by adding the
  "--copy $parent_dir" option;
- do some code cleanup in lfs_migrate and sanity.sh test_56wb/c.

sanity.sh test_56xd/xe are added to verify this patch.

Lustre-change: https://review.whamcloud.com/36082
Lustre-commit: 8bedfa377fbd0c9f1b6ea2c40d36fdcaa52137df

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I85779c69e74444eb869f28add4363ad3a6835b97
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37814
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13090 utils: fix lfs_migrate -p for file with pool 13/37813/2
Andreas Dilger [Thu, 19 Dec 2019 11:51:41 +0000 (04:51 -0700)]
LU-13090 utils: fix lfs_migrate -p for file with pool

If "lfs_migrate -p <pool>" is run to migrate a file with an existing
pool, the given pool is overridden by the existing pool from the file
during migration.  Fix this to use the OST pool requested by the user.

Don't print a warning about deprecated -n option if --dry-run is used.

If a pool is specified, use it with "lfs df" to find OST free space.

Change temp filename to work better with new DNE "crush" hash.

Don't return an error if falling back to rsync and no links are found.

Add test for "lfs_migrate -p" and update man page and usage to match.
Clean up debug-level helpers in test-framework.sh.

Lustre-change: https://review.whamcloud.com/37067
Lustre-commit: 128137adfc539dd2dd92040c14a63ff27f969820

Test-Parameters: trivial testlist=ost-pools
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ief69a620fc969aeff24ec0633a3314c3b83ebbe5
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37813
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12157 utils: fix lfs_migrate output and testing 12/37812/3
Andreas Dilger [Thu, 4 Apr 2019 02:26:37 +0000 (20:26 -0600)]
LU-12157 utils: fix lfs_migrate output and testing

Don't pass the "-v" option through to "lfs migrate", as this causes
the filename to be printed twice when run with the "-v" option.

Don't use "echo -e" to process escape characters in filenames unless
this is needed, but add it where it is needed.  Don't add ANSI escape
characters to the output.  The output previously looked like:

     /mnt/testfs/l0: stripe count=1,size=1048576,pool=/mnt/testfs/l0
     done migrate
     nr[K/mnt/testfs/l1: /mnt/testfs/l1: already migrated ...
     nr[K/mnt/testfs/l2: /mnt/testfs/l1: already migrated ...
     nr[K/mnt/testfs/l3: /mnt/testfs/l1: already migrated ...

Print out the "pool=" and "mirror_count=" parameters only if needed.

Fix "-A" option to round up the number of stripes when the file is migrated.
Skip sanity test_56xc 1GB test if there is not enough space on OSTs.

Fixes: 60c5bc2502 ("LU-8235 scripts: pass unrecognized options to lfs migrate")
Fixes: 80a2ff7137 ("LU-6051 utils: allow lfs_migrate to handle hard links")
Fixes: 99d7a8ed43 ("LU-8207 scripts: add auto-stripe option to lfs_migrate")

Lustre-change: https://review.whamcloud.com/34592
Lustre-commit: 9b8e8e0e54e5055c02469cb16d186c94fa2040e0

Test-Parameters: trivial fstype=zfs testlist=sanity envdefinitions=ONLY=56
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I059e7daeb2fa82e7607fd9d862797433053ebbe5
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13169 tests: add ONLY_REPEAT parameter to repeat subtests 86/37586/2
Andreas Dilger [Fri, 24 Jan 2020 09:20:38 +0000 (02:20 -0700)]
LU-13169 tests: add ONLY_REPEAT parameter to repeat subtests

Add the ONLY_REPEAT environment variable, to allow tests specified
by ONLY to be run multiple times, to ensure that the test is passing
consistently (or fixing an intermittent bug).  This is faster than
restarting the test session multiple times for only a few subtests.

Have the iteration around the subshell started for run_one() so that
any registered stack_trap EXIT calls are triggered between iterations,
the fail_loc is reset, grant/health/error checks are done, and so on.

Remove $tdir and $tfile files after each iteration to avoid failures
with the subsequent subtest runs.  For tests that do not follow the
standard naming convention for test directories and files, they need
to be updated to use $tdir and $tfile, which is good in any case.

YAML output splits each iteration into a separate subtest for Maloo.
The output from run_one() is appended to a single output file for all
iterations so all output is captured instead of just the last one.

The iterations will continue until $ONLY_REPEAT loops pass, or until
the subtest hits an error.  Trying to continue for all iterations in
the face of errors would likely end up with all of the later iterations
also failing due to leftover state from the previous failure, and the
goal is for the subtests to pass consistently.  If we are trying to
determine rates of intermittent failures, this can be estimated as
1/num_passes, about the same as num_failures/ONLY_REPEAT iterations.

Rename variables in subtests to avoid clash with testnum, testname,
and TESTNAME, and use them consistently in functions and subtests.

Lustre-change: https://review.whamcloud.com/37321
Lustre-commit: e16e3d46ee8c44e691c5cd3d25161f2f297fa0fd

Test-Parameters: testlist=sanity envdefinitions=ONLY=27l,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: I5449590dc3e25c113b059974fb7b96c892434380
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Charlie Olmstead <charlie@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37586
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13296 obd: make statfs cache working again 19/37819/2
Alexey Lyashkov [Thu, 27 Feb 2020 14:48:48 +0000 (17:48 +0300)]
LU-13296 obd: make statfs cache working again

If statfs raced on the mutex, read the cached data instead
of trash.

Lustre-change: https://review.whamcloud.com/37753
Lustre-commit: 7281635521a823548d497bce2f19acfa3318dfe9

Test-Parameters: testlist=sanity envdefinitions=ONLY=423,ONLY_REPEAT=500
Fixes: 1c41a6ac390b ("LU-12368 obdclass: don't send multiple statfs RPCs")
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I268782875c30c078f239c194f69cdf7506d66169
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37819
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13261 mdt: PFL layout changed while accessing 21/37821/2
Hongchao Zhang [Sun, 19 Jan 2020 06:26:10 +0000 (01:26 -0500)]
LU-13261 mdt: PFL layout changed while accessing

The PFL layout EA can be enlarged when the layout component covering
some IO range starts being written. This can cause another thread
to get an incorrect layout size in "mdt_intent_layout", and then cause
"mdt_lvbo_fill" to fail when checking the real layout size.

In Lustre, "ldlm_handle_enqueue0" already handles the "-ERANGE" error
and retries after expanding the layout buffer size, so the only
change needed is to decrease the debug level of the log message in
"mdt_lvbo_fill".

Lustre-change: https://review.whamcloud.com/37684
Lustre-commit: 35d01a0fc7b2933d589f5a6bc4878382cbc15b52

Change-Id: Iad722d1dac187f57ae77606a4d4587525412cd68
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37821
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11891 utils: getstripe use --mdt-index consistently 28/37728/2
Andreas Dilger [Sun, 27 Jan 2019 18:23:48 +0000 (11:23 -0700)]
LU-11891 utils: getstripe use --mdt-index consistently

LU-10856 fixed most usages of "warning: '-M' deprecated,
use '--mdt-index' or '-m' instead" but missed a few
cases in sanity test_271d, test_271e, and test_271f.
Fix those tests to use "--mdt-index".

Also, lfs has a few places where the usage of "--mdt-index"
and "--mdt" is inconsistent.  Fix those options to be used
consistently across all commands.

Lustre-change: https://review.whamcloud.com/34116
Lustre-commit: 88d8f0f86bd4994e07aa12bd00cbc7ad3192205c

Fixes: 6c617a3d56 ("LU-10856 tests: remove deprecated lfs ...")

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I013b2198f3a39533da9a0067a0bf5846604b3052
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37728
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11810 misc: allow Fixes: tag in commit signoff block 24/37724/2
Andreas Dilger [Tue, 18 Dec 2018 19:01:04 +0000 (12:01 -0700)]
LU-11810 misc: allow Fixes: tag in commit signoff block

Allow the "Fixes:" tag in the signoff block of the commit message.
The Fixes: tag should contain a valid git commit hash, and be
followed by a description of the original patch.

Lustre-change: https://review.whamcloud.com/33888
Lustre-commit: 4d6bff6be51ff6e336f73b2822ac5100254b9431

Fixes: 5760b34c48b4 ("LU-1145 test: add Test-Parameters tag")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I49f712dd8da173510e5941b66eb050e53a1cab07
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/37724
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13071 lnet: reduce log severity for health events 18/37718/2
Amir Shehata [Thu, 12 Dec 2019 19:19:48 +0000 (11:19 -0800)]
LU-13071 lnet: reduce log severity for health events

No need to print an error when the health of an interface is
reduced. Changed it to debug level.

Lustre-change: https://review.whamcloud.com/37002
Lustre-commit: 5567aaba1086217e021938e5f9543c640b2d007c

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia60ade12efab732ea4b0388a3803976bf65938ab
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37718
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-11269 ptlrpc: do not expose transient IDLE state 49/37649/2
Alex Zhuravlev [Mon, 10 Feb 2020 21:06:07 +0000 (00:06 +0300)]
LU-11269 ptlrpc: do not expose transient IDLE state

Do not expose the transient IDLE state, to avoid cases where anyone
sending an RPC observes the connection in this state while it is
going to reconnect right away.

Lustre-change: https://review.whamcloud.com/37523
Lustre-commit: ea8d2ecc783fbaff12c581935ac426b9b8567031

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9ca89051c4176fe321262f8b2f52969c382e401e
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13110 kernel: kernel update SLES12 SP4 [4.12.14-95.45.1] 24/37124/3
Jian Yu [Fri, 14 Feb 2020 19:15:56 +0000 (11:15 -0800)]
LU-13110 kernel: kernel update SLES12 SP4 [4.12.14-95.45.1]

Update SLES12 SP4 kernel to 4.12.14-95.45.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp4 \
envdefinitions=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT="103a 817"

Change-Id: I1f7024465b4b6334488b7314f1073fafa10331d6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13191 osp: handle -EROFS in osp_sync_interpret() 16/37516/3
Lai Siyao [Sat, 25 Jan 2020 21:23:28 +0000 (05:23 +0800)]
LU-13191 osp: handle -EROFS in osp_sync_interpret()

Upon OST disk failure, osp_sync_interpret() may get -EROFS,
which is a valid errno.

Lustre-change: https://review.whamcloud.com/37404
Lustre-commit: 868089cd309506719b814afecebf825effc6c93f

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5c3cff3019aa47c6d5803f0f0b373bc704f18118
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37516
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-10198 llog: keep llog handle alive until last reference 14/37514/4
Mikhail Pershin [Wed, 29 Jan 2020 21:22:07 +0000 (00:22 +0300)]
LU-10198 llog: keep llog handle alive until last reference

The llog handle keeps the related dt_object pinned until the
llog_close() call, but the llog handle can still have other users
that took it via llog_cat_id2handle().

The patch changes llog_handle_put() to call lop_close() when the last
reference is dropped, so llog_osd_close() puts the dt_object only
when the llog_handle has no more references.
llog_handle_get() checks and reports if the llog_handle has
zero references.
The patch also modifies the checks for destroyed llogs: the llog
handle has a new lgh_destroyed flag which is set when the llog is
destroyed, and llog_osd_exist() checks dt_object_exist() and the
lgh_destroyed flag, so destroyed llogs are considered non-existent
too. Previously it used the lu_object_is_dying() check, which is not
reliable because it only means that the object is not to be kept in
cache.
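
The reference handling described above follows a common pattern,
sketched here in user space. This is not the llog code: the struct,
its fields and the function names are hypothetical, and a plain int
counter is used where kernel code would use an atomic or refcount
type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a handle that pins an underlying object. */
struct mock_handle {
	int refcount;		/* not atomic: single-threaded sketch only */
	bool object_pinned;	/* true while the backing object is held */
	bool destroyed;		/* set once the log has been destroyed */
};

/* Release the backing object; only called on the last reference drop. */
static void mock_close(struct mock_handle *h)
{
	h->object_pinned = false;
}

/* Take a reference; refuse and report if the handle is already dead. */
static bool mock_handle_get(struct mock_handle *h)
{
	if (h->refcount == 0)
		return false;
	h->refcount++;
	return true;
}

/* Drop a reference; close the handle when the last one goes away. */
static void mock_handle_put(struct mock_handle *h)
{
	assert(h->refcount > 0);
	if (--h->refcount == 0)
		mock_close(h);
}

/* A log exists only if its object exists and it was not destroyed. */
static bool mock_exists(const struct mock_handle *h, bool object_exists)
{
	return object_exists && !h->destroyed;
}
```

With this shape, a second user that took a reference keeps the object
pinned even after the first user has dropped its own reference.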

Lustre-change: https://review.whamcloud.com/37367
Lustre-commit: d6bd5e9cc49b3bb9901ada503107e8b0eca44f7e

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If7df41646c243c0d40b20a30a33e86c688d24508
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37514
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12651 osc: always call update_next_shrink 72/37572/2
Alexander Zarochentsev [Tue, 4 Feb 2020 17:47:06 +0000 (20:47 +0300)]
LU-12651 osc: always call update_next_shrink

Call update_next_shrink for clients that do not
support grant shrinking, or that have grant
shrinking explicitly disabled. Otherwise
osc_grant_work_handler() reschedules itself immediately
after it completes, causing excessive CPU consumption.

Fixes: 3e070e30a98d ("LU-8708 osc: enable/disable OSC grant shrink")

Lustre-change: https://review.whamcloud.com/37429
Lustre-commit: 117f587bc3e60f4dd1c939f8488e43cb752c12ca

Cray-bug-id: LUS-8460
Change-Id: I507b3d10dd5374772456853098bc26053cbd140d
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37572
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13163 mdc: new kernel function xa_is_value() 81/37481/2
Lai Siyao [Sat, 8 Feb 2020 06:30:29 +0000 (22:30 -0800)]
LU-13163 mdc: new kernel function xa_is_value()

xa_is_value() is added in kernel 4.19-rc6 to replace
radix_tree_exceptional_entry().

This patch is back-ported from the following one:
Lustre-commit: a0a3a29deb82656f9639f46847deac2689973893
Lustre-change: https://review.whamcloud.com/37399

Test-Parameters: trivial clientdistro=el8.1 \
envdefinitions=ONLY=65i mdscount=2 mdtcount=4 \
testlist=sanity,sanity,sanity,sanity,sanity

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If89aa19c37af8a67debe782d1c77f4ef4dc6f923
Reviewed-on: https://review.whamcloud.com/37481
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13142 lod: cleanup layout checking 10/37410/2
Sebastien Buisson [Fri, 17 Jan 2020 13:15:25 +0000 (22:15 +0900)]
LU-13142 lod: cleanup layout checking

Cleanup layout checking in lod layer and lfs command-line utility,
for DoM components.

Lustre-change: https://review.whamcloud.com/37267
Lustre-commit: 4ce8a29d8bfc5b77893b642cdf2c33ceed960866

Reported-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib8b184a31d26442ed10241dc12a0452e5243d0e8
Reviewed-on: https://review.whamcloud.com/37410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13152 llapi: llapi_layout_get_by_xattr groks DoM 09/37409/2
Sebastien Buisson [Fri, 17 Jan 2020 16:31:04 +0000 (17:31 +0100)]
LU-13152 llapi: llapi_layout_get_by_xattr groks DoM

llapi_layout_get_by_xattr() function must be updated to handle
lov component with LOV_PATTERN_MDT pattern.

Lustre-change: https://review.whamcloud.com/37269
Lustre-commit: 3e23353201a753104d1fcdab28353646e40644dc

Signed-off-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6553e66cd4f3b5acc65790da94555350c98fe179
Reviewed-on: https://review.whamcloud.com/37409
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12474 tests: Do not run check_progs_installed for racer 21/37521/2
Oleg Drokin [Wed, 26 Jun 2019 02:22:23 +0000 (22:22 -0400)]
LU-12474 tests: Do not run check_progs_installed for racer

It is run from within racer, so racer is certainly already installed.

Lustre-change: https://review.whamcloud.com/35327
Lustre-commit: e7b7433571a748cdc651c5f50f01ff5ee0656c28

Change-Id: Ifd78cd051842c9663130b650c6e35d60332250e7
Test-Parameters: testlist=racer
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37521
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
4 years agoLU-13117 libcfs: fix to match right key in cfs_get_environ() 96/37396/2
Wang Shilong [Wed, 8 Jan 2020 01:45:27 +0000 (09:45 +0800)]
LU-13117 libcfs: fix to match right key in cfs_get_environ()

cfs_get_environ() uses memcmp() to match the environment variable
against the desired key, then accounts for the "=" when
calculating the length. But it fails to check that the next
character is actually an equals sign, so any key that is also
the prefix of some other variable name can match the wrong variable.

Also add debug information to help debug similar issues
in the future.
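
The prefix problem can be illustrated with a small userspace lookup over
"NAME=value" strings. toy_env_lookup() is a hypothetical helper written
for this note, not the cfs_get_environ() implementation; it only shows
the extra '=' check.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the value for `key` in a NULL-terminated array of "NAME=value"
 * strings, or NULL if the key is absent.
 */
static const char *toy_env_lookup(const char *const *envp, const char *key)
{
    size_t klen = strlen(key);

    for (; *envp != NULL; envp++) {
        /* strncmp() stops at NUL, so short entries are safe to compare. */
        if (strncmp(*envp, key, klen) != 0)
            continue;
        /* Without this check, "PATH" would also match "PATHEXT=...". */
        if ((*envp)[klen] != '=')
            continue;
        return *envp + klen + 1;
    }
    return NULL;
}
```

With the entries "PATHEXT=.exe" and "PATH=/bin" in that order, a lookup
of "PATH" skips the first entry and returns "/bin".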

Lustre-change: https://review.whamcloud.com/37156
Lustre-commit: 31170f9ceca91684ea66e0b16757881563a8cf26

Test-Parameters: trivial
Change-Id: Ia2b4ccd1f10c89059cecc224d4e2ba8d1d75b825
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11607 tests: replace version/fstype calls in conf-sanity 42/36942/5
James Nunez [Wed, 7 Aug 2019 21:12:51 +0000 (15:12 -0600)]
LU-11607 tests: replace version/fstype calls in conf-sanity

The routine get_lustre_env() is available to all Lustre test
suites and sets an environment variable for the file system
type for MDS1 and OST1 and sets a variable for the Lustre
version of servers.

In conf-sanity, replace the calls to facet_fstype() and
lustre_version_code() for all server types defined in
get_lustre_env().  While doing this, replace SINGLEMDS with
mds1 in these calls.

Clean up around any modifications with
- converting spaces to tabs
- removing calls to return after skip() or skip_env()

Lustre-change: https://review.whamcloud.com/35721
Lustre-commit: b4c955fe72d8598c4eaf98b809ae42be94f8c40b

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: fstype=zfs testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I17707883d46aa66c32e1229107646bc7a9df5e4e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36942
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13154 test: skip sanity-quota 66 if MDS version < 2.12.4 50/37350/2
Wang Shilong [Sun, 19 Jan 2020 02:16:25 +0000 (10:16 +0800)]
LU-13154 test: skip sanity-quota 66 if MDS version < 2.12.4

Since LU-12826 landed after this version, add a version check to
make the interop test pass.

Lustre-change: https://review.whamcloud.com/37276
Lustre-commit: f99fa029fd904ac13f33ab82de37fd07a69aea84

Test-Parameters: trivial envdefinitions=ONLY=66 testlist=sanity-quota
Change-Id: I829f424b9bb103e18c06de6f797827f82e1874d1
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37350
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoNew release 2.12.4 2.12.4 b2_12_4
Oleg Drokin [Tue, 11 Feb 2020 19:03:27 +0000 (14:03 -0500)]
New release 2.12.4

Change-Id: I5c3d94ca134daae770414b220966fe3e92e5a10d
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoNew tag 2.12.4-RC2 2.12.4-RC2 v2_12_4-RC2
Oleg Drokin [Sat, 8 Feb 2020 05:56:46 +0000 (00:56 -0500)]
New tag 2.12.4-RC2

Change-Id: I637563481546e29b6fd648e2269287cdd2577910
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13194 tests: check server version sanityn 104 20/37420/6
James Nunez [Tue, 4 Feb 2020 04:15:10 +0000 (21:15 -0700)]
LU-13194 tests: check server version sanityn 104

Check the server version before running sanityn test 104.
If the server version is less than 2.12.4, skip the test.

Lustre-change: https://review.whamcloud.com/37461/
Lustre-commit: a8b9a123fea3762b999e80f56fdbbdf2ea10e280

Fixes: d2f7cb7934a0 ("LU-12026 mdt: MDS stores atime|mtime|ctime")

Test-Parameters: trivial serverversion=2.11.0 serverdistro=el7 envdefinitions=ONLY=104 testlist=sanityn
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I625fb0163c078dc95ed670d169dc5744bc16d4e8
Reviewed-on: https://review.whamcloud.com/37420
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13145 lnet: use conservative health timeouts 90/37390/3
Andreas Dilger [Fri, 31 Jan 2020 20:00:00 +0000 (13:00 -0700)]
LU-13145 lnet: use conservative health timeouts

Use more conservative lnet_transaction_timeout and lnet_retry_count
values by default.  Currently with timeout=10 and retry=3 there is
only a 3s window for the RPC to be sent before it is timed out.
This has caused fault injection rather than fault tolerance.
Increase the default timeout to 50s with retry=2, which is hopefully
long enough to cover virtually all uses, but still allows LNet Health
to be enabled by default and resend before Lustre times out itself.

Fixes: 8632e94aeb7e ("LU-11816 lnet: setup health timeout defaults")
Lustre-change: https://review.whamcloud.com/37430
Lustre-commit: 361e9eaef13c0f472ad45388d3e147dabc32b737

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6bfc4d61cebab38c1554e1b42834b1f38fc34ba8
Reviewed-on: https://review.whamcloud.com/37390
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12593 osd: up i_append_sem during errors 45/37445/3
Alexander Boyko [Mon, 3 Feb 2020 09:24:40 +0000 (04:24 -0500)]
LU-12593 osd: up i_append_sem during errors

There is a potential leak of i_append_sem on errors from the
buffer head read and from ldiskfs_journal_get_write_access() in
osd_ldiskfs_write_record().
The patch adds up(i_append_sem) to the error paths.
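
A minimal sketch of the pattern, assuming a single exit label that
releases the semaphore on every path. The toy_sem type and
toy_write_record() are hypothetical; the two failure flags merely stand
in for the buffer head read and the journal access call.

```c
#include <errno.h>

/* Toy semaphore: count 1 means free, 0 means held. */
struct toy_sem { int count; };

static void toy_down(struct toy_sem *s) { s->count--; }
static void toy_up(struct toy_sem *s)   { s->count++; }

/*
 * Write path with two failure points. Every exit, including the error
 * ones, goes through the `out` label that releases the semaphore.
 */
static int toy_write_record(struct toy_sem *sem, int read_fails,
                            int access_fails)
{
    int rc = 0;

    toy_down(sem);

    if (read_fails) {   /* models a failed buffer head read */
        rc = -EIO;
        goto out;
    }
    if (access_fails) { /* models a failed journal access request */
        rc = -ENOMEM;
        goto out;
    }
    /* the record would be copied into the buffer here */
out:
    toy_up(sem);
    return rc;
}
```

Returning directly from either error branch instead of jumping to `out`
would leave the count at 0, which is the kind of leak described above.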

Lustre-change: https://review.whamcloud.com/37406/
Lustre-commit: 7599dd3d20d6bb4ee89634c5a76730481ca62470

Fixes: f832a7dc33c6 ("LU-12593 osd: zeroing a freshly allocated block buffer")
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I245d0c45af03519c66b75731e5d57f42de41fe95
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37445
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13099 lmv: disable statahead for remote objects 70/37370/2
Vladimir Saveliev [Mon, 23 Dec 2019 11:07:25 +0000 (14:07 +0300)]
LU-13099 lmv: disable statahead for remote objects

Statahead for remote objects is supposed to be disabled by
LU-11681 lmv: disable remote file statahead.

However, due to a typo it is not, and statahead for remote objects is
accompanied by warnings like:
  ll_set_inode()) Can not initialize inode .. without object type..
  ll_prep_inode()) new_inode -fatal: rc -12

Fix the typo.

A test to illustrate the issue is added.

Lustre-change: https://review.whamcloud.com/37089
Lustre-commit: 68330379b01cb6bf9b24235a80a4666d24c0e343

Fixes: 02b5a407081c ("LU-11681 lmv: disable remote file statahead")
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Cray-bug-id: LUS-8262
Change-Id: Id9cc7f30ba75918658bf8eb1c8f3249993da6699
Reviewed-on: https://review.whamcloud.com/37370
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13121 llite: fix deadlock in ll_update_lsm_md() 25/37325/3
Lai Siyao [Wed, 22 Jan 2020 05:55:27 +0000 (13:55 +0800)]
LU-13121 llite: fix deadlock in ll_update_lsm_md()

A deadlock may happen in the following scenario: a lookup process calls
ll_update_lsm_md(), finds that lli->lli_lsm_md is NULL, and then calls
down_write(&lli->lli_lsm_sem). But another lookup process initializes
lli->lli_lsm_md after this check and before the write lock is taken, so
the first lookup process calls up_read(&lli->lli_lsm_sem) and returns.
The write lock is therefore never released, which causes subsequent
lookups to deadlock.

Rearrange the code to simplify the locking:
1. take read lock.
2. if lsm was initialized and unchanged, release read lock and return.
3. otherwise release read lock and take write lock.
4. free current lsm and initialize with new lsm.
5. release write lock.
6. initialize stripes with read lock.
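
The six steps above can be sketched as follows. This is a
single-threaded model under stated assumptions: toy_rwsem only records
who holds the lock, and toy_inode, its lsm field and toy_update_lsm()
are hypothetical stand-ins for the llite structures and code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy reader/writer lock that only records its holders. */
struct toy_rwsem { int readers; bool writer; };

static void toy_down_read(struct toy_rwsem *s)  { s->readers++; }
static void toy_up_read(struct toy_rwsem *s)    { s->readers--; }
static void toy_down_write(struct toy_rwsem *s) { s->writer = true; }
static void toy_up_write(struct toy_rwsem *s)   { s->writer = false; }

/* Hypothetical inode info: `lsm` stands in for lli_lsm_md. */
struct toy_inode {
    struct toy_rwsem sem;
    const char *lsm;
};

static void toy_update_lsm(struct toy_inode *ino, const char *new_lsm)
{
    toy_down_read(&ino->sem);                      /* step 1 */
    if (ino->lsm != NULL && ino->lsm == new_lsm) { /* step 2 */
        toy_up_read(&ino->sem);
        return;
    }
    toy_up_read(&ino->sem);                        /* step 3 */
    toy_down_write(&ino->sem);
    ino->lsm = new_lsm;                            /* step 4 */
    toy_up_write(&ino->sem);                       /* step 5 */

    toy_down_read(&ino->sem);                      /* step 6 */
    /* stripes would be initialized here under the read lock */
    toy_up_read(&ino->sem);
}
```

Each acquire is paired with the matching release of the same kind, so no
path can leave the write lock held.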

Lustre-change: https://review.whamcloud.com/37182
Lustre-commit: 3746550282c865deebb07bfd92bcb4d1dabdc675

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ifcc25a957983512db6f29105b5ca5b6ec914cb4b
Reviewed-on: https://review.whamcloud.com/37325
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoNew RC 2.12.4-RC1 2.12.4-RC1 v2_12_4-RC1
Oleg Drokin [Tue, 28 Jan 2020 22:39:51 +0000 (17:39 -0500)]
New RC 2.12.4-RC1

Change-Id: Ia0ed234bd5b7ffb74f1c1ec73190a34504f05496
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11385 odbclass: Handle gracefully if nsproxy is NULL 14/37314/2
Serguei Smirnov [Tue, 19 Nov 2019 22:18:17 +0000 (14:18 -0800)]
LU-11385 odbclass: Handle gracefully if nsproxy is NULL

Gracefully handle the case if current->nsproxy is NULL:
check for the condition and return an error, avoiding attempts
to dereference the pointer.

Lustre-change: https://review.whamcloud.com/36802
Lustre-commit: 15278c6d32a5a9a7a2b8ac9e08c8702383e0c2ff

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia102d2bacdb0e54b0339985396447e6d25465c56
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37314
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12637 kernel: new kernel [RHEL 8.1 4.18.0-147.3.1.el8_1] 86/37186/4
Jian Yu [Fri, 3 Jan 2020 07:28:20 +0000 (23:28 -0800)]
LU-12637 kernel: new kernel [RHEL 8.1 4.18.0-147.3.1.el8_1]

This patch makes changes to support new RHEL 8.1 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.1 \
envdefinitions=SANITY_EXCEPT="411 817" \
testlist=sanity

Lustre-change: https://review.whamcloud.com/36946
Lustre-commit: 97e93c8f267a7d9fb9ee6d96b040236172a7f247

Change-Id: Ifcc0a15c3ad9afa99b670641f91b23c1a5c0668e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11385 lnet: check if current->nsproxy is NULL before using 13/37313/2
Sonia Sharma [Sat, 30 Mar 2019 08:32:34 +0000 (01:32 -0700)]
LU-11385 lnet: check if current->nsproxy is NULL before using

A crash is seen at a few sites in the function
rdma_create_id(current->nsproxy->net_ns, cb, dev, ps, qpt).
The issue is with the first parameter of this
function, current->nsproxy->net_ns. It is possible
that current->nsproxy is NULL, resulting in a
"kernel NULL pointer dereference" crash.

Handle the NULL case gracefully by adding
a check and using init_net if current or
current->nsproxy is NULL.
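
A minimal userspace sketch of the fallback, assuming simplified
stand-ins for the kernel structures. toy_net, toy_nsproxy, toy_task,
toy_init_net and toy_task_net_or_init() are hypothetical names used only
for illustration.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel structures named above. */
struct toy_net     { int id; };
struct toy_nsproxy { struct toy_net *net_ns; };
struct toy_task    { struct toy_nsproxy *nsproxy; };

/* Models the initial network namespace (init_net). */
static struct toy_net toy_init_net = { 0 };

/* Pick the task's network namespace, falling back to the initial one. */
static struct toy_net *toy_task_net_or_init(const struct toy_task *t)
{
    if (t == NULL || t->nsproxy == NULL)
        return &toy_init_net;
    return t->nsproxy->net_ns;
}
```

The caller can then pass the returned namespace on without
dereferencing a NULL nsproxy.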

Lustre-change: https://review.whamcloud.com/34577
Lustre-commit: ef1783e282f6eba9d69b0957f1b5fed00be0cbd6

Change-Id: I06349e081f2c4ba0480b3924fc304f94ca765891
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37313
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12853 ptlrpc: zero session enviroment 05/37305/2
Alexander Boyko [Mon, 14 Oct 2019 07:31:35 +0000 (03:31 -0400)]
LU-12853 ptlrpc: zero session enviroment

handle_recovery_req() sets le_ses for request processing
and doesn't zero it afterwards. This leads to accessing freed memory
in keys_fill() later.

The patch also cleans up the xxx_env_info functions, makes them
identical and combines them into a single function.

Lustre-change: https://review.whamcloud.com/36443
Lustre-commit: 2a620f07e23b3b044f429f049bcc5ffa96f6d844

Cray-bug-id: LUS-7676
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ifad95c1177258b6f71effe5fa815f68c8426c516
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37305
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13098 ptlrpc: supress connection restored message 15/37315/2
Alex Zhuravlev [Sat, 21 Dec 2019 15:40:20 +0000 (18:40 +0300)]
LU-13098 ptlrpc: supress connection restored message

if that happens on an idle connection.

Lustre-change: https://review.whamcloud.com/37086
Lustre-commit: 7aa58847b94d0ebb2796774a2de2183ba7f8cc4b

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I506665d427f3e77477f53e2d3059bcb1daaf0318
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12799 ptlrpc: return proper error code 64/37164/3
Alex Zhuravlev [Tue, 24 Sep 2019 20:29:01 +0000 (23:29 +0300)]
LU-12799 ptlrpc: return proper error code

from ptlrpc_disconnect_prep_req() using ERR_PTR()
as the callers expect.
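
For readers unfamiliar with the convention, here is a userspace
imitation of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. The
toy_request structure and toy_prep_req() are hypothetical; the sketch
only shows why a caller that tests IS_ERR() needs an encoded error
rather than NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers of the same names. */
static inline void *ERR_PTR(long error)    { return (void *)(intptr_t)error; }
static inline long  PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline bool  IS_ERR(const void *p)
{
    /* Error codes occupy the top MAX_ERRNO values of the address range. */
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct toy_request { int opcode; };

/* Hypothetical "prepare request" helper that fails on demand. */
static struct toy_request *toy_prep_req(struct toy_request *storage, bool fail)
{
    if (fail)
        return ERR_PTR(-ENOMEM); /* not NULL: the caller tests IS_ERR() */
    storage->opcode = 1;
    return storage;
}
```

Note that IS_ERR(NULL) is false, so a helper that returned NULL on
failure would look like success to such a caller.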

Lustre-change: https://review.whamcloud.com/36282
Lustre-commit: 9e2620d75cce1e1b4855704ddd9a994ce8e8d650

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Change-Id: I5493194a1f18f3d0b559921b7859bf835585ba58
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/37164
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13092 lbuild: include lbuild-{fc,rhel,sles} to SIGNATURE 12/37312/2
Wang Shilong [Thu, 9 Jan 2020 01:34:28 +0000 (09:34 +0800)]
LU-13092 lbuild: include lbuild-{fc,rhel,sles} to SIGNATURE

These files should be included when calculating SIGNATURE because, for
example, kernel extra tags can be bumped there.

Lustre-change: https://review.whamcloud.com/37076
Lustre-commit: b39e1e6e3e4ea396ad842ec3695f45cfd5dfb79e

Test-Parameters: trivial
Change-Id: I2c62ad765d3c6a1b9e99affe3be95a404d6140c5
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12988 osd: do not use preallocation during mount 55/37155/3
Alex Zhuravlev [Thu, 14 Nov 2019 15:13:16 +0000 (18:13 +0300)]
LU-12988 osd: do not use preallocation during mount

as a cold mballoc cache can cause a very lengthy search.

Lustre-commit: ae21fce625ec6cd134fa4764683f00bc692132cb
Lustre-change: https://review.whamcloud.com/36704

Change-Id: I821b023d392336f0085a96e821dc22e92dbf23b7
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/37155
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13087 target: init lcd last transno from reply data 87/37187/2
Mikhail Pershin [Thu, 5 Dec 2019 21:23:01 +0000 (00:23 +0300)]
LU-13087 target: init lcd last transno from reply data

Init lcd_last_transno value from reply data to keep it
valid so tgt_release_reply_data() will keep a slot with
the highest transno and on-disk data is not lost.

Lustre-change: https://review.whamcloud.com/37060
Lustre-commit: 52c1cbaa7db7505642b64b2d85448d506a444661

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id31b3b250616fb6afd3d145c31b12af30ac86be8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11911 lov: fix lov_iocontrol for inactive OST case 26/37226/2
Vladimir Saveliev [Fri, 1 Feb 2019 00:16:29 +0000 (03:16 +0300)]
LU-11911 lov: fix lov_iocontrol for inactive OST case

For inactive OSTs lov->lov_tgts[index]->ltd_exp is
NULL. lov_iocontrol() must check for that before dereferencing
lov->lov_tgts[index]->ltd_exp->exp_obd.

Lustre-change: https://review.whamcloud.com/34148
Lustre-commit: 0facd12afa33c61e4123f6e793d232d8c814fbec

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-6937
Test-Parameters: trivial
Change-Id: I4bb332ee2c50b07a1471035556f4d77a3559847f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13061 osp: check catlog FID after reading in 85/37185/3
Hongchao Zhang [Thu, 19 Dec 2019 02:52:29 +0000 (21:52 -0500)]
LU-13061 osp: check catlog FID after reading in

In osp_sync_llog_init, the catlog FID read from "CATALOGS"
should be checked for sanity.

Lustre-change: https://review.whamcloud.com/36998
Lustre-commit: 4597fa7d884de0f1a1b030052d4d34983fed6109

Change-Id: I4342b21b7d5c6d408a9ab52a1e30815ae1d5f563
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11770 misc: fix bdev_integrity_enabled definition 67/37167/2
Li Dongyang [Thu, 9 Jan 2020 10:33:24 +0000 (21:33 +1100)]
LU-11770 misc: fix bdev_integrity_enabled definition

Part of the patch was missed when it was backported
to b2_12. As a result, bdev_integrity_enabled is
always defined as a function that just returns false.

Change-Id: I9c9a83f3011f939e7f6d72140c08943d82a5416d
Fixes: b14e6617b9 ("LU-11770 osc: allow build without blk_integrity or crc-t10pi")
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/37167
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13059 kernel: kernel update RHEL7.7 [3.10.0-1062.9.1.el7] 61/36961/5
Jian Yu [Mon, 6 Jan 2020 07:47:24 +0000 (23:47 -0800)]
LU-13059 kernel: kernel update RHEL7.7 [3.10.0-1062.9.1.el7]

Update RHEL7.7 kernel to 3.10.0-1062.9.1.el7.

Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7

Change-Id: I65a65db8cf044b1b91d5b116746efda9383fcf48
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36961
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11867 osd-ldiskfs: FID in LMA mismatch won't block create 35/37135/3
Lai Siyao [Mon, 7 Jan 2019 03:37:48 +0000 (11:37 +0800)]
LU-11867 osd-ldiskfs: FID in LMA mismatch won't block create

Sometimes two OST objects may be mapped to the same inode, so the
FID of the second object mismatches the FID in the inode LMA. In this
case, if the inode has not been written yet, it is safe to set the
object inode to NULL so that a new inode is created.

Another case is when the mapped inode doesn't exist; it is also safe
not to initialize the inode and to return 0, so that create can succeed.

Add sanity-scrub.sh 4d for this.

Lustre-change: https://review.whamcloud.com/34052
Lustre-commit: cbf59ba6a56086c53a15622db7fa9f95d9798b7f

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ic84cdeaca2ea202ab0c01a0075a2f9ee8627f508
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12759 osc: don't re-enable grant shrink on reconnect 52/37152/2
Alexander Zarochentsev [Wed, 10 Jul 2019 18:37:33 +0000 (21:37 +0300)]
LU-12759 osc: don't re-enable grant shrink on reconnect

The client requests grant shrinking support on each
reconnect and re-enables the capability even if it was
explicitly disabled by lctl set_param.

Lustre-change: https://review.whamcloud.com/36177
Lustre-commit: efa3425c5f5a6763ea834408b982e4df5a90c914

Cray-bug-id: LUS-7585
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Change-Id: I87b1718022ee3346c9b177890a118410c5757458
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37152
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12026 mdt: MDS stores atime|mtime|ctime during close 69/36869/3
Qian Yingjin [Wed, 25 Sep 2019 09:14:12 +0000 (17:14 +0800)]
LU-12026 mdt: MDS stores atime|mtime|ctime during close

In order to make direct inode scanning on the MDT useful, in
addition to storing the file size/blocks via LSOM on the MDT, we
also need to store the atime/mtime/ctime on the MDT inodes.

Currently the atime is already lazily updated on the MDS (at
close time). In this patch, the final mtime/ctime are sent to the
MDS at close time and updated on the MDT inode, making MDT-only
scanning workable.

Lustre-change: https://review.whamcloud.com/36286
Lustre-commit: d2f7cb7934a0b38fa9503e8257f2b70ed656c11d

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4465281a03d70919c388cb241c16eebcb03e850f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36869
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13070 mdd: try old format for orphan names during recovery 29/37129/3
Artem Blagodarenko [Tue, 17 Dec 2019 09:12:36 +0000 (12:12 +0300)]
LU-13070 mdd: try old format for orphan names during recovery

An mdd_orphan_destroy() loop is caused by a compatibility issue on upgrade
to 2.11 or later. The format for names of orphans in the PENDING directory
was changed in Lustre 2.11. The old format names are not recognized by
mdd_orphan_destroy() in Lustre 2.11, but the compatibility code added to
handle this was incomplete, leading to an endless loop. There's a check
for the old format name, used in mdd_orphan_delete(), but that check
was not included in mdd_orphan_destroy().

This patch adds the compatibility check to mdd_orphan_destroy().

Lustre-change: https://review.whamcloud.com/37049
Lustre-commit: 05fca4be33067f24a02e527c88cff5b60a20bb39

Fixes: a02fd4573fe ("LU-7787 mdd: clean up orphan object handling")
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Cray-bug-id: LUS-8270
Change-Id: I9f42188dcb00f9d536996c14771de7df02502b40
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13043 quota: remove annoying message in osd_declare_inode_qid() 31/37131/3
Wang Shilong [Tue, 3 Dec 2019 06:32:22 +0000 (14:32 +0800)]
LU-13043 quota: remove annoying message in osd_declare_inode_qid()

The admin shouldn't be getting console error messages when a user goes
over quota (this would be happening continuously at some sites).

In some call paths, the "*flags" parameter may be NULL; don't try to
access it in that case.

As a general cleanup, move the QUOTA_FL_* flags over to a named enum
"enum osd_quota_local_flags" so that it is easier to see what this field
actually holds, rather than a totally generic "int *flags" argument that
has to be hunted through the code.

Lustre-change: https://review.whamcloud.com/36906
Lustre-commit: b3005155317b27e19c8029e6a9f92e69d0dd905e

Fixes: d30f9e6b6c5d ("LU-11425 quota: support quota for DoM")
Change-Id: Id5686ecdb8a943e48a2888067e321f83b8569188
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13077 pfl: cleanup xattr checking 32/37132/2
Sebastien Buisson [Fri, 13 Dec 2019 16:39:08 +0000 (01:39 +0900)]
LU-13077 pfl: cleanup xattr checking

Cleanup xattr checking in mdd and lod layers for PFL.

Lustre-change: https://review.whamcloud.com/37010
Lustre-commit: f765c6ceb8a4a2415a7956498f7fdaefa477ba55

Reported-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2841b615ee304785fbf316b829d8280eefc3878a
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37132
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12898 utils: %llu mismatch with type __u64 on ppcle64 30/37130/2
Olaf Faaland [Tue, 22 Oct 2019 16:44:51 +0000 (09:44 -0700)]
LU-12898 utils: %llu mismatch with type __u64 on ppcle64

Fix build errors like this one on ppcle64:

BUILDSTDERR: libmount_utils_zfs.c: In function 'zfs_mkfs_opts':
BUILDSTDERR: libmount_utils_zfs.c:573:5: error: format '%llu' expects
argument of type 'long long unsigned int', but argument 4 has type
'__u64' [-Werror=format=]
BUILDSTDERR:      mop->mo_device_kb * 1024);

__u64 was treated as an unsigned long long, which breaks the build on
ppc64le, where the two are not the same type.

In printf cases, cast to unsigned long long to match the printf format,
so that the format is compatible with the type and no data is
lost.

In the case of sscanf(), replace the call with strtoull() to eliminate
the issue.
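
The two fixes described above can be sketched in userspace C as follows.
This is a minimal illustration, not the code from the patch: uint64_t
stands in for __u64, and format_bytes() and parse_u64() are hypothetical
helper names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* uint64_t stands in for the kernel's __u64 in this userspace sketch. */

/* Format a size given in KB as bytes.  The cast makes the argument match
 * "%llu" on every ABI, whether the 64-bit type is unsigned long or
 * unsigned long long underneath. */
static int format_bytes(char *buf, size_t len, uint64_t kb)
{
	return snprintf(buf, len, "%llu", (unsigned long long)(kb * 1024));
}

/* Parse a decimal value with strtoull() instead of sscanf("%llu").
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_u64(const char *str, uint64_t *out)
{
	char *end;
	unsigned long long val;

	errno = 0;
	val = strtoull(str, &end, 10);
	if (errno != 0 || end == str || *end != '\0')
		return -1;
	*out = (uint64_t)val;
	return 0;
}
```

The cast never narrows the value, since unsigned long long is at least
64 bits wide.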

Lustre-change: https://review.whamcloud.com/36558
Lustre-commit: 56b4b112a497661de8dbf5a851c7a045d470deff

Test-Parameters: trivial
Change-Id: I02fd82e0be4d756881c15aa9faedb9b40961661a
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12791 kernel: kernel update RHEL 8.0 [4.18.0-80.11.2.el8_0] 28/36528/4
Jian Yu [Fri, 13 Dec 2019 07:04:15 +0000 (23:04 -0800)]
LU-12791 kernel: kernel update RHEL 8.0 [4.18.0-80.11.2.el8_0]

Update RHEL 8.0 kernel to 4.18.0-80.11.2.el8_0 for Lustre client.

Test-Parameters: trivial clientdistro=el8 \
testlist=sanity

Change-Id: I4081719fa9a8c83ea0e8bff46dc9d54774cabb56
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11656 llite: fetch default layout for a directory 72/37072/3
Jian Yu [Tue, 19 Nov 2019 22:19:24 +0000 (14:19 -0800)]
LU-11656 llite: fetch default layout for a directory

For a directory that does not have trusted.lov xattr, the current
"lfs getstripe" will only print the stripe_count, stripe_size,
and stripe_index that are fetched from the /sys/fs/lustre/lov values.
It doesn't show the actual default layout that will be used when
new files are created in that directory.

This patch fixes the above issue in ll_dir_getstripe_default() by
fetching the layout from root FID after ll_dir_get_default_layout()
returns -ENODATA from a directory that does not have trusted.lov xattr.

Lustre-change: https://review.whamcloud.com/36609
Lustre-commit: 3e8fa8a7396cd029cb0d7714a324343eed7f535e

Change-Id: Icbf1f8f4fa5e5b8788217fcb0cfd24a3b80a27d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37072
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11907 dne: allow access to striped dir with broken layout 39/36939/4
Lai Siyao [Sun, 14 Apr 2019 20:12:54 +0000 (04:12 +0800)]
LU-11907 dne: allow access to striped dir with broken layout

Sometimes the layout of striped directories may become broken:
* creation/unlink is only partially executed on some MDT.
* a disk failure or a stopped MDS makes some stripes inaccessible.
* software bugs.

In this situation, the directory should still be accessible,
and in particular it should be possible to migrate it to other
active MDTs.

This patch adds this support on both server and client: don't
assume the stripe FID is sane, and when a stripe doesn't exist,
skip it.

Add OBD_FAIL_MDS_STRIPE_FID to simulate insane stripe FID, and
OBD_FAIL_MDS_STRIPE_CREATE to simulate stripe creation failure.

Add sanity 60h.

Lustre-change: https://review.whamcloud.com/34750
Lustre-commit: d2725563e7afa17a41a53aa65255a31380606d23

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8a05a0e0cef8b051a935b3fa3d3e26c0b6ef3b4a
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36939
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11673 tests: replace obsolete '-o' to '||' 29/36929/3
James Nunez [Thu, 5 Dec 2019 16:32:46 +0000 (09:32 -0700)]
LU-11673 tests: replace obsolete '-o' to '||'

Since the -o and -a operators are marked as obsolete in the shell
test command ([), we need to switch from using [ expr1 -o expr2 ]
to [ expr1 ] || [ expr2 ].

Make this change for sanity tests.

This is a partial back port of:
Lustre-change: https://review.whamcloud.com/33670
Lustre-commit: 6d277f126df7605d402255333180b0ca03991613

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id87580d0280a716a6939a1203ae5b370e762d6ec
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12895 mdt: check if object exists first 32/37032/3
Sebastien Buisson [Thu, 31 Oct 2019 11:33:45 +0000 (20:33 +0900)]
LU-12895 mdt: check if object exists first

Make sure object exists before trying to get its attr.

Lustre-change: https://review.whamcloud.com/36629
Lustre-commit: ca68e3d677a371497586167a2318268db1d94cab

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=185a testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idb2cd5d6e3fdf7998040b933be54a001a0e5391b
Reviewed-on: https://review.whamcloud.com/37032
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12469 mdd: handle migrate case with SELinux 31/37031/2
Sebastien Buisson [Wed, 6 Nov 2019 12:51:55 +0000 (21:51 +0900)]
LU-12469 mdd: handle migrate case with SELinux

In case a metadata object is created for migration purpose,
its security context should not be initialized. The
security.selinux xattr will be copied after creation, just like
any other xattr, so that the migrated object has the right security
context.

Lustre-change: https://review.whamcloud.com/36684
Lustre-commit: 8a60fa2e2fcd28c2772d90e76d36430d30b01905

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=230 testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0bc274426c003f8081da2f4d1e8e6c12a70b9930
Reviewed-on: https://review.whamcloud.com/37031
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-12944 mdd: pass correct xattr size to lower layers 30/37030/2
Sebastien Buisson [Wed, 6 Nov 2019 17:31:08 +0000 (02:31 +0900)]
LU-12944 mdd: pass correct xattr size to lower layers

In mdd_iterate_xattrs(), struct lu_buf allocated to store xattr value
can be reused for multiple xattrs, because it is only reallocated if
it happens to be too small for one xattr.
As a consequence, lb_len field does not represent actual xattr's size.
It has to be adjusted when passed to lower layers.
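
A minimal userspace sketch of the idea, assuming a simplified stand-in
for struct lu_buf. The names sketch_buf, buf_ensure(), lower_set() and
store_value() are hypothetical and only illustrate passing the actual
value size down while the shared buffer keeps its allocated size.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct lu_buf: a pointer plus its length. */
struct sketch_buf {
	void *lb_buf;
	size_t lb_len;
};

/* Grow the buffer only if it is too small; otherwise reuse it as-is. */
static int buf_ensure(struct sketch_buf *b, size_t need)
{
	void *p;

	if (b->lb_len >= need)
		return 0;
	p = realloc(b->lb_buf, need);
	if (p == NULL)
		return -1;
	b->lb_buf = p;
	b->lb_len = need;
	return 0;
}

/* Hypothetical lower layer; it records the length it was handed. */
static size_t last_len_seen;

static void lower_set(const struct sketch_buf *b)
{
	last_len_seen = b->lb_len;
}

/* Copy one value into the shared buffer and hand the lower layer a view
 * whose length is the value's real size, not the allocated size. */
static int store_value(struct sketch_buf *shared, const void *val, size_t size)
{
	struct sketch_buf view;

	if (buf_ensure(shared, size) != 0)
		return -1;
	memcpy(shared->lb_buf, val, size);
	view.lb_buf = shared->lb_buf;
	view.lb_len = size;
	lower_set(&view);
	return 0;
}
```

Using a separate view leaves the shared buffer's lb_len equal to the
allocated size, so memory accounting on free stays correct.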

Lustre-change: https://review.whamcloud.com/36689
Lustre-commit: e5e584fd386a2229809bc64d440c3255cf50c1bd

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I26b54759b4e69fbac17a1032bbc724b796d78108
Reviewed-on: https://review.whamcloud.com/37030
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11956 mdd: do not reset original lu_buf.lb_len 29/37029/2
Li Dongyang [Thu, 27 Jun 2019 03:25:45 +0000 (13:25 +1000)]
LU-11956 mdd: do not reset original lu_buf.lb_len

In mdd_iterate_xattrs(), we are resetting the xbuf.lb_len
to a smaller value returned by linkea_overflow_shrink().

If that's the last xattr we are going to process, we could deduct
less than the originally allocated size from the obd_memory stats,
failing the memleak check later.

Lustre-change: https://review.whamcloud.com/35333
Lustre-commit: 94a5bc1bcb6c6373ead5b091ff5915dfe452377b

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I6175a91c61ceb0e37ab889d0cfd904f4993ab5cc
Reviewed-on: https://review.whamcloud.com/37029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-12965 obdclass: remove assertion for imp_refcount 66/37066/2
Li Dongyang [Wed, 13 Nov 2019 04:01:25 +0000 (15:01 +1100)]
LU-12965 obdclass: remove assertion for imp_refcount

After calling obd_zombie_import_add(), obd_import could
be freed by obd_zombie before we check imp_refcount with
LASSERT_ATOMIC_GE_LT. It's a use after free and could
crash the box.

Lustre-change: https://review.whamcloud.com/36743
Lustre-commit: dd71e74fecf45b81daa27c89c0b8065a58cac5c1

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I3d63acf2bff543924ca0e74a35d24c507d68f6aa
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37066
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-8207 scripts: add auto-stripe option to lfs_migrate 58/36958/2
Nathan Dauchy [Mon, 2 Jul 2018 14:21:35 +0000 (10:21 -0400)]
LU-8207 scripts: add auto-stripe option to lfs_migrate

Add a "-A" flag to lfs_migrate, which will automatically select the
stripe count as the file is rewritten. Initial algorithm to
determine stripe count is sqrt(size_in_GB)+1, with an additional cap
on object size, though the algorithm or thresholds could conceivably
change in the future.  The primary intent for this feature is to be
able to give users a tool to fix stripe settings on existing files
based on file size.
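
The initial formula can be sketched as below. lfs_migrate itself is a
shell script, so this C version is only an illustration. It assumes the
square root is truncated to an integer, and the max_stripes clamp is a
hypothetical parameter, not the object size cap described in this commit.

```c
#include <stdint.h>

/* Integer square root: the largest r such that r * r <= n. */
static uint64_t isqrt_u64(uint64_t n)
{
	uint64_t r = 0;

	while ((r + 1) * (r + 1) <= n)
		r++;
	return r;
}

/* Stripe count = sqrt(size_in_GB) + 1, optionally clamped to max_stripes.
 * A max_stripes of 0 means "no clamp" in this sketch. */
static unsigned int auto_stripe_count(uint64_t size_bytes,
				      unsigned int max_stripes)
{
	uint64_t size_gb = size_bytes >> 30;
	uint64_t count = isqrt_u64(size_gb) + 1;

	if (max_stripes != 0 && count > max_stripes)
		count = max_stripes;
	return (unsigned int)count;
}
```

Under these assumptions a file smaller than 1 GB gets one stripe, a
16 GB file gets five, and a 100 GB file gets eleven.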

A new "-C" flag specifies the object size cap.  On each OST, the
amount of space available for migration is capped by dividing the
free space of the smallest OST by the specified value.

A new "-M" flag allows OSTs with free space less than the specified
value to be considered unavailable for migration.

A new "-v" flag increases verbosity to help debug what is being done.

A new "-X" flag limits the amount of free space on each OST that
can be used for migration to the specified value.  This flag is
useful for testing by simulating OSTs that are nearly full.

A new sanity test verifies the operation of the new "-A" flag.

Lustre-change: https://review.whamcloud.com/20552
Lustre-commit: 99d7a8ed43be126b2769ad8bb0b5350cd328ed7f

Test-Parameters: trivial
Signed-off-by: Nathan Dauchy <nathan.dauchy@nasa.gov>
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I9ce8b64e028d9abb66b6b49cf7675263fd7202f0
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12826 mdt: limit root to change project state by default 56/37056/2
Wang Shilong [Tue, 22 Oct 2019 06:15:02 +0000 (14:15 +0800)]
LU-12826 mdt: limit root to change project state by default

The current project quota implementation allows users to
change the Project ID of files for which they have write
permission to any value. This is not useful if the project
quota is intended to be enforced instead of only being used
for quota accounting.

Change it so that by default only root can change the projid
of a file. Setting "mdt.*.enable_chprojid_gid" will allow
users with the specified numeric Group ID (eg. 1 = "admin") to
also change the projid of a file. Use "-1" to return the previous
behavior where all users can change the projid of their files.
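
The policy described above can be sketched as a small predicate. This is
an illustration only: may_change_projid() is a hypothetical name, the
write permission check is left out, and treating a value of 0 as
"root only" is an assumption about the default.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHPROJID_ANY_USER (-1L)

/* Decide whether a user may change a file's projid, given the value of
 * the enable_chprojid_gid tunable and the user's group list. */
static bool may_change_projid(unsigned int uid, const unsigned int *groups,
			      size_t ngroups, long enable_chprojid_gid)
{
	size_t i;

	if (uid == 0)
		return true;		/* root is always allowed */
	if (enable_chprojid_gid == CHPROJID_ANY_USER)
		return true;		/* previous behavior: everyone */
	if (enable_chprojid_gid <= 0)
		return false;		/* assumed default: root only */
	for (i = 0; i < ngroups; i++)
		if ((long)groups[i] == enable_chprojid_gid)
			return true;	/* member of the allowed group */
	return false;
}
```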

Lustre-change: https://review.whamcloud.com/36544
Lustre-commit: 8fad70c0872ba13133024e4abf53a0bbee7ba1e9

Change-Id: I91c138d29f4d0b9bc607528d86893451904c9892
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37056
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12928 gss: crash in sec2target_str() 99/36999/2
Yang Sheng [Thu, 7 Nov 2019 18:48:43 +0000 (02:48 +0800)]
LU-12928 gss: crash in sec2target_str()

The timer_setup() API has been in use since the 3.10.0-957.x
kernel. So change gck_timer to an embedded struct to avoid
crashes with the new timer API.
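
A userspace sketch of why embedding matters, under the assumption that
the callback receives a pointer to the timer itself and recovers the
enclosing object by pointer arithmetic. The sketch_timer and sketch_key
types are hypothetical stand-ins, not the kernel or GSS structures.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical timer whose callback is handed the timer itself. */
struct sketch_timer {
	void (*fn)(struct sketch_timer *t);
};

/* Hypothetical key object with the timer embedded, not pointed to. */
struct sketch_key {
	int gck_id;
	struct sketch_timer gck_timer;
	int fired;
};

/* The callback only works because gck_timer lives inside sketch_key. */
static void key_timeout(struct sketch_timer *t)
{
	struct sketch_key *key = container_of(t, struct sketch_key, gck_timer);

	key->fired = 1;
}
```

With a separately allocated timer, the same pointer arithmetic would
yield an address outside the key object.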

Lustre-change: https://review.whamcloud.com/36708
Lustre-commit: 5b40c9b90b44ddd0b042c12c10c65c9965a9856f

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ie12e21bca4169746016c8ac0e3ee4a125893ebf6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12920 build: replace ed with sed 14/37014/2
Minh Diep [Thu, 31 Oct 2019 14:26:03 +0000 (07:26 -0700)]
LU-12920 build: replace ed with sed

The ed command is very old, so use sed instead.

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/36630
Lustre-commit: 9e11ac388bd85967222dd5cb5ecade1d9b8f67a8

Change-Id: I18ffe50c3fb006182e68460c03a4d34d5011e62a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12967 tgt: clean up sync_on_cancel references 38/37038/2
Andreas Dilger [Thu, 14 Nov 2019 02:49:23 +0000 (19:49 -0700)]
LU-12967 tgt: clean up sync_on_cancel references

Clean up the use of "sync_on_cancel" in the code, since the tunable
parameter is named "sync_lock_cancel" and using the same name in
the code makes it easier to find the related parts.

Rename constants to be more consistent:
  NEVER_SYNC_ON_CANCEL    -> SYNC_LOCK_CANCEL_NEVER
  BLOCKING_SYNC_ON_CANCEL -> SYNC_LOCK_CANCEL_BLOCKING
  ALWAYS_SYNC_ON_CANCEL   -> SYNC_LOCK_CANCEL_ALWAYS

Initialize sync_lock_cancel_states[] with designated initializers
so that the state names always match the declared values.

Use ARRAY_SIZE() instead of needing NUM_SYNC_ON_CANCEL_STATES.
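
The last two points can be sketched as follows. The constant names come
from this commit; the numeric values, the state strings and the
sync_lock_cancel_lookup() helper are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum {
	SYNC_LOCK_CANCEL_NEVER    = 0,
	SYNC_LOCK_CANCEL_BLOCKING = 1,
	SYNC_LOCK_CANCEL_ALWAYS   = 2,
};

/* Designated initializers tie each name to its constant, so reordering
 * the enum cannot silently mismatch the table. */
static const char *const sync_lock_cancel_states[] = {
	[SYNC_LOCK_CANCEL_NEVER]    = "never",
	[SYNC_LOCK_CANCEL_BLOCKING] = "blocking",
	[SYNC_LOCK_CANCEL_ALWAYS]   = "always",
};

/* Map a state name to its value, or return -1 if it is unknown.  The
 * loop bound comes from the array itself, not a separate count. */
static int sync_lock_cancel_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(sync_lock_cancel_states); i++)
		if (strcmp(name, sync_lock_cancel_states[i]) == 0)
			return (int)i;
	return -1;
}
```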

Lustre-change: https://review.whamcloud.com/36754
Lustre-commit: 52a5981be4df863088168b3ea41fac9e29ddf060

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If7c6015420a5c3266a13798fd8b96539323ebbe5
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37038
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12967 ofd: restore sync_on_lock_cancel tunable 37/37037/2
Andreas Dilger [Thu, 14 Nov 2019 00:56:35 +0000 (17:56 -0700)]
LU-12967 ofd: restore sync_on_lock_cancel tunable

The "ofd.*.sync_on_lock_cancel" tunable was inadvertently replaced
during procfs->sysfs changes in 2.12 with "sync_lock_cancel".  Restore
the "sync_on_lock_cancel" tunable since it has existed since the 2.0
release and is definitely in use with several systems.

It isn't just a matter of restoring the old tunable name, since the
"mdt.*.sync_lock_cancel" name is also used since 2.8 and the code for
the two tunables was recently consolidated in the server target code.

Instead, keep the common "sync_lock_cancel" tunable name, add backward
compatibility for "sync_on_lock_cancel" for a number of releases, and
print a deprecation warning if the old name is used.

Fix up sanity.sh test_80 to check for both the old and new names,
but only if we actually need to change this tunable for ZFS, along
with minor test script style cleanups.

Fixes: 7059644e9ad3 ("LU-8066 ofd: migrate from proc to sysfs")

Lustre-change: https://review.whamcloud.com/36748
Lustre-commit: 7df7347b7b188e7168e094304fd6d2d985f7f274

Change-Id: Iffe65f6268d94075c71b96d42fe60ef11ac39448
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37037
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12856 target: check FLFLAGS are valid while accessing them 00/37000/2
Mikhail Pershin [Thu, 31 Oct 2019 20:44:38 +0000 (23:44 +0300)]
LU-12856 target: check FLFLAGS are valid while accessing them

While checking OBD_FL_SHORT_IO flag check first that OBD_MD_FLFLAGS
are valid.
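
A minimal sketch of the check order. The struct and the bit values below
are placeholders chosen for this example; the real OBD_MD_FLFLAGS and
OBD_FL_SHORT_IO definitions live in the Lustre headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only. */
#define SKETCH_MD_FLFLAGS  (UINT64_C(1) << 0)	/* o_flags field is valid */
#define SKETCH_FL_SHORT_IO (UINT32_C(1) << 1)	/* short I/O requested */

/* Hypothetical, reduced stand-in for the object attribute struct. */
struct sketch_obdo {
	uint64_t o_valid;
	uint32_t o_flags;
};

/* Only trust o_flags when the valid mask says it was filled in. */
static bool is_short_io(const struct sketch_obdo *oa)
{
	return (oa->o_valid & SKETCH_MD_FLFLAGS) != 0 &&
	       (oa->o_flags & SKETCH_FL_SHORT_IO) != 0;
}
```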

Lustre-change: https://review.whamcloud.com/36632
Lustre-commit: 707f5a982e895c9a484dcdb8d1644e3f63c7c5cc

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I04ac61141d70883c29a113fac3985ac81cc878af
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-9341 lod: Add special O_APPEND striping 07/37007/2
Patrick Farrell [Wed, 28 Aug 2019 16:54:37 +0000 (12:54 -0400)]
LU-9341 lod: Add special O_APPEND striping

Files opened with O_APPEND are almost always log files,
which generally stay small and do not benefit from being
striped widely.  Additionally, PFL files accessed with
O_APPEND are fully instantiated, meaning that because the
files usually stay small, these objects are usually wasted.

This patch adds special striping for files created with
O_APPEND.  This is controlled on the MDS by two new proc
variables:
mdd_append_stripe_count
mdd_append_pool

If the stripe count is set to 0 and the pool is not set,
this functionality is disabled and files created with
O_APPEND will be striped like any other file.

Lustre-change: https://review.whamcloud.com/35617
Lustre-commit: e2ac6e1eaa108eef3493837e9bd881629582ea1d

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I433d1b8c80488a851b8eb26c78cf5519a6cd75bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12671 mdd: rename mdd/sync_perm to sync_permissions 06/37006/2
James Simmons [Wed, 21 Aug 2019 20:16:13 +0000 (16:16 -0400)]
LU-12671 mdd: rename mdd/sync_perm to sync_permissions

Commit e783bbff accidentally renamed a sysfs variable when moving
it from proc. Change the sysfs file back to its proper name.

Test-Parameters: trivial testlist=replay-vbr

Lustre-change: https://review.whamcloud.com/35851
Lustre-commit: 55a7e2dcecaf482c40840840db2b0b795bad2bb9

Change-Id: I56e0534506271cf6760f775a9c8fa99b12683861
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37006
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-8066 mdd: migrate from proc to sysfs 05/37005/2
James Simmons [Thu, 15 Nov 2018 18:20:30 +0000 (13:20 -0500)]
LU-8066 mdd: migrate from proc to sysfs

Move the mdd module from using proc for most single value files
to sysfs. The more complex proc entries are moved to debugfs.

Lustre-change: https://review.whamcloud.com/33632
Lustre-commit: e783bbffe35b2b8ebebde5bc70abf288d07df5a3

Change-Id: I01eebf1c58f1a13c2f5e8c599a1363c80468b0bd
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
4 years agoLU-1957 tests: remove sanity test 180 from ALWAYS_EXCEPT 30/36930/2
Andreas Dilger [Mon, 26 Aug 2019 23:00:44 +0000 (17:00 -0600)]
LU-1957 tests: remove sanity test 180 from ALWAYS_EXCEPT

Remove test_180 from sanity ALWAYS_EXCEPT, since it should have been
fixed by landing LU-2803.

Lustre-change: https://review.whamcloud.com/35930
Lustre-commit: 72b59b85a253e508ec1b192fbf8cad840ca6ff2c

Fixes: e99f38594d2b ("LU-2803 osd: osd-zfs to handle echo sequence (2) properly")
Test-Parameters: trivial testlist=sanity fstype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7601164865baba8fe2db3ce7bb33fd4c81eb0291
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12769 recovery: use monotonic timer 37/36937/2
Alex Zhuravlev [Mon, 23 Sep 2019 08:26:19 +0000 (11:26 +0300)]
LU-12769 recovery: use monotonic timer

Use the monotonic clock instead of the real-time one. Also use
absolute values for the timer.

One of the reasons for the move from jiffies based timer
to a hrtimer timer was to avoid the issue of time drift.
It was discovered due to test failures with recovery on
VMs that the high resolution wall clock can drift as well.
Moving to the monotonic clock for the hrtimer avoids this
drift completely and it is safe to use since the recovery
timestamp is not shared between nodes.
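
A userspace analogue of the idea, assuming POSIX clock_gettime(). The
kernel hrtimer code is not shown here, and monotonic_ns(),
deadline_after() and deadline_expired() are hypothetical helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock, which is not affected by wall clock
 * adjustments, as nanoseconds. */
static int64_t monotonic_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Compute an absolute deadline once, instead of re-arming with
 * relative values. */
static int64_t deadline_after(int64_t now_ns, int64_t timeout_ns)
{
	return now_ns + timeout_ns;
}

/* An absolute deadline is compared directly against the current time. */
static bool deadline_expired(int64_t now_ns, int64_t deadline_ns)
{
	return now_ns >= deadline_ns;
}
```

Because the deadline is only ever compared with readings from the same
clock on the same node, it does not need to be meaningful elsewhere.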

Lustre-change: https://review.whamcloud.com/36274
Lustre-commit: 06408a4ef381121fa58783026a0cf0a6b0fa479c

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8b75121934c229dec8df7be0a4e69c1cda940d3f
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36937
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11762 ldlm: don't exceed hard timeout 36/36936/2
James Simmons [Thu, 4 Jul 2019 16:47:09 +0000 (12:47 -0400)]
LU-11762 ldlm: don't exceed hard timeout

For recovery, Lustre has both a soft timeout, obd_recovery_timeout,
and a hard timeout, obd_recovery_time_hard. When the recovery
timer is adjusted with the function extend_recovery_timer(), you
can control whether it takes into consideration what is left of
the timer. The current code is not very clear about its intent, so
this patch attempts to make the code understandable. No functional
change should happen with this patch.

Lustre-change: https://review.whamcloud.com/34408
Lustre-commit: 8bfe8939d810f5ac16484d3d4b81f829c7d7d0d7

Change-Id: I5701a6cd813ad64b6b4422863767af135eb8e94b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36936
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11673 tests: quote argument of -n conf-sanity 28/36928/2
James Nunez [Thu, 1 Aug 2019 21:23:14 +0000 (15:23 -0600)]
LU-11673 tests: quote argument of -n conf-sanity

Inside the single bracket test function '[', the argument
of the '-n' flag should be quoted.  If the -n argument is
not quoted, a blank value will cause the variable to
disappear, and the test then succeeds even though the value
is empty.  Quote the argument or use [[ ]].

conf-sanity test 79 has two cases where the ‘-n’ argument
is not quoted.  Let's correct this.

Lustre-change: https://review.whamcloud.com/35669
Lustre-commit: 443cc6e51f0202b9bc40c256259c4fc14ae3f7af

Test-Parameters: trivial envdefinitions=ONLY=79 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4b3a43de064d1992439dc25ecc7b0682520f74c9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11673 tests: quote argument of -n and test fix 27/36927/3
James Nunez [Fri, 2 Aug 2019 19:49:59 +0000 (13:49 -0600)]
LU-11673 tests: quote argument of -n and test fix

Inside the single bracket test function '[', problems arise when
the '-n' flag is given an unquoted argument.  The -n argument
should be quoted, or double brackets should be used for the test.

Quote the ‘-n’ argument in test-framework.sh functions.
This simple correction caused a few tests to fail.
Fix sanity test 65k to use the correct facets and check
for the mgs facet in convert_facet2label() to fix
replay-single test 58b.

Lustre-change: https://review.whamcloud.com/35080
Lustre-commit: 7e0cba246a7f2408c8266574a657e4459f691570

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I9655d2138c56c007207434f04b487b518bb3392e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36927
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12462 osc: layout and chunkbits alignment mismatch 77/36877/2
Vitaly Fertman [Thu, 8 Aug 2019 15:46:06 +0000 (18:46 +0300)]
LU-12462 osc: layout and chunkbits alignment mismatch

In the discard case, the OSC fsync/writeback code asserts
that each OSC extent is fully covered by the fsync request.

It may happen that the start (or the end) of a component does not
match the start (end) of the first (last) OSC object extent, which
is aligned by cl_chunkbits, which in turn depends on the OST block
size.

The requirement for the component alignment is LOV_MIN_STRIPE_SIZE,
which is 64K, while the ZFS block size could be in MBs.

Use the fsync region aligned by chunk size in the assertion.
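
The alignment can be sketched with byte offsets as below. The helper
names are hypothetical, and the real code may work in page indices
rather than bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round an offset down to the start of its chunk (size is 1 << bits). */
static uint64_t chunk_start(uint64_t off, unsigned int bits)
{
	return off & ~((UINT64_C(1) << bits) - 1);
}

/* Round an offset up to the last byte of its chunk. */
static uint64_t chunk_end(uint64_t off, unsigned int bits)
{
	return off | ((UINT64_C(1) << bits) - 1);
}

/* True if [ext_start, ext_end] lies inside the fsync range once that
 * range has been widened to chunk boundaries. */
static bool extent_covered(uint64_t ext_start, uint64_t ext_end,
			   uint64_t fsync_start, uint64_t fsync_end,
			   unsigned int bits)
{
	return chunk_start(fsync_start, bits) <= ext_start &&
	       ext_end <= chunk_end(fsync_end, bits);
}
```

For example, with a 1MB chunk (bits = 20) and a component starting at
64K, an extent covering bytes 0 to 1048575 is not covered by the raw
fsync range but is covered by the chunk-aligned one.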

Fixes: 092ecd6612 ("LU-12462 osc: Do not assert for first extent")

Lustre-change: https://review.whamcloud.com/35733
Lustre-commit: 7a9f7dec700c5c553396475daad272475f1b20be

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I2ff47fc87c838239142ffc63bebafce3e9403f4e
Cray-bug-id: LUS-7498
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36877
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12462 osc: Do not assert for first extent 76/36876/2
Patrick Farrell [Tue, 16 Jul 2019 16:28:25 +0000 (12:28 -0400)]
LU-12462 osc: Do not assert for first extent

In the discard case, the OSC fsync/writeback code asserts
that each OSC extent is fully covered by the fsync request.

This is not valid for the DOM case, because OSC extent
alignment requirements can create OSC extents which start
before the OST region of the layout (ie, they cross in to
the DOM region).  This is OK because the layout prevents
them from ever being used for i/o, but this same behavior
means that the OSC fsync start/end is aligned with the
layout, and so does not necessarily cover that first
extent.

The simplest solution is just to not assert on the first
extent.  (There is no way at the OSC layer to recognize the
DOM case.)

Lustre-change: https://review.whamcloud.com/35525
Lustre-commit: 092ecd66127eade284550b83192fa004ff55501b

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If66f8d81fb9dd4546a5647a10f6ca551e2cf98e3
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36876
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12899 build: rhel8 not install kernel-rpm-macros 39/37039/2
Qian Yingjin [Wed, 23 Oct 2019 01:43:24 +0000 (09:43 +0800)]
LU-12899 build: rhel8 not install kernel-rpm-macros

On RHEL8 kmodtool and kernel_module_package_buildreqs are not
installed with kernel-devel.

kernel_module_package_buildreqs is defined in kernel-rpm-macros.
If kernel-rpm-macros is not installed, the Lustre RPM build will
report:
"Dependency tokens must begin with alpha-numeric, '_' or '/':
BuildRequires: %kernel_module_package_buildreqs"

This patch helps the developer understand which packages are
required when kernel-rpm-macros is not installed.

Lustre-change: https://review.whamcloud.com/36557
Lustre-commit: 037840fb6b86d6083d55f3da5ad70d19d34cc5a5

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id9b855eeac97d780d9c572d306da3c3a1fa95ea6
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37039
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12503 vvp_dev: increment *pos in .next 35/37035/2
NeilBrown [Sun, 11 Aug 2019 15:43:40 +0000 (11:43 -0400)]
LU-12503 vvp_dev: increment *pos in .next

As described in

Commit ec2e9995e4c5 ("lustre: llite: change how "dump_page_cache" walks a hash table")

The .next function should increment *pos. For some reason it
didn't, and this can trigger the warning in that function.
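
A userspace sketch of the contract, modelled on a seq_file style .next
callback. table_next() is a hypothetical iterator over a plain array,
not the vvp_dev code.

```c
#include <stddef.h>

/* Return the next element, or NULL at the end.  The position is
 * advanced unconditionally, including on the final call, so the caller
 * can always tell that progress was made. */
static const int *table_next(const int *table, size_t count, long long *pos)
{
	++*pos;
	if (*pos < 0 || (unsigned long long)*pos >= count)
		return NULL;
	return &table[*pos];
}
```

After iteration ends, *pos equals the number of elements, which lets a
later start call see that the walk is finished.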

Lustre-change: https://review.whamcloud.com/35765
Lustre-commit: 02336a9a5d096dc9a603ed0e77e0c7cf7b41ffb3

Change-Id: If4ac748f455750d82712299b7915eb541a3ddc7e
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37035
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12503 llite: file write pos mimatch 34/37034/2
Bobi Jam [Wed, 27 Nov 2019 08:48:49 +0000 (16:48 +0800)]
LU-12503 llite: file write pos mimatch

In vvp_io_write_start(), the data may be written successfully but,
for some reason (e.g. out of quota), not committed or only partially
committed. The file's write position (kiocb->ki_pos) is then
pushed forward falsely, and the next iteration of the write
loop fails the assertion

ASSERTION( io->u.ci_rw.rw_iocb.ki_pos == range->cir_pos )

This patch corrects ki_pos if this scenario happens.

Lustre-change: https://review.whamcloud.com/36021
Lustre-commit: 1d2aa1513dc4e65813ad0bea138966a55244dbde

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib85b1a777da24cc935e5976beab2390052b4cec3
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12741 ptlrpc: do lu_env_refill for new request 36/37036/2
Mikhail Pershin [Fri, 8 Nov 2019 06:26:06 +0000 (09:26 +0300)]
LU-12741 ptlrpc: do lu_env_refill for new request

Perform lu_env_refill() prior to handling any new request.
This was already done in tgt_request_handle(); it is now moved
to ptlrpc_main() so that it applies to every handler,
e.g. ldlm_cancel_handler().

Lustre-change: https://review.whamcloud.com/36714
Lustre-commit: 3f304b75d24aea0075415affa0c0bef004ef012c

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic5d8bfbd845f7e131849078c016f7e13b91d072f
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37036
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12894 sec: fix checksum for skpi 28/37028/2
Sebastien Buisson [Tue, 29 Oct 2019 09:32:22 +0000 (18:32 +0900)]
LU-12894 sec: fix checksum for skpi

Compute the checksum on the message before actually comparing
it to the HMAC value.

Add test to exercise all SSK flavors.
Make sure zconf_mount includes the skpath mount option if SSK or
Kerberos is in use.

Lustre-change: https://review.whamcloud.com/36604
Lustre-commit: dcdf060342e7d69b64171840cf9475bf65d036ea

Fixes: a21c13d4df ("LU-8602 gss: Properly port gss to newer crypto api.")
Test-Parameters: envdefinitions=SHARED_KEY=true testlist=sanity-sec
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=skn testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=ska testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=ski testlist=sanity,recovery-small
Test-Parameters: envdefinitions=SHARED_KEY=true,SK_FLAVOR=skpi testlist=sanity,recovery-small
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7bcc3618c1824a0f0ca73219c7ac0ccc8405b946
Reviewed-on: https://review.whamcloud.com/37028
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years ago LU-11981 lnet: clean up error message 01/37001/3
Amir Shehata [Thu, 12 Dec 2019 18:11:34 +0000 (10:11 -0800)]
LU-11981 lnet: clean up error message

A message can sometimes be canceled. A canceled message should not
affect the interface health, and logging it as an error would be
noisy. Therefore, lower the message that logs this case from error
to debug.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I586dbfcdcfa38994db99dc5983240b38c9ee2770
Reviewed-on: https://review.whamcloud.com/37001
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11997 ptlrpc: Properly swab ll_fiemap_info_key 81/36481/5
Oleg Drokin [Fri, 27 Sep 2019 14:23:18 +0000 (10:23 -0400)]
LU-11997 ptlrpc: Properly swab ll_fiemap_info_key

The code was using lustre_swab_fiemap, which is incorrect because the
structures don't match.

Added lustre_swab_fiemap_info_key that swabs embedded
obdo and ll_fiemap_info_key structures.

Lustre-change: https://review.whamcloud.com/36308
Lustre-commit: 2b905746ee3b5d9dbafcdb1af5930aea18120a7b

Change-Id: Ie701163bd4c2072a0461b2d9485bc184c6548f8f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36481
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years ago LU-12935 obdclass: fix import connect flag printing 81/36881/2
Andreas Dilger [Tue, 5 Nov 2019 03:25:22 +0000 (20:25 -0700)]
LU-12935 obdclass: fix import connect flag printing

The obd_connect_names[] array holds strings for the OBD_CONNECT_*
and OBD_CONNECT2_* flag names.  It is positional, so every flag
bit needs a corresponding entry in the array.

The "async_discard" feature was backported to b2_12, but the two
earlier features "pcc" and (now removed) "plain_layout" were not
backported.  Add in strings for those features, and fill in some
earlier "unknown" flag names as well.
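
A minimal sketch of why a positional name table needs an entry for every bit,
assuming a simplified table: index i names bit i, so a missing entry would
shift every later name onto the wrong flag. The table contents and the
demo_flags2str helper are hypothetical and do not mirror the real
obd_connect_names[] entries.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical positional table: index i names bit i of the flag word. */
static const char *const demo_flag_names[] = {
    "flag_a",       /* bit 0 */
    "flag_b",       /* bit 1 */
    "unknown_0x4",  /* bit 2: placeholder keeps later names aligned */
    "flag_d",       /* bit 3 */
};
#define DEMO_NFLAGS (sizeof(demo_flag_names) / sizeof(demo_flag_names[0]))

/* Write comma-separated names of the set bits into buf; return the length. */
static size_t demo_flags2str(uint64_t flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < DEMO_NFLAGS; i++) {
        const char *name = demo_flag_names[i];
        size_t need = strlen(name) + (used ? 1 : 0);

        if (!(flags & (UINT64_C(1) << i)))
            continue;
        if (used + need >= len)
            break;                      /* no room left for this name */
        if (used)
            buf[used++] = ',';
        memcpy(buf + used, name, strlen(name) + 1);
        used += strlen(name);
    }
    return used;
}
```

Removing the "unknown_0x4" placeholder from this table would make bit 3 print
nothing and bit 2 print "flag_d", which is the kind of misprint a positional
table is prone to.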

Fixes: e5810126b3fb ("LU-11359 mdt: fix mdt_dom_discard_data() timeouts")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I883d236262805361be3f48c533d781878f9494fa
Reviewed-on: https://review.whamcloud.com/36881
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event 68/36868/3
Wang Shilong [Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)]
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event

It appears that when dm_dispatch_clone_request dispatches the "clone"
I/O request from the multipath device to the underlying (real)
device, the scsi driver can (often under load) return
BLK_MQ_RQ_QUEUE_DEV_BUSY. dm_dispatch_clone_request does not treat
that code as an exception the way it does BLK_MQ_RQ_QUEUE_BUSY, so
it calls dm_complete_request, which propagates the
BLK_MQ_RQ_QUEUE_DEV_BUSY error code up the stack. multipath_end_io
then calls fail_path and fails the path because an error value is
set.

Lustre-change: https://review.whamcloud.com/36699
Lustre-commit: 5c8b1e87a97bbe7b05f0b8325e98c16a0de1ff4c

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9341 utils: fix lfs find for composite files 35/36935/3
Andreas Dilger [Thu, 25 Jul 2019 05:29:26 +0000 (23:29 -0600)]
LU-9341 utils: fix lfs find for composite files

Running "lfs getstripe -c" on a composite file returns the stripe
count of the last initialized component, but "lfs find -c N" does
not find this file because it was summing the stripe_count of all
components.  "lfs find" should also check the stripe_count of the
last initialized component, as described in the man page.  Also use
the stripe_size of the last component instead of that of any component.
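
The difference between the two behaviours can be sketched as follows, assuming
a simplified in-memory view of a composite layout in which components are
stored in order. The demo_comp structure and both helper functions are
hypothetical and are not the lfs implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one component of a composite layout. */
struct demo_comp {
    bool initialized;           /* component has been instantiated */
    unsigned int stripe_count;
};

/* Stripe count of the last initialized component, or 0 if none is. */
static unsigned int demo_last_init_stripe_count(const struct demo_comp *c,
                                                size_t n)
{
    unsigned int count = 0;
    size_t i;

    assert(c != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (c[i].initialized)
            count = c[i].stripe_count;
    return count;
}

/* The earlier behaviour described above: sum over all components. */
static unsigned int demo_total_stripe_count(const struct demo_comp *c,
                                            size_t n)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += c[i].stripe_count;
    return total;
}
```

For components with stripe counts 1, 4 and an uninitialized 8, the first
helper yields 4, while the sum yields 13, so a search for a stripe count of 4
would miss the file under the summing behaviour.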

Add a test case for the correct usage.

Lustre-change: https://review.whamcloud.com/35611
Lustre-commit: 72479a52be5f77f601d8234d957f5d6176edf6e8

Fixes: 5a76aee24476 ("LU-8998 lfs: user space tools for PFL")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1f0097aa002b29febcbf183cab02519b202540e5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36935
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years ago LU-11673 tests: add space before ']' in test-framework 26/36926/2
James Nunez [Thu, 6 Jun 2019 13:48:13 +0000 (07:48 -0600)]
LU-11673 tests: add space before ']' in test-framework

The test command '[' expects spaces before all arguments
including the closing ']'.

Add a space before the closing ']' in the function
print_summary() in test-framework.sh.

Lustre-change: https://review.whamcloud.com/35079
Lustre-commit: 54e011a729fd656ae8568192763afe12425cd05e

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: If2365cb5f2b9c003949c6224997644c61341fe35
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>