Whamcloud - gitweb
Andreas Dilger [Sat, 5 Dec 2015 01:35:00 +0000 (09:35 +0800)]
LU-5030 utils: add -R parameter to lctl get_param
To allow printing all parameters under a specified directory.
This is needed to replace hard-coded /proc pathnames in sanity-sec.sh
test_24, but would also be useful for normal usage.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I4391afce06fd63a87f556f7b95bd0cb2883ebbe5
Reviewed-on: http://review.whamcloud.com/17081
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Tue, 15 Dec 2015 18:11:24 +0000 (18:11 +0000)]
Revert "LU-6910 osp: add procfs values for OST reserved size"
This is causing LU-7550 and LU-7552 test failures in sanity.
This reverts commit
0585b0fb5895a24f07ca32e830d1fa72b75f4f2b.
Change-Id: Ic332a54ace4998acc4ba2ceab6f76ef733f85be5
Reviewed-on: http://review.whamcloud.com/17617
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Andreas Dilger [Fri, 11 Dec 2015 20:47:10 +0000 (13:47 -0700)]
LU-7381 e2fsck: update recommended e2fsprogs version
Update the recommended e2fsprogs version to 1.42.13.wc4.
This includes a number of important fixes to e2fsck, which
fix corruption problems in a number of cases:
LU-7381 e2fsck: fix e2fsck -fD directory truncation
- http://review.whamcloud.com/17153
LU-7267 e2fsck: remove duplicated ea value size check
- http://review.whamcloud.com/16779
LU-7368 e2fsck: skip quota update when interrupted
- http://review.whamcloud.com/17150
LU-7267 e2fsck: remove duplicated ea value size check
- http://review.whamcloud.com/16779/
LU-6722 jbd: double minimum journal size for RHEL7
- http://review.whamcloud.com/15401
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1183394eec7717af1a163dbe7e4d93aa453ebbe5
Reviewed-on: http://review.whamcloud.com/17572
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Mon, 9 Nov 2015 15:51:59 +0000 (18:51 +0300)]
LU-7053 osd: don't lookup object at insert
the idea is to cache FID->ino/type mapping in per-thread cache
at declaration/object creation. then insert can find that information
and don't lookup object in LU/OI. this should avoid potential deadlock
with lu_object_find() and iget(). also, this should improve performance
as in the majority of cases required data is filled locally by create.
stats collected for sanity-benchmark:
lustre-MDT0000: 448306 created, lookups: 8910 in OI, 8910 in FLD
meaning we have to lookup ino 448K times and only 9K times we had
to use OI, in 439K cases we found ino in the cache.
Change-Id: Ifa66c2d074f04e47d0d85b735f57dc506aa65f4c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17092
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Sat, 5 Dec 2015 05:57:40 +0000 (08:57 +0300)]
LU-7408 target: declare write for reply data
declare reply_data at max possible offset - this ensures
enough credits reserved.
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I08466452e1e95b803f316abae777a8c8f4a8626e
Reviewed-on: http://review.whamcloud.com/17086
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Wed, 14 Oct 2015 16:26:45 +0000 (00:26 +0800)]
LU-7144 tests: skip scrub/lfsck test under interoperation
Since the scrub/lfsck test are only for server side logic,
it is unnecessary to test scrub/lfsck under interoperation
mode, skip them.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I044030b3bace787809d7cfd5622000b44a8be789
Reviewed-on: http://review.whamcloud.com/17520
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Tue, 17 Nov 2015 16:17:12 +0000 (08:17 -0800)]
LU-7450 osd: call commit_callback if no write updates
If it does not need write updates in some failure cases,
top_trans_stop should also call commit_callback to help
release the top_thandle in the commit list. Otherwise
it will stay in the commit list forever, as well as the
following top thandle, then update logs will be culmulated,
and cause long time recovery.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I1feaf0bd6d20f14dfabb4572f49818083e697dbb
Reviewed-on: http://review.whamcloud.com/17268
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Richard Henwood [Fri, 25 Sep 2015 22:05:51 +0000 (17:05 -0500)]
LU-7209 doc: more accurate documentation for obdfilter-survey
Make the the description of obdfilter-survey accurate and
precise.
Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Change-Id: Icdd4adf53643e91dc8a2539f63977aae5fe28fe0
Reviewed-on: http://review.whamcloud.com/16646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Omkar Kulkarni <omkar.kulkarni@intel.com>
Tested-by: Omkar Kulkarni <omkar.kulkarni@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Tue, 10 Nov 2015 15:31:55 +0000 (07:31 -0800)]
LU-7419 llog: lock new llog object creation
Lock the new llog object creation to avoid two
process create the same object at the same time.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Icdc0eec534ca2f15cd0e195df951416953195346
Reviewed-on: http://review.whamcloud.com/17132
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Amir Shehata [Fri, 27 Nov 2015 03:30:01 +0000 (19:30 -0800)]
LU-7475 lnet: ensure buffer config symmetry
When showing the configuration, make sure to add a buffers block
in the YAML output, if routing is configured, in order to allow
the same YAML block to be fed back for configuration to LNet.
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I3b269edf5b3688b500bbb3656c367bc82fff6b68
Reviewed-on: http://review.whamcloud.com/17370
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andrew Perepechko [Sat, 21 Nov 2015 13:55:24 +0000 (16:55 +0300)]
LU-6020 gss: properly map buffers to sg
A lot of buffer pointers passed to buf_to_sg() as input are coming
from vmalloc(), e.g. OBD_ALLOC_LARGE() in ptlrpc_add_rqs_to_pool().
sg_set_buf() uses virt_to_page() to map virtual addresses to
struct page, which does not work for vmalloc addresses.
The original code for buf_to_sg() caused the following crash:
BUG: unable to handle kernel paging request at
ffffeb040057c040
IP: [<
ffffffff81300367>] scatterwalk_pagedone+0x27/0x70
PGD 0
Oops: 0000 [#1] SMP
CPU 1
Pid: 2374, comm: ptlrpcd_3 Tainted: G O 3.6.10-030610-generic
RIP: 0010:[<
ffffffff81300367>] [<
ffffffff81300367>] scatterwalk_pagedone+0x27/0x70
RSP: 0018:
ffff8801a3c178a8 EFLAGS:
00010282
RAX:
ffffeb040057c040 RBX:
ffff8801a3c17938 RCX:
ffffeb040057c040
RDX:
0000000000000000 RSI:
0000000000000001 RDI:
ffff8801a3c17970
RBP:
ffff8801a3c178a8 R08:
00000000000005a8 R09:
ffff8801a3c17a40
R10:
ffff8801a30370d0 R11:
0000000000000a68 R12:
0000000000000010
R13:
ffff8801a3c17a08 R14:
ffff8801a3c17970 R15:
ffff88017d1c2c80
FS:
0000000000000000(0000) GS:
ffff8801afa40000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
ffffeb040057c040 CR3:
0000000001c0c000 CR4:
00000000001407e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Process ptlrpcd_3 (pid: 2374, threadinfo
ffff8801a3c16000, task
ffff8801a44e0000)
Stack:
ffff8801a3c178b8 ffffffff813004bd ffff8801a3c17908 ffffffff8130303f
ffff880100000000 ffffffff00000000 ffff8801a3c17908 ffff8801a3c17b18
ffffc90015f015a8 0000000000000000 0000000000000010 0000000000000010
Call Trace:
[<
ffffffff813004bd>] scatterwalk_done+0x3d/0x50
[<
ffffffff8130303f>] blkcipher_walk_done+0x8f/0x230
[<
ffffffff8130a39f>] crypto_cbc_encrypt+0xff/0x190
[<
ffffffffa0688660>] ? aes_decrypt+0x80/0x80 [aesni_intel]
[<
ffffffffa0a1a1e4>] krb5_encrypt_bulk+0x164/0x5b0 [ptlrpc_gss]
[<
ffffffffa0a1a812>] gss_wrap_bulk_kerberos+0x1e2/0x490 [ptlrpc_gss]
[<
ffffffffa0a1600e>] lgss_wrap_bulk+0x2e/0x100 [ptlrpc_gss]
[<
ffffffffa0a0d98e>] gss_cli_ctx_wrap_bulk+0x44e/0x650 [ptlrpc_gss]
[<
ffffffffa0ab867c>] sptlrpc_cli_wrap_bulk+0x3c/0x70 [ptlrpc]
[<
ffffffffa0aba2d0>] sptlrpc_cli_wrap_request+0x60/0x360 [ptlrpc]
[<
ffffffffa0a8cde4>] ptl_send_rpc+0x164/0xc30 [ptlrpc]
[<
ffffffffa07be957>] ? libcfs_debug_msg+0x47/0x50 [libcfs]
[<
ffffffffa0a80ee0>] ptlrpc_send_new_req+0x3b0/0x940 [ptlrpc]
[<
ffffffffa0a86530>] ptlrpc_check_set+0x8e0/0x1d50 [ptlrpc]
[<
ffffffff816ac9f6>] ? schedule_timeout+0x146/0x260
[<
ffffffffa0ab0c9b>] ptlrpcd_check+0x4eb/0x5d0 [ptlrpc]
[<
ffffffffa0ab105f>] ptlrpcd+0x2df/0x420 [ptlrpc]
[<
ffffffff8108efa0>] ? try_to_wake_up+0x200/0x200
[<
ffffffffa0ab0d80>] ? ptlrpcd_check+0x5d0/0x5d0 [ptlrpc]
[<
ffffffff8107c5f3>] kthread+0x93/0xa0
[<
ffffffff816b8d04>] kernel_thread_helper+0x4/0x10
[<
ffffffff8107c560>] ? flush_kthread_worker+0xb0/0xb0
[<
ffffffff816b8d00>] ? gs_change+0x13/0x13
Change-Id: I346d50568b65ed10da2762ca34562fc2858a05d8
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Xyratex-bug-id: SNT-15
Reviewed-on: http://review.whamcloud.com/17319
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jian Yu [Fri, 20 Nov 2015 19:13:18 +0000 (11:13 -0800)]
LU-5690 mount: fix lmd_parse() to handle commas in expr_list
The lmd_parse() function parses mount options with comma as
delimiter without considering commas in expr_list as follows
is a valid LNET nid range syntax:
<expr_list> :== '[' <range_expr> [ ',' <range_expr>] ']'
This patch fixes the above issue by using cfs_parse_nidlist()
to parse nid range list instead of using class_parse_nid_quiet()
to parse only one nid.
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I8ba6ee6eb31b4bb078a83d9db213cfca27b0fe66
Reviewed-on: http://review.whamcloud.com/17036
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Mikhail Pershin [Thu, 20 Aug 2015 08:28:06 +0000 (11:28 +0300)]
LU-6714 llog: test on-disk llog header values
llog_test_2():
- re-enable the disabled llog_open() test cases.
- Checks that llog_log_hdr values are written atomically
with the llog record addition/cancelling.
Patch contains also minor fixes:
- verify_handle() does header sanity checks at first then
checks amount of records against expected value.
- llog_test_3: rename test_3 static variables to show
that they are related to the test_3.
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Iedcf15c8f365f9c2021abae3325edcaf08efc4c9
Reviewed-on: http://review.whamcloud.com/16287
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Niu Yawei [Tue, 25 Aug 2015 03:07:34 +0000 (23:07 -0400)]
LU-7030 security: put imp_sec after all requests drained off
imp_sec should be put after all requests being drained off.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I35f572fcc79b2bd1991db14577226a3ea735630d
Reviewed-on: http://review.whamcloud.com/16071
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Oleg Drokin [Wed, 9 Dec 2015 02:44:18 +0000 (21:44 -0500)]
LU-7530 mdt: Do not leak identity when no nodemap is present
It looks like sometimes nodemap structure on the export is not there
due to a race in old_init_ucred_common.
Move the nodemap check to the start not to leak identity reference
in such a case.
The bug was introduced in commit
2aea469a3a6e214d from LU-7199
Also silence the warning as there's nothing sysadmins could do when
it happens.
Change-Id: I5329ccb16201a71a263eb586e3a486b26ff238db
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17519
Tested-by: Jenkins
Reviewed-by: Kit Westneat <kit.westneat@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Alexander Boyko [Mon, 10 Aug 2015 11:40:28 +0000 (14:40 +0300)]
LU-6910 osp: add procfs values for OST reserved size
osp_pre_status=-ENOSPC is used to skip OST from object allocation.
The error was set when OST available space is less than 0.1% of total
OST size. This value is not configurable, so procfs files was
added:
reserved_mb_low - low watermark, if available space is less
than it, object allocation is stopped.
reserved_mb_high - highw watermark, if available space is more
than it, object allocation is enabled.
By default ~0.1% is reserved as low watermark. The high watermark
is twice bigger than the low by default.
High and low watermark could be changed by:
lctl set_param osp.lustre-OST0000-osc-MDT0000.reserved_mb_high=1024
When object allocation is disabled, a clients could appened to
existing files. And 0.1% is too low for them. For example, OST size
is 8TB, 0.1% is 8GB, if cluster has 1k clients, reserved space is
~8MB per client. The main reason of the patch is ability to increase
reserved space.
Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Xyratex-bug-id: MRP-2606
Change-Id: Ie48cc1a232f64aa7dc922000861004277fb47340
Reviewed-on: http://review.whamcloud.com/15731
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
John L. Hammond [Mon, 7 Dec 2015 17:40:54 +0000 (11:40 -0600)]
LU-7136 test: allow more time for copytools to stop
In sanity-hsm allow up to 200 seconds for the copytool to stop. This
is needed to prevent sporadic failures due to slow NFS (for the
copytool log file) delaying the termination of the copytool.
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Icafc4e9c5a00c849dcb479233826de058d2ede62
Reviewed-on: http://review.whamcloud.com/17499
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Nathaniel Clark [Thu, 25 Jun 2015 21:24:34 +0000 (17:24 -0400)]
LU-6767 osd-zfs: Track readonly status of ZFS
Return READONLY from osd_statfs() if underlying ZFS has been set to
READONLY, or if osd_ro() has been called. This adds a callback for
ZFS_PROP_READONLY for when it's changed.
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib7f35925904b1d93f9a457936585e9783635c849
Reviewed-on: http://review.whamcloud.com/15400
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Li Xi [Fri, 22 May 2015 04:07:00 +0000 (12:07 +0800)]
LU-6229 utils: fix lustre_rsync bug of cascade move
When replaying the changelog, destination files have to be
put into a special directory, if their parent directory is
possessing a different path other than the ultimate path
because of renaming. With the replaying process going on,
when the parent directory is being moved to the ultimate path,
the child files should be moved under the parent directory
which is called cascade move.
As long as a directory has child files under sepcial direcoty,
cascade move should happen, no matter the direcotry is being
renamed from sepcial direcoty or not. This patch fixes the problem
that cascade move is missing when the direcotry is being renamed
from ordinary path.
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I2d21604b81fe0cf08df1af2bfccc90a32986bf05
Reviewed-on: http://review.whamcloud.com/14914
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Chennaiah Palla [Thu, 3 Dec 2015 10:57:16 +0000 (16:27 +0530)]
LU-7515 obdclass: add export for lprocfs_stats_alloc_one()
When compiling Lustre without optimization, when using GCOV,
the lprocfs_stats_alloc_one() symbol is not properly exported to other modules
and causes the ptlrpc module to fail loading with an unknown symbol.
Added EXPORT_SYMBOL(lprocfs_stats_alloc_one) so that this works properly.
Seagate-bug-id: MRP-3188
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: I8ef02a0e0bf519fa93f85cb162a6340e3feeb736
Reviewed-on: http://review.whamcloud.com/17443
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Lai Siyao [Tue, 13 Oct 2015 09:22:40 +0000 (17:22 +0800)]
LU-3569 utils: remove ll_recover_lost_found_obj
remove obsolete tool ll_recover_lost_found_obj.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I5d5f33f5c9d68bb1f05d7ab0da6fb2986e873501
Reviewed-on: http://review.whamcloud.com/16477
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bobi Jam [Fri, 13 Nov 2015 10:48:53 +0000 (18:48 +0800)]
LU-7446 clio: lov_io_init() should return error code
lov_io_init_empty/release() should returns error code instead of
true on error case.
Fault IO need handle restart in the case of accessing HSM released
file.
Add a test case.
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I4953c12c1e9b82a16aed9b8b1e3fe6e38d783b24
Reviewed-on: http://review.whamcloud.com/17240
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Hiroya Nozaki [Tue, 16 Jun 2015 05:41:00 +0000 (14:41 +0900)]
LU-6732 llite: ll_write_begin/end not passing on errors
Because of a implementation of generic_perform_write(), write(2)
may return 0 with no errno even if EDQUOT or ENOSPC actually
happend in it.
This patch fixes the issue with setting a proper errno to
ci_result and get it in ll_file_io_generic.
Signed-off-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com>
Change-Id: I3fc986b57d703ad5fbf41e1ea8182d2d561e8005
Reviewed-on: http://review.whamcloud.com/15302
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Thu, 25 Apr 2013 04:23:57 +0000 (22:23 -0600)]
LU-1606 misc: clean up DFID related error messages
Improve the error messages related to DFID output and parsing left
over from removal of LPU64/LPX64 usage in userspace.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I4b4fcb3cc389b8d8ec4375fa92bfee9b353ebbe5
Reviewed-on: http://review.whamcloud.com/6156
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
James Nunez [Tue, 5 May 2015 17:53:30 +0000 (11:53 -0600)]
LU-2524 test: Clean up sanity-quota
Conduct miscellaneous cleanup to sanity-quota including:
Removing the `-p` (parents) option from many calls to mkdir
Replace `lfs setstripe` with $SETSTRIPE
Added check for and call to `error` with error messages for a variety
of common routines ,like mkdir, or for functions that return a value.
Replace `…` with $(...)
Removed linefeed escape after |, ||, & and && operators.
Modified parameters in test 4b so that the --inode-grace value exceeds
the valid range.
Removed unused variables
Removed $ from variables inside $(())
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iadedea0bd0a0f85235e0bb908ee0a6ed36503eb3
Reviewed-on: http://review.whamcloud.com/14680
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Oleg Drokin [Mon, 14 Sep 2015 04:18:46 +0000 (00:18 -0400)]
LU-7148 osc: Remove remains of osc_ast_guard
osc_ast_guard has been removed by the clio simplification.
Remove the extern declaartion and lock class definition.
Change-Id: Ibcf14e7aebe1dab8b586d3cd8d81560f6d3dcc81
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/16392
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Niu Yawei [Thu, 8 Oct 2015 02:14:56 +0000 (22:14 -0400)]
LU-5951 ptlrpc: track unreplied requests
The request xid was used to make sure the ost object timestamps
being updated by the out of order setattr/punch/write requests
properly. However, this mechanism is broken by the multiple rcvd
slot feature, where we deferred the xid assignment from request
packing to request sending.
This patch moved back the xid assignment to request packing, and
the manner of finding lowest unreplied xid is changed from scan
sending & delay list to scan a unreplied requests list.
This patch also skipped packing the known replied XID in connect
and disconnect request, so that we can make sure the known replied
XID is increased only on both server & client side.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ic98e1599085871c0ac08d28609a044c79d5af75d
Reviewed-on: http://review.whamcloud.com/16759
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Grégoire Pichon <gregoire.pichon@bull.net>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jeremy Filizetti [Tue, 1 Dec 2015 19:54:10 +0000 (14:54 -0500)]
LU-7508 ldlm: Don't check opcode with NULL rq_reqmsg
When GSS is enabled it's possible to have a NULL rq_reqmsg
if a bad signature or no context is returned during the unwrap
of the request. Don't check the opcode in this case.
Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: I3a74dff7638b318190c5c4ad73acbe7ec299aa80
Reviewed-on: http://review.whamcloud.com/17414
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Fan Yong [Thu, 17 Sep 2015 16:40:03 +0000 (00:40 +0800)]
LU-7268 scrub: NOT assign LMA for EA inode
Originally, when OI scrub scans the device, if the target inode has
no FID-in-LMA EA, then it will generate an IGIF mode FID and store
it in the LMA EA. Such behavior is not suitable if the target inode
is used for large EA. The OI scrub should skip the EA inode that is
marked as "LDISKFS_EA_INODE_FL".
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I52b05b864ef8a2797a2f3dda0f80f95227809c34
Reviewed-on: http://review.whamcloud.com/17043
Tested-by: Jenkins
Reviewed-by: Kalpak Shah <kalpak.shah@seagate.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
John L. Hammond [Wed, 25 Feb 2015 22:12:49 +0000 (16:12 -0600)]
LU-6298 hsm: shutdown HSM CDTs in parallel
In sanity-hsm.sh rewrite copytool_cleanup() to shutdown and restart
the MDT HSM coordinators in parallel. This saves about 8 * (MDSCOUNT -
1) seconds per call.
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I75445ad126dc73251a3d056611133e3ab6b83362
Reviewed-on: http://review.whamcloud.com/13901
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bruno Faccini [Fri, 20 Nov 2015 14:38:11 +0000 (15:38 +0100)]
LU-5921 tests: enhance server target mount race testing
This patch is a follow on to LU-5299 to strengthen and enhance
concurrent server target mount race testing.
It uses OBD_RACE() feature to better set a concurrent/racy
situation, and also allow to handle all mount errors instead
of only EALREADY.
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I16a94e5aa046e15096d2e55d57e22899a93fa03f
Reviewed-on: http://review.whamcloud.com/17302
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Tue, 3 Nov 2015 15:32:13 +0000 (07:32 -0800)]
LU-7383 mdt: retry for busy lock during migration
In migration, if the lock of the migrating object
is being cached on other node, it should revoke
the lock and retry, instead of return -EBUSY.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I1317681a892b9a21f2c78d7696ca6f94d43bd9bc
Reviewed-on: http://review.whamcloud.com/17048
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Oleg Drokin [Tue, 8 Dec 2015 04:42:39 +0000 (23:42 -0500)]
New tag 2.7.64
Change-Id: I79fb95af8bc9e979edf3214315219f786eb12599
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Bob Glossman [Fri, 4 Dec 2015 19:50:15 +0000 (11:50 -0800)]
LU-7428 test: disable conf-sanity, test_84
Add failing test to ALWAYS_EXCEPT.
This is a temprorary workaround until a real
fix for the test failure is developed.
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I73e00d658a9e7728ce52b5dc90741e9a18ce15f9
Reviewed-on: http://review.whamcloud.com/17482
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Emoly Liu [Mon, 23 Nov 2015 07:30:10 +0000 (15:30 +0800)]
LU-7437 lctl: list_param -R can't work correctly
We shouldn't call lprocfs_param_pattern() inside listparam_display(),
otherwise it will add the prefix "/proc/{fs,sys}/{lnet,lustre}" each
time, so that the parameters can be listed recursively. The similar
issue in {set/get}param is fixed as well.
Also, this patch adds sanity.sh test_401 to verify this function.
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I21acec364cdbdfc025979153f66a87d44c9136e8
Reviewed-on: http://review.whamcloud.com/17223
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Olaf Faaland [Thu, 22 Oct 2015 20:31:05 +0000 (13:31 -0700)]
LU-7297 osd-zfs: initialize oh_lock
The ZFS osd was not initializing od_brw_stats.hist[].oh_lock.
This rectifies that.
Change-Id: I3f637b73c77908c2297bfab97e33eca63b0d5986
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/16919
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Wang Shilong [Sat, 11 Jul 2015 03:49:55 +0000 (11:49 +0800)]
LU-1026 ldiskfs: make bitmaps corruption not fatal
We still hit bitmaps problems for rhel6 series kernel,
corruptions happen because ext4_mb_check_ondisk_bitmap()
check failed and FS become RO again:
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group
294corrupted: 20180 blocks free in bitmap, 20181 - in gd
Aborting journal on device dm-6-8.
LDISKFS-fs (dm-6): Remounting filesystem read-only
ldiskfs_mb_new_blocks: Updating bitmap error: [err -30]
[pa
ffff880d9d6e4d68] [phy
14974976] [logic 8192] [len 3072]
[free 3072] [error 1] [inode 278678]
ldiskfs_ext_new_extent_cb: Journal has aborted
this might be caused by some ext4 internal bugs, this patch
did the following things:
1.Inside ext4_read_block_bitmap() have gaven reasons
why it failed, so caller don't need call ext4_error() again.
2. mark block group corrupt and use ext4_warning() instead
of ext4_error().
There are still some bitmaps corruption places not handling,
let's keep it for now, and if it really hurt, let's add the
same handling codes logic later.
Tested by following scripts:
TEST_DEV="/dev/sdb"
TEST_MNT="/mnt/ext4"
mkdir -p $TEST_MNT
mkfs.ext4 -F $TEST_DEV >&/dev/null
mount -t ldiskfs $TEST_DEV $TEST_MNT
dd if=/dev/zero of=$TEST_MNT/largefile
oflag=direct bs=
10485760 count=200
umount $TEST_MNT
dd if=/dev/zero of=$TEST_DEV bs=4096 seek=641
count=10 oflag=direct
mount -t ldiskfs $TEST_DEV $TEST_MNT
rm -f $TEST_MNT/largefile
dd if=/dev/zero of=$TEST_MNT/largefile oflag=direct
bs=
10485760 count=200 && echo
"FILESYSTEM still usable after bitmaps corrupts happen"
dmesg | tail
umount $TEST_MNT
e2fsck $TEST_DEV -y
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iabb6ebf719d80d9ba4f41bee0b237e304212832b
Reviewed-on: http://review.whamcloud.com/16679
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Sat, 3 Oct 2015 01:45:32 +0000 (09:45 +0800)]
LU-7315 osd-ldiskfs: handle pdo lock properly
Inside the osd_dirent_check_repair(), if the logic comes to
"goto again", it only unlock the "hlock" but without seting
the variable @hlock as NULL. Althouth it will not cause any
logic failure, it may make the readers to be confused. This
patch will set "hlock = NULL;" explicitly to avoid trouble.
On the other hand, inside ldiskfs, the pdo lock users need
to check whether the lock handler is NULL or not properly.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9db9dc758a2976849c299f76e06723e796da235d
Reviewed-on: http://review.whamcloud.com/16924
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jinshan Xiong [Mon, 1 Jun 2015 18:08:07 +0000 (11:08 -0700)]
LU-6856 zfs: handle non existing file in osd_object_ref_del
Remove false assertion in zfs:osd_object_ref_del() because this
may be in the cleanning path of error handling.
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ib7b9d80816bdab7f68b36a33e95140ea7f3eae8c
Reviewed-on: http://review.whamcloud.com/15611
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Hongchao Zhang [Mon, 16 Nov 2015 02:33:51 +0000 (10:33 +0800)]
LU-6802 ptlrpc: reset imp_replay_cursor
At client side, the replay cursor using to speed up the lookup
of committed open requests in its obd_import should be resetted
for normal connection (not reconnection) during recovery.
Change-Id: I68816780a5d79053d9109cb68ae1c3b8ea13ede8
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/17351
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Sun, 4 Oct 2015 18:26:26 +0000 (11:26 -0700)]
LU-6693 out: not return NULL in object_update_param_get
Return ERR_PTR in object_update_param_get() for all cases to
avoid unnecessary confusion to callers.
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Idfcc19d99bbf308759481b3d60d95341745d19e8
Reviewed-on: http://review.whamcloud.com/16417
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Nathaniel Clark [Mon, 19 Oct 2015 18:13:49 +0000 (14:13 -0400)]
LU-7316 build: Update ZFS/SPL version to 0.6.5.3
Bug Fixes
* Fix CPU hotplug zfsonlinux/spl#482
* Disable dynamic taskqs by default to avoid deadlock
zfsonlinux/spl#484
* Don't import all visible pools in zfs-import init script
zfsonlinux/zfs#3777
* Fix use-after-free in vdev_disk_physio_completion
zfsonlinux/zfs#3920
* Fix avl_is_empty(&dn->dn_dbufs) assertion zfsonlinux/zfs#3865
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I36347630be2506bee4ff0a05f1b236ba2ba7a0ae
Reviewed-on: http://review.whamcloud.com/16877
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Al Viro [Thu, 26 Nov 2015 15:10:02 +0000 (10:10 -0500)]
LU-4423 lnet: don't use iovec instead of kvec
Replace struct iovec with struct kvec.
Linux commit:
f351bad2b4b4bb74810ad4f127f6602e2d2ae403
Change-Id: Ib7bb49069e42ca82d66a149617361c73ee4d710d
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: http://review.whamcloud.com/17205
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Sat, 21 Nov 2015 15:33:29 +0000 (07:33 -0800)]
LU-7462 mdd: check object existence
Check object existence in mdd_is_parent()
and _mdd_lookup(), so the following retrieving
attributes will not panic if object does not
exist.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Icae243949e202f9d6a4d38b9823373101a093c74
Reviewed-on: http://review.whamcloud.com/17323
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Bob Glossman [Mon, 2 Nov 2015 23:39:37 +0000 (15:39 -0800)]
LU-7375 lbuild: add missing case to lbuild
Add the missing case of el6.7 to autodetect_target
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I6a8032eb03b2fb3002798aa49ee585879cf711c3
Reviewed-on: http://review.whamcloud.com/17024
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bob Glossman [Tue, 24 Nov 2015 19:29:43 +0000 (11:29 -0800)]
LU-7103 test: avoid cat of /dev/urandom
Use a different command to generate a random number.
Using cat /dev/urandom doesn't work in all cases.
Test-Parameters: clientdistro=el7
testlist=ost-pools,ost-pools,ost-pools,ost-pools,ost-pools,ost-pools,ost-pools,
ost-pools,ost-pools,ost-pools,ost-pools,ost-pools,ost-pools,ost-pools
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I63cfad2654cf654460263529457538965e373f82
Reviewed-on: http://review.whamcloud.com/17350
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Sat, 3 Oct 2015 07:49:46 +0000 (15:49 +0800)]
LU-7447 lfsck: correct nlink attr for new created dir
For new created directory object, there are three main operations:
1) insert dot entry
2) insert dotdot entry
3) ref_add on the directory object.
Sometimes, the 3rd step maybe ahead of 1st, sometimes maybe between
1st and 2nd. Usually, the developers would think that such order is
not important. But in fact, such assumption isn't true, because the
ldiskfs will set the new created directory object's nlink attr as 2
when inserting dotdot entry. Then if ref_add() is called after ".."
inserted, the empty directory object's nlink attr is 3.
To avoid above trouble and make the OSD API to be order-independent,
we will make the osd-ldiskfs to backup the new created dir object's
nlink attr before calling ldiskfs_add_dot_dotdot(), then restore it
after ldiskfs_add_dot_dotdot() called.
On the other hand, this patch also adds some missed logic for lfsck
declaring to insert dot/dotdot entries, remove some redundant logic.
Add e2fsck check for on-disk consistency verification after LFSCK.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I065f683388417f3e23d47471c065419bd9bfcb19
Reviewed-on: http://review.whamcloud.com/17269
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Li Xi [Fri, 6 Nov 2015 09:06:08 +0000 (17:06 +0800)]
LU-7371 test: wrong read length over isize
This patch adds tests to check read length is correct if reading
a file of size 4095.
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I0c0f6b378fa4af053ed54f2f5dea2418191a7b69
Reviewed-on: http://review.whamcloud.com/17060
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Amir Shehata [Fri, 24 Jul 2015 21:14:38 +0000 (14:14 -0700)]
LU-6851 lnet: Ignore hops if not explicitly set
Since the # of hops is not a mandatory parameter the LU-6060
patch will cause problems to already existing systems since it
changes the behavior by which a route is determined down.
To fix this case the # of hops now defaults to LNET_UNDEFINED_HOPS
if no hop count is specified.
LNET_UNDEFINED_HOPS is defined to ((__u32)-1). When it's printed as
%d, it displays as -1.
__u32 is used through out the call stack for hop count to explicitly
define the size of the hop count and to avoid any sizing issues when
passing data to and from the kernel.
To keep existing behavior both lnet_compare_routes() and LNetDist()
will treat undefined hop count as hop count 1.
When executing the logic in lnet_parse_rc_info() there is no
longer an assumption that the default hop count is 1. If
the hop count is 1 then it must've been explicitly set by
the user.
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I1a28a35a4edc2437cf95cb9d455e59c8102736fa
Reviewed-on: http://review.whamcloud.com/15719
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Fri, 6 Nov 2015 07:20:53 +0000 (10:20 +0300)]
LU-7400 lod: register stop callbacks at create
In some cases we stop just created transaction (i.e. it's
empty, not started), but then top_trans_stop() waits
indefinitely for stop callbacks which were supposed to be
register at top_trans_start(). instead register them at
top_trans_create().
Instead of register the commit callback in top_trans_start(),
we should register the commit callback once the top thandle
is added into commit list, because in some error cases,
top_trans_start() might not be called, then the top thandle
will stay in the commit list forever, and also blocking
llog cancellation.
Change-Id: I1ca528e2f8f5e4d9cf6b5dd484653a055e17cc6c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: wang di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/17059
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Kit Westneat [Tue, 13 Oct 2015 02:58:18 +0000 (22:58 -0400)]
LU-7199 nodemap: assign nodemap to export before connecting
This patch moves the nodemap assignment to be before the connection
is active, and the nodemap deassignemnt to be after the connection
is made inactive. It also checks for null nodemaps and returns
-EACCES if the nodemap is not set in order to avoid a kernel panic.
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Id4dd02cf5b4208bda6d6f17aed6adb13b18c7731
Reviewed-on: http://review.whamcloud.com/16802
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Fri, 27 Nov 2015 07:29:04 +0000 (00:29 -0700)]
LU-7428 tests: write superblock in conf-sanity test_84
In conf-sanity.sh test_84() a newly-formatted MDS filesystem's block
device is set read-only immediately after mount via replay_barrier(),
which may result in initial formatting or configs to be discarded,
resulting in a wide variety of different failure modes for this test.
Ensure that the superblock and other configuration logs are flushed
to disk before replay_barrier() is called, so that the MDS can mount
properly again later in the test.
Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity
Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I95894b600ed7596c2014cba4f35fef4b443ebbe5
Reviewed-on: http://review.whamcloud.com/17371
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bruno Faccini [Thu, 5 Nov 2015 15:44:52 +0000 (16:44 +0100)]
LU-7329 obdclass: sync device to flush journal callbacks
After llog_test_10() has been added to check correct Catalog
wrap-around as part of LU-6556, frequent auto-tests failures
have been encountered due to OOM situation on VMs caused by
a lot of journal commit callbacks backlog induced by the huge
LLOG activity generated by new test.
This patch forces frequent device sync to flush these journal
commit callbacks.
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: If06914a395f5bacbbb55fcdc229ffcd47d742843
Reviewed-on: http://review.whamcloud.com/17052
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Yang Sheng [Tue, 13 Oct 2015 07:43:27 +0000 (15:43 +0800)]
LU-7098 osd-ldiskfs: don't alloc inode directly
We should alloc ldiskfs_inode_info instead alloc inode
directly. Else will overflow in follow ldiskfs_add_entry.
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ib88f847ea8f160f6f0d0c7e7e680247d7ec96fb6
Reviewed-on: http://review.whamcloud.com/16804
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Andreas Dilger [Fri, 6 Nov 2015 21:50:01 +0000 (14:50 -0700)]
LU-1095 mdc: remove console spew from mdc_ioc_fid2path
In some cases with a very long pathname, such as with sanity.sh
test_154c, mdc_ioc_fid2path() would spew long debug messages to
the log, because libcfs_debug_vmsg2() refuses to log messages over
one page in size.
Truncate the debug message to only log the last 512 characters
of the pathname, which is sufficient for most debugging, saves a
bit of space in the debug log, and will prevent the debug logging
from printing to the console in the first place.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3cad19d02e8065574b8baf5694e9894e43112318
Reviewed-on: http://review.whamcloud.com/17078
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Thu, 5 Nov 2015 10:25:52 +0000 (02:25 -0800)]
LU-7396 llite: check request != NULL in ll_migrate
Check if the request is NULL, before retrieve reply body
from the request.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ifec9caf270b938b7583de0315610f930fa52649d
Reviewed-on: http://review.whamcloud.com/17079
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Andreas Dilger [Fri, 9 Oct 2015 19:10:09 +0000 (13:10 -0600)]
LU-7276 utils: make llog_reader consistent with kernel
Handle the CM_SKIP records in the same manner as the kernel,
by printing "SKIP" only until the first CM_END record, rather
until the first CM_END|CM_SKIP record.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie0c35ecd9d74a1e0b378bcf6642bca32e4dfa35e
Reviewed-on: http://review.whamcloud.com/16788
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
wang di [Thu, 10 Sep 2015 04:34:47 +0000 (21:34 -0700)]
LU-7068 mdd: object leak in mdd_migrate_entries
mdd_migrate_entries should release the object
if mdd_trans_start() fails.
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: If10dce8b450e8c19bb93465beb08118e98c4ed96
Reviewed-on: http://review.whamcloud.com/16384
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Jinshan Xiong [Thu, 2 Jul 2015 07:30:55 +0000 (15:30 +0800)]
LU-6666 osc: Do not merge extents with partial pages
After range lock is introduced to Lustre, it's possible for
multiple threads to submit osc_extents with partial pages, and
finally I/O engine may try to merge these extents, which will
end up with assert in osc_build_rpc().
In this patch, osc_extent::oe_no_merge is introduced, and this flag
is set if osc_extent submitted via osc_io_submit() includes partial
pages. This flag is used by I/O engine to stop merging this kind
of extents.
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I0c851912beb0370553bc2e2b05f80dee175a2f1f
Reviewed-on: http://review.whamcloud.com/15468
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Sat, 21 Nov 2015 15:42:27 +0000 (07:42 -0800)]
LU-7463 osd: Change existence assert into error
In osd_declare_xx(), some objects might not existent,
especially when calling out_handler(). Let's change
these assert into -ENOENT error to avoid panic on
MDS.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: If17a013a6939a1ebe0519406d39e405fd915110f
Reviewed-on: http://review.whamcloud.com/17324
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Di Wang [Sat, 21 Nov 2015 15:16:28 +0000 (07:16 -0800)]
LU-7461 lod: retry to get remote update log
If the remote MDT is also in recovery status,
then retrieving update logs in lod_sub_recovery_thread()
might return -EAGAIN or -EIO or -EBUSY, let's
retry in this case until the recovery is aborted or
the local MDT is umounted.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Iee945942bd01925cdcfe75c4e59dccbd63b34498
Reviewed-on: http://review.whamcloud.com/17322
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Tue, 10 Nov 2015 10:22:20 +0000 (02:22 -0800)]
LU-7416 osp: check rq_repmsg in osp_request_commit_cb
Check if rq_repmsg is NULL before retrieving
last committed transno from reply message.
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ibf1e110e33df333934c65dfcf52870954e936180
Reviewed-on: http://review.whamcloud.com/17130
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Fri, 2 Oct 2015 05:21:21 +0000 (13:21 +0800)]
LU-7343 osd-ldiskfs: handle ldiskfs_append failure
In new linux kernel (linux-3.1x, x>=0), the ldiskfs exported
function ldiskfs_append() return error# via the return value,
instead of via the output parameter @err as it does on other
kernels (linux-2.6). Under such case, the caller should not
assume non-NULL returned value is valid buffer head, it can
stands for error#. So check that properly.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4dca43bcfd31aafd999f54934a51d258071dab22
Reviewed-on: http://review.whamcloud.com/17148
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Alex Zhuravlev [Tue, 3 Nov 2015 12:05:52 +0000 (15:05 +0300)]
LU-7376 tests: sanity-hsm/59 should skip old servers.
there is no poin to crash vulnerable versions.
Change-Id: Iacafd10d2a3d04ba1bb9ca70d8e343809490a349
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17029
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Tue, 17 Nov 2015 06:39:25 +0000 (09:39 +0300)]
LU-7436 tests: skip conf-sanity/91 with old servers
due to missing functionality.
Change-Id: I2be72820413aee9a9d7082c2d8c6f308eeb6e141
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17222
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bob Glossman [Tue, 10 Nov 2015 16:03:49 +0000 (08:03 -0800)]
LU-7415 kernel: kernel update RHEL 6.7 [2.6.32-573.8.1.el6]
Update RHEL6.7 kernel to 2.6.32-573.8.1.el6
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I1b3c4b144b06e5e96f818e35c08f490e574ed798
Reviewed-on: http://review.whamcloud.com/17119
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jinshan Xiong [Tue, 15 Sep 2015 19:19:10 +0000 (12:19 -0700)]
LU-7164 osc: osc_extent should hold refcount to osc_object
To avoid a race that osc_extent and osc_object destroy happens on the
same time, which causes kernel crash.
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I3e3237f0d1cff4bd992bef4e4c01355a1d5c8d9f
Reviewed-on: http://review.whamcloud.com/16433
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Fri, 18 Sep 2015 07:52:31 +0000 (15:52 +0800)]
LU-7384 lfsck: check transaction stop status
The LFSCK modification will be sent to remote server when the
transaction stop, for sync transaction case, we can check the
dt_trans_stop() result.
If the lfsck_namespace_create_orphan_dir() failed, but we may
ignored that before because of ignoring dt_trans_stop result.
Then it may cause subsequent lfsck_namespace_insert_normal()
failed at LASSERT(lu_object_exists(o) != 0);
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If897b7bd479ecdb61e6435f3177211f865a4e303
Reviewed-on: http://review.whamcloud.com/17042
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Wed, 23 Sep 2015 15:24:02 +0000 (23:24 +0800)]
LU-3536 lfsck: reuse parameter name for re-locating object
Usually, LFSCK engine will locate the object against the bottom
device (OSD), then make related check/repair directly. Sometimes,
such as lfsck_namespace_repair_dirent(), we need to modify based
on LOD device. Under such case, the LFSCK will re-locate related
object with the same FID.
Originally, there is no special rules about the parameter's name,
that is confused which one should be used. For example, the input
parameter is named as "parent" that is against OSD, we need to
re-locate the obj based on the LOD, named as "pobj", then in the
subsequent logic, "pobj" should be used, but unfortunately, the
"parent" may be used by wrong. It is difficult to find out such
invalid usage.
To avoid such trouble, we prefer to reuse the (input) parameter
name after re-locating the object, name "pobj" as "parent".
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5b6d7c5c10e1817ef2bade4931485228b26c511d
Reviewed-on: http://review.whamcloud.com/16932
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Amir Shehata [Fri, 6 Nov 2015 20:41:01 +0000 (12:41 -0800)]
LU-3322 lnet: make connect parameters persistent
Store map-on-demand and peertx credits in the peer, since the peer
is persistent. Also made sure that when assigning the parameters
received on the connection to the peer structure through create,
that if another peer is added before grabbing the lock we assign
these parameters to it as well.
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie68f1ba1349d15b0a31eff9a2ca454df8e408ea9
Reviewed-on: http://review.whamcloud.com/17074
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Liang Zhen [Fri, 6 Nov 2015 14:23:05 +0000 (22:23 +0800)]
LU-7324 lnet: recv could access freed message
When lnet_parse_put calls lnet_ptl_match_md, this function can attach
current message on the delayed list if there is no match. It means
this message can be taken over and freed by another thread who is
posting new MD, then it is not safe for caller of lnet_parse_put to
check this message again.
This patch fixes this issue by adding a local variable "ready_delay"
to store corresponding status of lnet_msg, so lnet doesn't need to
check the message again if lnet_ptl_match_md returned MATCH_NONE for
it.
Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I0f8827103dd637648112e936ce6e685266e5ca40
Reviewed-on: http://review.whamcloud.com/17065
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bruno Faccini [Mon, 26 Oct 2015 13:37:19 +0000 (14:37 +0100)]
LU-7221 ldlm: do not take a reference on target if stopping
In the set of changes of patch for LU-5569
(http://review.whamcloud.com/11750/,
commit
892078e3b566c04471e7dcf2c28e66f2f3584f93) one is to take a
reference on target even if it is stopping (umount'ed). Then, upon
connections attempts, this can lead to unwanted cleanup actions to
occur on [obd_self_]export from class_decref(), finally causing a
LBUG in class_export_put() because export's exp_refcount has already
reached 0.
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: If960437934fb694d173a4fd1fbfb9e43d496fea6
Reviewed-on: http://review.whamcloud.com/16940
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Tue, 20 Oct 2015 13:53:18 +0000 (16:53 +0300)]
LU-7318 out: dynamic reply size
every update on the initiator side can declare how many bytes
it expects back. OUT packing library put these numbers on the
wire and prepary an appropriate buffer for the reply. then OUT
target do few checks to ensure individual replies fit their
buffers.
Change-Id: I443b5c879bc321c33efb70af665ecd2b2f7baa18
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16889
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bob Glossman [Thu, 17 Sep 2015 19:05:02 +0000 (12:05 -0700)]
LU-7077 target: avoid using possible error return NULL pointer
previous fix http://review.whamcloud.com/15576 added a call
to cfs_hash_getref(). add LASSERT() to ensure the can
never happen here return value of NULL is in fact never seen.
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ic6132b5450534db0bb9b89c3dd6f55517450c42a
Reviewed-on: http://review.whamcloud.com/16473
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
James Simmons [Thu, 12 Nov 2015 14:31:49 +0000 (09:31 -0500)]
LU-7174 build: make git ignore dkms generated file
While testing patches other non-patch the related build
by product dkms.mkconf show up with git status. To avoid
adding this by accident place thes by product files in the
proper .gitignore files.
Change-Id: I49a5411f8c1159a75d1cd28067dfbf3c2a677d6c
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17136
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Thu, 24 Sep 2015 09:04:41 +0000 (17:04 +0800)]
LU-7169 tests: check disk corruption during failover
It is a debug patch for conf-sanity test_84. It is suspected
that there is some disk corruption during the MDT0 failover.
Test-Parameters: mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7e20f26e1ecee483474ace44c8284b5776f3c602
Reviewed-on: http://review.whamcloud.com/16664
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Nathaniel Clark [Fri, 6 Nov 2015 18:48:34 +0000 (13:48 -0500)]
LU-7377 utils: Don't fail on plugin load for mount/tunefs
While loading mount_utils_zfs, if zfs modules aren't loaded, but
zfs plugin is present, it will return an error, this shouldn't cause
all module loading to fail.
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Idad52745cdfa9d673ab9bd4afe38de4d51ae9a49
Reviewed-on: http://review.whamcloud.com/17128
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Di Wang [Tue, 10 Nov 2015 10:12:00 +0000 (02:12 -0800)]
LU-7414 target: do not share update and rdbuf
Redefine lu_rdbuf structure to simplify the rdbuf
allocation in out_read().
And also move tti_u.rdbuf out of tgt_thread_info
union, otherwise rdbuf and update will share
the same memory and cause corruption, see out_read().
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Idb0f5af1b00fd5fd15ebc8742aa60d9a43df0a8a
Reviewed-on: http://review.whamcloud.com/17129
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Bob Glossman [Wed, 23 Sep 2015 18:36:38 +0000 (11:36 -0700)]
LU-7200 kernel: kernel update [SLES11 SP3 3.0.101-0.47.67]
Update SLES11 SP3 kernel to 3.0.101-0.47.67
Test-Parameters: mdsdistro=sles11sp3 ossdistro=sles11sp3 \
clientdistro=sles11sp3 mdsfilesystemtype=ldiskfs \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
testgroup=review-ldiskfs
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Icd7bac6bea6866f82e2e03f5dbbb1bda1a4ecacf
Reviewed-on: http://review.whamcloud.com/16617
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Oleg Drokin [Mon, 16 Nov 2015 22:41:54 +0000 (17:41 -0500)]
New tag 2.7.63
Change-Id: I79f285380612f61679c1cf8e51446c9018e8225c
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Mon, 19 Oct 2015 15:24:22 +0000 (11:24 -0400)]
LU-6204 build: clean up kernel module metadata
Update static MODULE_VERSION() lines - this should be automated.
Improve MODULE_DESCRIPTION() descriptions.
Make the name of the module_init()/_exit() functions consistently
{module_name}_init and {module_name}_exit.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1c3fe5698c7f41d971a38225650597c913500c1e
Reviewed-on: http://review.whamcloud.com/16787
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Olaf Faaland [Tue, 27 Oct 2015 00:51:46 +0000 (17:51 -0700)]
LU-7345 obdclass: annotate locks in __local_file_create
dt_write_lock() is called for both the child and the parent dt_objects
when a directory is created. This triggers a false positive in
lockdep when running with CONFIG_LOCKDEP=y, as the structure
containing the lock and the name of the lock is the same, and so it
appears to be a recursive lock attempt based on lock class.
This gives the two locks different subclasses so lockdep can
differentiate between them.
Also, osd-zfs osd_object_{read,write}_lock() functions currently
ignore the subclass (role) provided by the caller, calling down_read()
instead of down_read_nested() for example.
Make osd_zfs use the _nested variants so the role takes effect.
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iab79feadfbd7d1a5a06749ecb9f6888b55a78d73
Reviewed-on: http://review.whamcloud.com/16957
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Ben Evans [Wed, 7 Oct 2015 17:30:52 +0000 (12:30 -0500)]
LU-7269 ptlrpc: remove ptlrpc_prep_req
Remove unused functions ptlrpc_prep_req, ptlrpc_prep_req_pool
Combine __ptlrpc_request_bufs_pack and ptlrpc_request_bufs_pack
Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I4e4e64aa1f7fa4c85daf311906f6417a513dcddc
Reviewed-on: http://review.whamcloud.com/16765
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Doug Oucharek [Fri, 30 Oct 2015 21:40:59 +0000 (14:40 -0700)]
LU-7362 lnet: Remove LASSERTS from router checker
In lnet_router_checker(), there are two LASSERTS. Neither protects
us from anything and one of them triggered for a customer crashing
the system unecessarily. This patch removes them.
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: If2732632e47103fb8fa63a263c4c5ef4a44142a3
Reviewed-on: http://review.whamcloud.com/17003
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Matt Ezell <ezellma@ornl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
John L. Hammond [Wed, 14 Oct 2015 16:33:35 +0000 (11:33 -0500)]
LU-7296 ldlm: improve lock timeout messages
In ldlm_expired_completion_wait() remove the useless LCONSOLE_WARN()
message and upgrade the LDLM_DEBUG() statement to LDLM_ERROR().
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6293720bf8e038057a2c84a715359cdbb8cebe91
Reviewed-on: http://review.whamcloud.com/16824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Fri, 31 Jul 2015 16:00:20 +0000 (00:00 +0800)]
LU-7120 scrub: handle osd_scrub_post return value
To avoid missing some failure cases during write scrub status
to disk.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7f77bdba184f634b4f9dd748c3f1b97609b81960
Reviewed-on: http://review.whamcloud.com/16368
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Wed, 10 Dec 2014 10:53:49 +0000 (03:53 -0700)]
LU-6013 utils: don't initialize OSD code for client mount
Don't even try to initialize the server OSD handling code if this
is a client mountpoint. That avoids potential problems if the OSD
code isn't working or available when it isn't needed on a client.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I104b70b9d27811d306879fc047a83f85ea3ebbe5
Reviewed-on: http://review.whamcloud.com/13019
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Thu, 5 Nov 2015 18:47:06 +0000 (18:47 +0000)]
Revert "LU-4865 zfs: grow block size by write pattern"
This reverts commit
3e4369135127b350dbc26a4a5dc94cfa46e394cf.
This has shown problems in testing and may be the cause of LU-7392.
Change-Id: I664f7f8c943d8a90f2d2a9845aea2636535d6b1e
Reviewed-on: http://review.whamcloud.com/17053
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Niu Yawei [Tue, 3 Nov 2015 06:59:32 +0000 (01:59 -0500)]
LU-7330 ldlm: fix race of starting bl threads
There is race in the code of starting bl threads which leads to
thread number exceeds the maximum number when race happened, it
can also lead to duplicated thread name.
This patch fixes the race and cleanup the code a bit.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9c9be125d1d76890b8c52476684976dad3cb3d87
Reviewed-on: http://review.whamcloud.com/17026
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Alex Zhuravlev [Mon, 19 Oct 2015 10:18:44 +0000 (13:18 +0300)]
LU-2222 mdt: restore evict-by-nid functionality
Writing a NID or UUID to mdt.*.evict_tgt_nids will evict clients
with NID or UUID specified all the targets (OSTs and MDTs).
Change-Id: I66a60a6c81fbac1571f5685111df7b00a306be36
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/16867
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Sun, 1 Nov 2015 18:47:48 +0000 (21:47 +0300)]
LU-7367 tests: fix fail_loc code in 110g
test 110g used wrong code for OBD_FAIL_MIGRATE_NET_REP
Change-Id: I5dc2e18fb99e35422a9ae227e2e16ba7d39600a3
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17009
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
James Simmons [Sun, 1 Nov 2015 16:47:29 +0000 (11:47 -0500)]
LU-7366 build: resolve compile issues on Ubuntu 15.04
Ubuntu 15.04 has been released which uses a 4.2 kernel
and gcc 5.2. The gcc version fails to build lustre due
to missing headers in the GSS userland code and a
variable not being uninitialized in the llite layer.
Change-Id: I3615414ac039277a6ef6c6af1a541590b9d79566
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17008
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Wed, 28 Oct 2015 20:06:13 +0000 (14:06 -0600)]
LU-7353 utils: fix lctl usage messages
Fix the lctl usage message for sub-commands that need the LNet network
to be specified. lctl::g_net_is_set() incorrectly recommended using
the "network" command to specify the LNet network, which is incorrect
when using lctl in command-line mode:
# lctl peer_list
You must run the 'network' command before 'peer_list'.
# lctl network tcp0 peer_list
# lctl --net tcp0 peer_list
12345-192.168.20.1@tcp [1]192.168.40.147->mookie-gig:988 #15
Fix that to correctly recommend using the "--net" command when using
non-interactive mode, and to return an error if "network" is used in
non-interactive mode with extra arguments.
Replace mention of "portals" in help messages with "LNet".
Remove mention of obsolete elan, qsw, ra network types.
Improve the help message content for related subcommands.
Fix whitespace and command descriptions for related subcommands.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Idf9bf663b16012ebc9f38566ecba9859a54cab07
Reviewed-on: http://review.whamcloud.com/16980
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Li Xi [Sat, 24 Oct 2015 05:36:00 +0000 (13:36 +0800)]
LU-7336 ofd: cleanup proc when ofd_info_init fails
In ofd_init0(), if ofd_info_init() fails it should cleanup
procs.
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I3ff278526f09ef7e36631712ce21a498a6644907
Reviewed-on: http://review.whamcloud.com/16934
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Mikhail Pershin [Wed, 23 Sep 2015 19:09:27 +0000 (12:09 -0700)]
LU-7012 osp: don't use OSP when import is deactivated
Unset opd_imp_connected flag upon IMP_EVENT_INACTIVE event,
it will stop any llog processing by that device until import
will be activated again.
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ie219e536c216130f428ba933d11842511692c95b
Reviewed-on: http://review.whamcloud.com/16937
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Li Xi [Mon, 2 Nov 2015 16:39:31 +0000 (00:39 +0800)]
LU-7371 osd-ldiskfs: fix wrong read length over isize
If the isize is 4095, a read length of 4096 will be
returned because a wrong calculation of EOF. This patch fixes the
problem.
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I73b18641f000a2d96067243c08c26e51d0d53244
Reviewed-on: http://review.whamcloud.com/17020
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Fri, 9 Oct 2015 06:37:03 +0000 (00:37 -0600)]
LU-7261 ldiskfs: clean up code style for large_xattr
Clean up the code style for the large_xattr patches to match the
upstream kernel style (use ! instead of == 0, and similar), and
the original style of the code before the earlier versions of the
patch was applied.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Icffe339d1b002a55984856829afde9e3eae98bd9
Reviewed-on: http://review.whamcloud.com/16778
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Bob Glossman [Tue, 3 Nov 2015 22:09:32 +0000 (14:09 -0800)]
LU-7379 kernel: kernel update RHEL7.1 [3.10.0-229.20.1.el7]
Test-Parameters: mdsdistro=el7 ossdistro=el7 \
clientdistro=el7 mdsfilesystemtype=ldiskfs \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
testgroup=review-ldiskfs
update RHEL 7.1 kernel to 3.10.0-229.20.1.el7
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I690daa6493232353703f5392cfe0a979b824f3f1
Reviewed-on: http://review.whamcloud.com/17044
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Wang Shilong [Tue, 13 Oct 2015 00:28:29 +0000 (20:28 -0400)]
LU-7304 ldiskfs: fix bug when bigalloc is enabled
See following error when enabled bigalloc feature
for ldiskfs rhel7:
LDISKFS-fs error (device sdb):
ldiskfs_mb_check_ondisk_bitmap:3611: comm mkdir:
on-disk bitmap for group 8corrupted: 0 blocks free in
bitmap, 32768 - in gd
Fixed to use EXT4_CLUSTERS_PER_GROUP, otherwise,
we will get wrong value and fail to check, which
make FS become RO..
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I7f61918918e6f4e2f372929181b704b0648dcbca
Reviewed-on: http://review.whamcloud.com/16832
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Sun, 13 Sep 2015 09:41:13 +0000 (17:41 +0800)]
LU-7354 osd: avoid NULL pointer in osd_obj_update_entry
In osd_obj_update_entry(), the variable @oi_fid may be NULL.
We need to check such case before further using it to avoid
accessing invalid RAM.
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibf47e949d69f0b9e5657a6dce2007fe4f6f1a9f6
Reviewed-on: http://review.whamcloud.com/17010
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>