Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-10698 obdclass: allow specifying complex jobids 91/31691/7
Andreas Dilger [Tue, 20 Mar 2018 09:45:36 +0000 (03:45 -0600)]
LU-10698 obdclass: allow specifying complex jobids

Allow specifying a format string for the jobid_name variable to create
a jobid for processes on the client.  The jobid_name is used when
jobid_var=nodelocal, if jobid_name contains "%j", or as a fallback if
getting the specified jobid_var from the environment fails.

The jobid_name string allows the following escape sequences:

    %e = executable name
    %g = group ID
    %h = hostname (system utsname)
    %j = jobid from jobid_var environment variable
    %p = process ID
    %u = user ID

Any unknown escape sequences are dropped. Other arbitrary characters
pass through unmodified, up to the maximum jobid string size of 32,
though whitespace within the jobid is not copied.
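
The escape handling described above can be illustrated with a small
Python model. This is only a sketch, not the kernel code: the helper
name expand_jobid_name() and its "ctx" dictionary are made up for the
example, and truncating to LUSTRE_JOBID_SIZE - 1 characters assumes
that the size of 32 includes the trailing NUL.

```python
LUSTRE_JOBID_SIZE = 32   # assumed to include the trailing NUL
ESCAPES = "eghjpu"       # escape characters listed in the commit message


def expand_jobid_name(fmt, ctx):
    """Expand %e %g %h %j %p %u in fmt with values from ctx (hypothetical helper)."""
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%":
            # Look at the character after '%'; unknown escapes are dropped.
            key = fmt[i + 1] if i + 1 < len(fmt) else ""
            if key and key in ESCAPES:
                out.append(str(ctx.get(key, "")))
            i += 2
        else:
            out.append(ch)
            i += 1
    # Whitespace is not copied into the jobid.
    jobid = "".join(c for c in "".join(out) if not c.isspace())
    # Leave room for the trailing NUL.
    return jobid[:LUSTRE_JOBID_SIZE - 1]


ctx = {"e": "dd", "g": 100, "h": "client01", "j": "1234", "p": 4321, "u": 500}
print(expand_jobid_name("cluster2.%e.%u", ctx))   # cluster2.dd.500
print(expand_jobid_name("cluster2.%j.%e", ctx))   # cluster2.1234.dd
```

In this model a format such as "a %x b" yields "ab": the unknown
escape is dropped and the spaces are not copied.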

This allows, for example, specifying an arbitrary prefix, such as the
cluster name, in addition to the traditional "procname.uid" format,
to distinguish between jobs running on clients in different clusters:

    lctl set_param jobid_var=nodelocal jobid_name=cluster2.%e.%u
or
    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=cluster2.%j.%e

To use an environment-specified JobID, if available, but fall back to
a static string for all processes that do not have a valid JobID:

    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=unknown

Implementation notes:

The LUSTRE_JOBID_SIZE includes a trailing NUL, so don't use
"LUSTRE_JOBID_SIZE + 1" anywhere, as that is misleading.

Rename the "obd_jobid_node" variable to "obd_jobid_name" to match
the /proc "jobid_name" parameter name to avoid confusion.

Rename "struct jobid_to_pid_map" to "jobid_pid_map" since this is
not actually mapping from a jobid *to* a PID, but the reverse.
Save jobid length, and reorder fields to avoid holes in structure.

Consolidate PID->jobid cache handling in jobid_get_from_cache(),
which only does environment lookups and caches the results.
The fallback to using obd_jobid_name is handled by the caller.

Rename check_job_name() to jobid_name_is_valid(), since that makes
it clear to the reader a "true" return is a valid name.

In jobid_cache_init() there is no benefit for locking the jobid_hash
creation, since the spinlock is just initialized in this function,
so multiple callers of this function would already be broken.

Pass the buffer size from the callers (who know the buffer size) to
lustre_get_jobid() instead of assuming it is LUSTRE_JOBID_SIZE.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Iad350e87b446c7d2356718cf2e5f9563e63ebbe5
Reviewed-on: https://review.whamcloud.com/31691
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9273 tests: disable random I/O in replay-ost-single/5 71/31671/2
Alex Zhuravlev [Fri, 16 Mar 2018 11:28:57 +0000 (14:28 +0300)]
LU-9273 tests: disable random I/O in replay-ost-single/5

Disable random I/O in replay-ost-single/5 as it is very slow
on ZFS. This is due to grants, as the client consumes them
way too quickly: 1MB blocksize + ~0.5MB metadata overhead
for each random 4K written by iozone.
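
To put numbers on the grant consumption, here is a short worked
example in Python. The 1MB and ~0.5MB figures come from the text
above; the 64 MiB grant is a purely hypothetical value chosen for
illustration.

```python
KIB = 1024
MIB = 1024 * KIB

write_size = 4 * KIB           # one random iozone write
blocksize = 1 * MIB            # ZFS blocksize quoted above
metadata_overhead = MIB // 2   # "~0.5MB", an approximation

# Grant consumed by a single 4K random write.
grant_per_write = blocksize + metadata_overhead
amplification = grant_per_write // write_size

example_grant = 64 * MIB       # hypothetical grant size, not from the commit
writes_before_exhausted = example_grant // grant_per_write

print(grant_per_write // KIB)      # 1536 (KiB of grant per 4 KiB write)
print(amplification)               # 384
print(writes_before_exhausted)     # 42
```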

Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5

Change-Id: Ic49429b8c681fdc16e5f95f483d78198b6f4804c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31671
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
2 years ago LU-10264 mdc: fix possible NULL pointer dereference 21/31621/3
Andreas Dilger [Fri, 9 Mar 2018 23:18:53 +0000 (16:18 -0700)]
LU-10264 mdc: fix possible NULL pointer dereference

Fix two static analysis errors.

lustre/mdc/mdc_dev.c: in mdc_enqueue_send(), pointer 'matched' returned
    from call to function 'ldlm_handle2lock' at line 704 may be NULL
    and will be dereferenced at line 705.
If the client is evicted between ldlm_lock_match() and
ldlm_handle2lock(), the lock pointer could be NULL.

lustre/lov/lov_dev.c:488 in lov_process_config, sscanf format
    specification '%d' expects type 'int' for 'd', but parameter 3
    has a different type '__u32'.
Converting to kstrtou32() requires changing the "index" variable type
from __u32 to u32, which is fine since it is only used internally.
Fix up the few functions that are also passing "__u32 index" and the
resulting checkpatch.pl warnings.
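
The difference between a loose sscanf("%d") parse and a strict
unsigned parse can be modelled in Python. This sketch only
approximates the documented behaviour of kstrtou32() for base 10
(optional leading "+", one optional trailing newline, no minus sign,
range check). The helper name parse_u32() is an assumption for the
example and is not part of the patch.

```python
U32_MAX = 0xFFFFFFFF


def parse_u32(text):
    """Strict base-10 parse into the u32 range (model only, not the kernel API)."""
    # A single trailing newline is tolerated.
    s = text[:-1] if text.endswith("\n") else text
    # A leading plus sign is allowed, a minus sign is not.
    digits = s[1:] if s.startswith("+") else s
    if not digits or not all(c in "0123456789" for c in digits):
        raise ValueError("not an unsigned decimal number: %r" % text)
    value = int(digits)
    if value > U32_MAX:
        raise OverflowError("value does not fit in 32 bits: %r" % text)
    return value


print(parse_u32("42\n"))         # 42
print(parse_u32("4294967295"))   # 4294967295
```

Inputs such as "-1" or "12abc" raise ValueError here instead of being
silently accepted or truncated.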

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3cc80d66bbb537161a561f4f2ba7830ddebcab07
Reviewed-on: https://review.whamcloud.com/31621
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7420 echo: fix echo server to work with unified target 43/18443/11
Mikhail Pershin [Tue, 27 Mar 2018 11:00:47 +0000 (14:00 +0300)]
LU-7420 echo: fix echo server to work with unified target

After the introduction of the Unified Target, the echo server lost
its ability to serve incoming requests, i.e. to work like a fake OFD.
This patch restores that functionality, so the echo server is able to
process requests from the echo client via the network.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Change-Id: I0c0d347486463ce320c7c66a1f85f6979b9a3681
Reviewed-on: https://review.whamcloud.com/18443
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10752 build: fix rpm packaging issues for gss 57/31757/7
James Simmons [Thu, 29 Mar 2018 17:02:42 +0000 (13:02 -0400)]
LU-10752 build: fix rpm packaging issues for gss

Lustre can create rpms in two ways. One is with make rpm and the
other is using the actual source rpm that is provided. There are
several issues with how GSS is handled with rpm packaging.

The first problem is that ./configure --disable-gss has never been
handled. Secondly, if you configure with --disable-gss it is still
possible to have the option enable-gss-keyring set to yes. The
reason this was never seen before is that everything was handled
through the keyring option. Now if the user sets enable-gss to no
then enable-gss-keyring will also be set to no, even if the user
tries to set it to yes. This is done by properly setting
$enable_gss and $enable_gss_keyring in lustre-core.m4.
In the spec file, create the bcond gss to handle the gss-only case,
and turn on gss if gss_keyring is true. Move lgssc.conf under
the with_gss_keyring bcond which is only needed for server builds
alongside lsvcgss.

It is impossible to know in advance whether GSS can be built,
because the spec file does not properly handle build dependencies
for GSS and cannot tell if the kernel is too new for GSS. So the
user has to provide the options --with gss and / or --with
gss-keyring to rpmbuild. If the user only provides the gss-keyring
option to rpmbuild, make sure it enables gss as well. That is
handled in the spec file.

For the case of make rpms fix it up so if gss-keyring is enabled
then by default the core gss handling is enabled. Also handle the
long ignored enable-gss case.
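
The two dependency rules described above can be summarised in a few
lines of Python. This is a model of the intended option resolution
only, not the actual m4 or spec file logic, and both function names
are made up for the example.

```python
def resolve_configure(enable_gss, enable_gss_keyring):
    """configure-time rule: disabling gss also disables gss-keyring."""
    if not enable_gss:
        enable_gss_keyring = False
    return enable_gss, enable_gss_keyring


def resolve_rpmbuild(with_gss, with_gss_keyring):
    """rpmbuild-time rule: asking for gss-keyring turns gss on as well."""
    return (with_gss or with_gss_keyring), with_gss_keyring


# --disable-gss wins even if the user asks for the keyring.
print(resolve_configure(False, True))   # (False, False)
# "--with gss-keyring" alone implies "--with gss".
print(resolve_rpmbuild(False, True))    # (True, True)
```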

Test-Parameters: trivial

Change-Id: Ieed9df98a27bd6e77504486762d6e60ddca5a916
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31757
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9551 utils: add l_tunedisk to fix disk tunings 64/31464/6
Nathaniel Clark [Wed, 28 Feb 2018 22:18:09 +0000 (17:18 -0500)]
LU-9551 utils: add l_tunedisk to fix disk tunings

This adds the l_tunedisk utility to utilize the osd_tune_lustre call
for mount_utils.h.  This can be called from udev.
This adds a udev rule to fix disk tunings.
This in some ways duplicates LU-9132, which sets this value at mount
time, but if a multipath component is removed then re-added, the
multipath's max_sectors_kb will not propagate to the newly added device
and this will now cause an error for I/Os that would violate this.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I35330ebe75552d71b71212f9fae00cfdcc028ea1
Reviewed-on: https://review.whamcloud.com/31464
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
2 years ago LU-10773 obdclass: yield cpu during changelog_block_trim_ext 16/31516/2
Fan Yong [Mon, 5 Mar 2018 15:11:21 +0000 (23:11 +0800)]
LU-10773 obdclass: yield cpu during changelog_block_trim_ext

Yield the CPU to avoid a soft lockup if there are too many records to
be handled. The patch also filters out zero-sized records to avoid an
infinite loop.
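
Both points can be seen in a userspace model of a record walk. This
is a sketch under stated assumptions, not the changelog code: the
record layout (a little-endian u32 total length followed by the
payload), the YIELD_EVERY batch size and the helper names are all
hypothetical, and time.sleep(0) merely stands in for yielding the CPU.

```python
import struct
import time

HEADER = struct.Struct("<I")   # hypothetical header: total record length
YIELD_EVERY = 1024             # hypothetical number of records between yields


def walk_records(buf):
    """Return the payloads of the records packed back to back in buf."""
    payloads = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        (rec_len,) = HEADER.unpack_from(buf, offset)
        if rec_len < HEADER.size or offset + rec_len > len(buf):
            # A zero-sized (or truncated) record would never advance
            # the offset correctly, so stop instead of spinning.
            break
        payloads.append(bytes(buf[offset + HEADER.size:offset + rec_len]))
        offset += rec_len
        if len(payloads) % YIELD_EVERY == 0:
            time.sleep(0)      # give other threads a chance to run
    return payloads


def make_record(payload):
    """Pack one record: header with total length, then the payload."""
    return HEADER.pack(HEADER.size + len(payload)) + payload


buf = make_record(b"abc") + make_record(b"de") + HEADER.pack(0) + make_record(b"zz")
print(walk_records(buf))   # [b'abc', b'de']
```

Without the length check, the zero-length header in the middle of the
buffer would leave "offset" unchanged and the loop would never end.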

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia094f9153b5ef2602103d2ee13ee7ad3ffe6dc4f
Reviewed-on: https://review.whamcloud.com/31516
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10761 osd-ldiskfs: not create REMOTE_PARENT_DIR on OST 08/31508/3
Fan Yong [Fri, 16 Mar 2018 06:28:01 +0000 (14:28 +0800)]
LU-10761 osd-ldiskfs: not create REMOTE_PARENT_DIR on OST

The REMOTE_PARENT_DIR is used to link a remote object whose parent
resides on a remote MDT into the global namespace. It is only useful
on an MDT, so it is unnecessary to create this directory on an OST.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I240de3f69cde04740cb7f71ebaf9048407a900dc
Reviewed-on: https://review.whamcloud.com/31508
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10837 ldiskfs: skip bitmap check if block bitmap is uninitialized 20/31720/3
Wang Shilong [Thu, 22 Mar 2018 05:59:55 +0000 (13:59 +0800)]
LU-10837 ldiskfs: skip bitmap check if block bitmap is uninitialized

See comments in ext4_free_clusters_after_init:
/* Return the number of free blocks in a block group.  It is used when
 * the block bitmap is uninitialized, so we can't just count the bits
 * in the bitmap. */
So the extra check we added here is wrong if this block group's
bitmap is uninitialized, since we only check bitmaps here.

Further, looking at the code that clears EXT4_BG_BLOCK_UNINIT, the
kernel will reinitialize free_clusters_count when it clears the flag,
so an extra check for uninitialized block bitmaps doesn't make much
sense.

Skip the block bitmap check if EXT4_BG_BLOCK_UNINIT is set, since
whatever free count the group descriptor recorded cannot be trusted
in that case.
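
A minimal Python model of the resulting logic is shown below. It
assumes the check compares the number of clear bits in the block
bitmap against the free count in the group descriptor. The function
name and argument layout are made up for the example. The flag value
is the one from the ext4 headers.

```python
EXT4_BG_BLOCK_UNINIT = 0x0002   # block bitmap not initialized


def bitmap_free_count_ok(bg_flags, bitmap, blocks_per_group, desc_free_count):
    """Return True if the group passes the free-count check (model only)."""
    if bg_flags & EXT4_BG_BLOCK_UNINIT:
        # The bitmap has not been initialized, so counting its bits
        # says nothing about the real free count: skip the check.
        return True
    used = sum(bin(byte).count("1") for byte in bitmap)
    return blocks_per_group - used == desc_free_count


bitmap = bytes([0b00001111, 0b00000000])   # 16 blocks, 4 in use
print(bitmap_free_count_ok(0, bitmap, 16, 12))                         # True
print(bitmap_free_count_ok(0, bitmap, 16, 10))                         # False
print(bitmap_free_count_ok(EXT4_BG_BLOCK_UNINIT, bytes(2), 16, 10))    # True
```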

Change-Id: I845f2e0e17e53b7e3073399bd8b0a85e3db66ef8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31720
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-10703 nodemap: save and clear fileset correctly 50/31450/13
Emoly Liu [Tue, 20 Mar 2018 09:42:29 +0000 (17:42 +0800)]
LU-10703 nodemap: save and clear fileset correctly

This patch fixes the following two issues:
- When processing the nodemap_idx_type "NODEMAP_CLUSTER_IDX" in
  nodemap_process_keyrec(), the fileset should be saved. Otherwise,
  it is reset to empty every time the client is notified
  to fetch nodemap logs (mgc_process_recover_nodemap_log()->
  nodemap_process_idx_pages()->nodemap_process_keyrec()).
- Allow 'fileset=clear' in addition to 'fileset=""' to clear the
  fileset, because both 'lctl set_param -P *.*.fileset=""' and
  'lctl nodemap_set_fileset --fileset ""' only work on the MGS.
  On other, non-MGS servers they both invoke the upcall
  "/usr/sbin/lctl set_param nodemap.default.fileset=" via the function
  process_param2_config(), which causes a "no value" error and
  does not clear the fileset. 'fileset=""' is still kept for
  compatibility reasons.
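
The second issue can be modelled in a few lines of Python. This is a
sketch only: parse_param() and normalize_fileset() are hypothetical
helpers that mimic a "name=value" parser rejecting an empty value and
the new handling of the "clear" keyword.

```python
def parse_param(arg):
    """Split "name=value"; an empty value is an error, as in the upcall case."""
    name, sep, value = arg.partition("=")
    if not sep or value == "":
        raise ValueError("no value")
    return name, value


def normalize_fileset(value):
    """Map a requested fileset value to what gets stored (hypothetical helper)."""
    if value in ("", "clear"):
        return ""          # both spellings clear the fileset
    return value


# The upcall form with an empty value cannot be parsed at all.
try:
    parse_param("nodemap.default.fileset=")
except ValueError as err:
    print("error:", err)                   # error: no value

# The "clear" keyword survives parsing and clears the fileset.
name, value = parse_param("nodemap.default.fileset=clear")
print(repr(normalize_fileset(value)))      # ''
```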

Also, sanity-sec.sh test_27a is modified and test_27b is added to
verify this patch.

Change-Id: I23236a4f1b67ac555713d6b3f059df699fdc91dc
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/31450
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10830 utils: fix create mode for lfs setstripe 47/31747/4
Andreas Dilger [Fri, 23 Mar 2018 06:01:58 +0000 (00:01 -0600)]
LU-10830 utils: fix create mode for lfs setstripe

Fix create mode for files created by "lfs setstripe" and also
"lfs mirror create" to match regular file creates, which are
filtered by umask to determine the final file create mode.

Add test case to verify umask is working correctly in all cases.
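
For reference, the umask filtering that regular file creates are
subject to is a plain bit operation. The Python sketch below uses a
made-up helper name, create_mode(), and only illustrates the
arithmetic, not the lfs code path.

```python
def create_mode(requested_mode, umask):
    """Final permission bits for a newly created file."""
    # Bits set in the umask are removed from the requested mode.
    return requested_mode & ~umask & 0o7777


print(oct(create_mode(0o666, 0o022)))   # 0o644
print(oct(create_mode(0o666, 0o077)))   # 0o600
```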

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I0c9d6730f437dbfbafda4902a035cc0f0ed916b0
Reviewed-on: https://review.whamcloud.com/31747
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New tag 2.11.50 2.11.50 v2_11_50 v2_11_50_0
Oleg Drokin [Tue, 3 Apr 2018 17:30:11 +0000 (13:30 -0400)]
New tag 2.11.50

Start of 2.12 development

Change-Id: Ic96437e600ab6d460ea33cf48b36c88913f5d864
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New release 2.11 b2_11 2.11.0 v2_11_0 v2_11_0_0
Oleg Drokin [Tue, 3 Apr 2018 17:25:32 +0000 (13:25 -0400)]
New release 2.11

Change-Id: I2e6ea245c130823534a50a14056c4865572f181e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New RC 2.11.0-RC3 2.11.0-RC3 v2_11_0_0_RC3 v2_11_0_RC3
Oleg Drokin [Fri, 30 Mar 2018 22:06:39 +0000 (18:06 -0400)]
New RC 2.11.0-RC3

Change-Id: Iee4f556142bf4f2a9efe61469c95e09fe460ddc0
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10822 utils: stop bogus buffer overflow errors 22/31822/6
Andreas Dilger [Wed, 28 Mar 2018 21:42:06 +0000 (15:42 -0600)]
LU-10822 utils: stop bogus buffer overflow errors

Over-zealous Fortify checks assume that the buffer being used for
snprintf() in get_lmd_info() is sizeof(*lmd) when in fact a larger
buffer has been allocated.  This causes runtime checks to fail and
lfs to core dump:

   *** buffer overflow detected ***: /usr/bin/lfs terminated

Instead of printing directly into "struct lov_user_mds_data", use
a generic buffer to hold the filename passed into the ioctl and
the return data.

There are several places in the code which do the same operations,
namely cb_getstripe(), get_lmd_info(), and ct_md_getattr(), so
change them all to call get_lmd_info() or a new get_lmd_info_fd()
helper to consolidate common code.  Also check the return values
from snprintf() in case there are new callers of this code in the
future that do not actually pass large-enough buffers.

Test-Parameters: clientdistro=ubuntu1604 serverdistro=el7 testlist=sanity
Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I41b1fcba1f7937fbce3cc7180ed5d73d067cab07
Reviewed-on: https://review.whamcloud.com/31822
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10334 tests: add tests to ALWAYS_EXCEPT for Ubuntu 28/31828/5
James Nunez [Thu, 29 Mar 2018 19:25:36 +0000 (13:25 -0600)]
LU-10334 tests: add tests to ALWAYS_EXCEPT for Ubuntu

Several tests are known to fail when running on Ubuntu clients:
tests 103a, 130a, 130b, 130c, 130d, 130e, 400a, and 410.

Add these tests to the ALWAYS_EXCEPT list to allow Ubuntu
testing to pass.

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I5ff51e94536f4382d670c9a4a1ce0af0c2832b4c
Reviewed-on: https://review.whamcloud.com/31828
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
2 years ago LU-10864 build: update changelog for ubuntu 16.04 26/31826/3
Minh Diep [Thu, 29 Mar 2018 17:37:59 +0000 (10:37 -0700)]
LU-10864 build: update changelog for ubuntu 16.04

Update the changelog to the kernel we are building.

Test-Parameters: trivial

Change-Id: Ic3a6accda4fc19d56676e2fb84f65942bc107539
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31826
Reviewed-by: Peter Jones <peter.a.jones@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Joseph Gmitter <joseph.gmitter@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10858 build: handle yaml library packaging on SLES systems 15/31815/3
James Simmons [Wed, 28 Mar 2018 18:07:45 +0000 (14:07 -0400)]
LU-10858 build: handle yaml library packaging on SLES systems

Newer distributions like SLES12 renamed the libyaml package to
libyaml-0-2. Update the spec file to handle this change.

Test-Parameters: clientdistro=sles12sp3 \
ossdistro=sles12sp3 mdsdistro=sles12sp3 \
testlist=sanity,sanity-pfl,sanity-flr

Change-Id: I876d05718194dd555d7d6ffa6433bcc9f445f97e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31815
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago Revert "LU-6867 test: detect active facet based on current state" 98/31798/5
James Nunez [Tue, 27 Mar 2018 18:28:49 +0000 (18:28 +0000)]
Revert "LU-6867 test: detect active facet based on current state"

This reverts commit 643e3b4316b6c59009c259b96d38495152989df4.

conf-sanity is failing with rmmod errors for Ubuntu clients; LU-10827.
Reverting this patch fixes the issue.

Change-Id: I455d87ea1e2f661c6129c9de577fc660d68d4c4b
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-on: https://review.whamcloud.com/31798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New RC 2.11-RC2 2.11.0-RC2 v2_11_0_0_RC2 v2_11_0_RC2
Oleg Drokin [Mon, 26 Mar 2018 22:31:06 +0000 (18:31 -0400)]
New RC 2.11-RC2

Change-Id: Ib5387f4cc463759452d26d4ad539201bd4c82717
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10829 utils: don't print lmm_stripe_offset for DoM layout 02/31702/2
Andreas Dilger [Tue, 20 Mar 2018 23:37:04 +0000 (17:37 -0600)]
LU-10829 utils: don't print lmm_stripe_offset for DoM layout

Running "lfs getstripe" on a DoM file prints out a non-zero value for
"lmm_stripe_offset:" on the 'mdt' component, even though this doesn't
make any sense.  Also, it prints an "lmm_objects:" header for the
component, even though it does not have any objects allocated to it.

  lcm_layout_gen:    4
  lcm_mirror_count:  1
  lcm_entry_count:   3
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   1048576
      lmm_stripe_count:  0
      lmm_stripe_size:   1048576
      lmm_pattern:       mdt
      lmm_layout_gen:    0
      lmm_stripe_offset: 2
      lmm_objects:

Always print '0' for lmm_stripe_offset of DoM components, and don't
print "lmm_objects:" for these components at all.
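
The intended output rule can be sketched in Python as follows. This
is a model of the formatting decision only, not the lfs source: the
function name format_component() and the dictionary keys are
assumptions made for the example.

```python
def format_component(comp):
    """Render one layout component roughly like "lfs getstripe" (model only)."""
    is_dom = comp["pattern"] == "mdt"
    lines = [
        "lmm_stripe_count:  %d" % comp["stripe_count"],
        "lmm_stripe_size:   %d" % comp["stripe_size"],
        "lmm_pattern:       %s" % comp["pattern"],
        "lmm_layout_gen:    %d" % comp["layout_gen"],
        # DoM components have no OST objects, so always show offset 0.
        "lmm_stripe_offset: %d" % (0 if is_dom else comp["stripe_offset"]),
    ]
    if not is_dom:
        # Only components with OST objects get the objects header.
        lines.append("lmm_objects:")
    return "\n".join(lines)


dom = {"stripe_count": 0, "stripe_size": 1048576, "pattern": "mdt",
       "layout_gen": 0, "stripe_offset": 2}
print(format_component(dom))
```

For the DoM component above the model prints "lmm_stripe_offset: 0"
and omits the "lmm_objects:" line.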

Test-Parameters: trivial testlist=sanity-dom,sanity-flr
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5430ff74d26ad2acd51d07ec23810cc9033ebbe5
Reviewed-on: https://review.whamcloud.com/31702
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10556 build: add require libyaml and zlib 10/31710/2
Minh Diep [Wed, 21 Mar 2018 19:30:42 +0000 (12:30 -0700)]
LU-10556 build: add require libyaml and zlib

Add the missing libyaml and zlib dev package requirements.

Test-Parameters: trivial

Change-Id: I167187c7bd11a2d92a6cc1fa8ccd7076f7ed5a85
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31710
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years ago LU-10569 build: properly package lustre for Debian/Ubuntu 48/31348/17
James Simmons [Tue, 6 Mar 2018 20:10:56 +0000 (15:10 -0500)]
LU-10569 build: properly package lustre for Debian/Ubuntu

Remove the obsolete linux-patch since patched kernels for lustre
clients have been long gone. Place only the static libraries and
*.so symlinks for the dynamic libraries in lustre-dev. The normal
dynamic libraries are placed into the utilities packages. Add in
all the missing dependencies and fix how the lustre debs are
dependent on each other. Lastly add in the missing lustre-iokit
that is present for rpm packages. The only thing missing is a package
for lustre resources, which can be done at a later time.

Test-Parameters: trivial

Change-Id: I5fd2a23bc1ae73434cef8dcf3679b50878256ab3
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31348
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Tested-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New release candidate 2.11-RC1 2.11.0-RC1 v2_11_0_0_RC1 v2_11_0_RC1
Oleg Drokin [Mon, 19 Mar 2018 18:47:23 +0000 (14:47 -0400)]
New release candidate 2.11-RC1

Change-Id: I8ced43420aa756e242a87f50ffd3601b76b4eb9e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago Revert "LU-9796 kernel: improve metadata performaces for RHEL7" 83/31683/3
Andreas Dilger [Mon, 19 Mar 2018 01:20:24 +0000 (01:20 +0000)]
Revert "LU-9796 kernel: improve metadata performaces for RHEL7"

This reverts commit 17fe3c192e101ac due to suspected
problems hit in some deployments.

Change-Id: I8cb28b4c69f67583356a7e07cf94ba897ffeb6ee
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-on: https://review.whamcloud.com/31683
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10800 lnet: Revert "LU-10270 lnet: remove an early rx code" 75/31675/3
John L. Hammond [Fri, 16 Mar 2018 15:20:42 +0000 (10:20 -0500)]
LU-10800 lnet: Revert "LU-10270 lnet: remove an early rx code"

This reverts commit c3894ff80fe4b48f2d62ea33ddc54fb5891e6484. Dropping
early receives caused pings to be ignored and interacted badly with
dynamic discovery.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I99a87a8f58ea67c59d5e85b964295472c2e15de4
Reviewed-on: https://review.whamcloud.com/31675
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10800 lnet: reduce discovery timeout 63/31663/3
Amir Shehata [Thu, 15 Mar 2018 19:12:04 +0000 (12:12 -0700)]
LU-10800 lnet: reduce discovery timeout

The discovery protocol sends a ping (GET) to the peer and expects a
REPLY back with the interface information. Discovery uses the
DEFAULT_PEER_TIMEOUT, which is 180s. This could lead to an extended
delay during mounting if the OSTs are down or if the ping fails for
any reason.

This patch adds a module parameter lnet_transaction_timeout which
defaults to 5 seconds. lnet_transaction_timeout is used for the
discovery timeout.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ida1e19f55552b24e83c8094aa88a37c2748126cf
Reviewed-on: https://review.whamcloud.com/31663
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10804 echo: allow echo server to setup procfs 64/31664/4
Mikhail Pershin [Thu, 15 Mar 2018 19:30:39 +0000 (22:30 +0300)]
LU-10804 echo: allow echo server to setup procfs

Restore lprocfs init for echo server. It is still using
procfs for stats.

Fixes: 0100ab268c3120aa84847a88a2493988f38dee6b
Test-Parameters: trivial envdefinitions=SLOW=yes testlist=obdfilter-survey osscount=1 ostcount=4 mdscount=1 mdtcount=1
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I7a1bf7de3d7c3202e6da7545da63979555ce6624
Reviewed-on: https://review.whamcloud.com/31664
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years ago LU-5761 tests: fix test_89 to use fs_log_size() 20/31120/6
Andreas Dilger [Sat, 3 Feb 2018 08:27:42 +0000 (01:27 -0700)]
LU-5761 tests: fix test_89 to use fs_log_size()

The test_89 checks should use fs_log_size() to determine how much
space might be leaked "normally" (due to log files, etc), and how
much data should be written to ensure that we do not misinterpret
this as a leak of blocks.

Also, fix up fs_log_size() to use the correct grant_block_size
units, which are in bytes, but fs_log_size() returns size in KB.
Allow a margin of 2 large blocks to be allocated for ZFS.

Test-Parameters: trivial ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-single,replay-single,replay-single
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id55175fc7f25fea52345d1c4443673b7efcec230
Reviewed-on: https://review.whamcloud.com/31120
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10675 tests: increase default MDSSIZE 25/31325/7
Alex Zhuravlev [Thu, 15 Feb 2018 19:52:48 +0000 (22:52 +0300)]
LU-10675 tests: increase default MDSSIZE

Also fix a few tests to release space.

Change-Id: Ie5e5b3f440e3abbd1f75486d2c6a3928a382be7d
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31325
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
2 years ago LU-10793 test: re-add test_14b to replay-dual ALWAYS_EXCEPT 05/31605/3
Hongchao Zhang [Thu, 15 Feb 2018 12:39:12 +0000 (20:39 +0800)]
LU-10793 test: re-add test_14b to replay-dual ALWAYS_EXCEPT

The test_14b in replay-dual was removed from the ALWAYS_EXCEPT list
in LU-10052 by https://review.whamcloud.com/#/c/30916/, but
the corresponding implementation is not ready, so this patch
re-adds it to ALWAYS_EXCEPT.

Test-Parameters: trivial testlist=replay-dual

Change-Id: I1027046b668e21f9fe4a47a0f46810f64b1ee954
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31605
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years ago LU-5490 tests: Sanity/133d ensure stats read is on correct MDT 85/31585/3
Nathaniel Clark [Thu, 8 Mar 2018 15:40:40 +0000 (10:40 -0500)]
LU-5490 tests: Sanity/133d ensure stats read is on correct MDT

Ensure directories used to collect rename_stats are on the MDT
that is checked.  This ensures directories are created on
MDT0 and not striped and then rename_stats is read from MDT0.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib27f5c531f2d8bd664ec3a4732c512b0c389dc43
Reviewed-on: https://review.whamcloud.com/31585
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years ago LU-10764 hsm: Correct debug print in ct_archive 09/31509/3
Oleg Drokin [Mon, 5 Mar 2018 06:15:50 +0000 (01:15 -0500)]
LU-10764 hsm: Correct debug print in ct_archive

As is, it's never printed due to a misplaced curly bracket.

Test-Parameters: trivial
Change-Id: I15c60f2ec44aaaa723945068d576dc59e04a2b95
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/31509
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
2 years ago LU-10759 test: sanity tests need check for 2 or more OSTs 95/31495/3
Bobi Jam [Fri, 2 Mar 2018 18:31:06 +0000 (11:31 -0700)]
LU-10759 test: sanity tests need check for 2 or more OSTs

Sanity tests 27F, 311, and 314 need two or more OSTs. Add a check
on the number of OSTs in these test cases.

Test-Parameters: trivial osscount=1 ostcount=1

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ie802429bfeb44ee19d8867614b420de7bceebfa2
Reviewed-on: https://review.whamcloud.com/31495
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-4315 doc: correct lfs migrate man page separation 06/31406/3
Andreas Dilger [Wed, 21 Feb 2018 06:40:37 +0000 (23:40 -0700)]
LU-4315 doc: correct lfs migrate man page separation

The "--block" and "--non-block" options are not relevant for
lfs-setstripe.1, only lfs-migrate.1 so move their descriptions
there.

Also remove duplicate setstripe option descriptions from
lfs-migrate.1, since they are becoming increasingly complex.
Instead, just refer to the lfs-setstripe.1 man page for other
options.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I13a28b33be2dc29aa0f44d177a62dbd2e13ebbe5
Reviewed-on: https://review.whamcloud.com/31406
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10673 tests: sanity test_56a fixes 26/31326/4
Alyona Romanenko [Thu, 15 Feb 2018 19:01:05 +0000 (22:01 +0300)]
LU-10673 tests: sanity test_56a fixes

$filenum is not equal to $found if stripe_count
is more than 1.
$filenum is not equal to $found if stripe_index
is not the default.
The patch fixes the following:
 Dual-striped files were counted twice, as they
 have objects on both stripes.
 Remove the directory's stripe offset from the sum of stripe
 offsets of the test dirs/files obtained by getstripe -ir.

Author: Alyona Romanenko <alyona.romanenko@seagate.com>

Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity envdefinitions=ONLY=56a
Cray-bug-id: MRP-2738
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Change-Id: I911bf8b40b7688b4341f48409d9c5b57386cfe3d
Reviewed-on: https://review.whamcloud.com/31326
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10210 tests: Add lustre_routes_conversion script in PATH 73/31173/2
Sonia Sharma [Mon, 5 Feb 2018 18:53:00 +0000 (10:53 -0800)]
LU-10210 tests: Add lustre_routes_conversion script in PATH

Fix the typo in test-framework.sh so that test_67 in
conf-sanity.sh finds the lustre_routes_conversion script
when running out of build tree.

Fixing the typo in test-framework.sh for exporting
LUSTRE_ROUTES_CONVERSION to pick the lustre_routes_conversion
script from $LUSTRE/scripts so that it is visible when running
out of build tree.

Change-Id: I1bd9a28e036b9c7b60eaa9886e641610d414c8ee
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/31173
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7854 tests: start gss daemons in sanity-gss 83/27383/15
Sebastien Buisson [Thu, 1 Jun 2017 18:37:32 +0000 (14:37 -0400)]
LU-7854 tests: start gss daemons in sanity-gss

In sanity-gss, launch lsvcgssd with '-z' flag prior to
commencing actual tests. And stop daemons at the end of the script.
The purpose of this patch is just to fix the test script, so passing
test_1 only is fine.

Test-Parameters: trivial envdefinitions=ONLY=1 testlist=sanity-gss
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib118b3735c74bb74a54b323ee8eec91d05491edf
Reviewed-on: https://review.whamcloud.com/27383
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-7854 gss: install lgssc.conf under /etc/request-key.d/ 17/31317/4
Sebastien Buisson [Thu, 15 Feb 2018 14:24:37 +0000 (23:24 +0900)]
LU-7854 gss: install lgssc.conf under /etc/request-key.d/

GSS keys for Lustre are generated via the lgss_keyring user-space
tool. However, the request-key system tool needs to know how to call
lgss_keyring in order to generate keys for Lustre.
This is done by adding the lgssc.conf file under
/etc/request-key.d/, with the following content:
create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S

This file is not packaged if gss keyring is explicitly disabled at
configure time.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibf2eb04584f6a100a57bf00070335cf4cf2c620c
Reviewed-on: https://review.whamcloud.com/31317
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7947 obdclass: Move assignment below LASSERT() 61/24561/5
Arshad Hussain [Tue, 27 Dec 2016 16:45:39 +0000 (22:15 +0530)]
LU-7947 obdclass: Move assignment below LASSERT()

This patch moves 'loghandle->lgh_hdr' assignment call
below LASSERT(). This avoids a case when loghandle parameter
is NULL and dereferencing the NULL pointer would fault
before it reaches LASSERT().
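
As a minimal sketch of the ordering described above: the structure and
function names below are hypothetical stand-ins rather than the llog
code, and assert() is used in place of LASSERT().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the log structures; not the Lustre definitions. */
struct log_hdr { int flags; };
struct log_handle { struct log_hdr *hdr; };

/* Returns the header flags, checking the handle before it is dereferenced. */
static int log_flags(const struct log_handle *lh)
{
	const struct log_hdr *hdr;

	assert(lh != NULL);	/* check first ... */
	hdr = lh->hdr;		/* ... and only then dereference */
	assert(hdr != NULL);
	return hdr->flags;
}
```

With the assignment placed after the check, a NULL handle trips the
assertion with a clear message instead of faulting on the dereference.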

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@seagate.com>
Change-Id: Ie9bcd172a264e104dca300a8bac04d2bd132efb0
Reviewed-on: https://review.whamcloud.com/24561
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8503 tests: fix replay-single/66b test 41/21941/5
Elena Gryaznova [Fri, 16 Feb 2018 09:46:30 +0000 (12:46 +0300)]
LU-8503 tests: fix replay-single/66b test

In replay-single test_66b replace lookup with touch
to ensure a new RPC is sent on each test invocation.

Author: Abrarahmed Momin <abrar.habib@seagate.com>

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Test-Parameters: trivial testlist=replay-single envdefinitions=ONLY=66b
Cray-bug-id: LUS-4868
Seagate-bug-id: MRP-3386
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Change-Id: I87e91cccd8af92fe9ca2002127af934b8b02edfb
Reviewed-on: https://review.whamcloud.com/21941
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10772 utils: incorrect NULL check in free_node() 51/31551/2
Sonia Sharma [Tue, 6 Mar 2018 18:28:09 +0000 (10:28 -0800)]
LU-10772 utils: incorrect NULL check in free_node()

In free_node() in lnet/utils/lnetconfig/cyaml.c, check
for a NULL pointer before dereferencing it.

Issue found by static analysis.

Change-Id: I6298f0f09175b6fd210db5717d44d050b1cb9d8d
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/31551
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10658 utils: check mount_lustre for allocation failure 00/31400/3
Andreas Dilger [Fri, 23 Feb 2018 18:26:40 +0000 (11:26 -0700)]
LU-10658 utils: check mount_lustre for allocation failure

Check if calloc() failed and return an error rather than dereferencing
the NULL pointer.  Found by static analysis.
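
The pattern can be sketched as follows. This is an illustration only:
copy_options() is a hypothetical helper, not a function from
mount_lustre.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy an option string into a zeroed buffer. */
static int copy_options(const char *src, char **out)
{
	size_t len = strlen(src) + 1;
	char *buf = calloc(1, len);

	if (buf == NULL)	/* allocation failed: report it, do not touch buf */
		return -ENOMEM;

	memcpy(buf, src, len);
	*out = buf;
	return 0;
}
```

The caller owns the returned buffer and releases it with free().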

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie4a5e3341fab1de77990fc99df54cdc562dcab07
Reviewed-on: https://review.whamcloud.com/31400
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10752 build: properly package lsvcgss for rpm builds 85/31485/9
James Simmons [Mon, 5 Mar 2018 17:15:16 +0000 (12:15 -0500)]
LU-10752 build: properly package lsvcgss for rpm builds

On some platforms rpm building will fail with the following
errors:

RPM build errors:
    Installed (but unpackaged) file(s) found:
   /etc/init.d/lsvcgss

Technically lsvcgss is a server-only file, so we can just include
it for server builds and only if GSS_KEYRING is set.

Test-Parameters: trivial

Change-Id: I2525916cd10ddea0b99337e1ff4ff967bd9f7f9a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31485
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8856 osd: mark specific transactions netfree 44/31444/9
Alex Zhuravlev [Wed, 3 May 2017 12:45:13 +0000 (15:45 +0300)]
LU-8856 osd: mark specific transactions netfree

osd-zfs should mark some transactions netfree. This means those
transactions are expected to release space (rather than consume it),
and half of the reserved space is available to this kind of transaction.

Change-Id: Ia5ca247843b296319376c4ac69efad68b557df9f
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31444
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
2 years ago LU-684 tests: replace dev_read_only patch with dm-flakey 00/7200/40
Hongchao Zhang [Sat, 3 Mar 2018 06:36:32 +0000 (22:36 -0800)]
LU-684 tests: replace dev_read_only patch with dm-flakey

The dev_read_only kernel patch is mainly used for testing,
in order to simulate a server crash for ldiskfs by discarding
all of the writes to the device.

Since Linux kernel 3.0, this testing functionality can be
simulated by using "dm-flakey" target for device-mapper,
which supports a "drop_writes" parameter that could be used
in place of our dev_read_only kernel patch.

Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I51ff9a1a10fb5bacdc1afa2716b769b5eda41863
Reviewed-on: https://review.whamcloud.com/7200
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10803 ptlrpc: fix req_buffers_max and req_history_max setting 22/31622/3
Wang Shilong [Mon, 12 Mar 2018 11:51:23 +0000 (19:51 +0800)]
LU-10803 ptlrpc: fix req_buffers_max and req_history_max setting

We hit the LU-9372 OOM problems, and after applying
LU-9372 ptlrpc: allow to limit number of service's rqbds
we found two problems:

1) Since 0 is a reserved value for @srv_nrqbds_max that means
unlimited, the procfs write interface should accept this
value; otherwise, there is no way to change back to the default
behavior.

2) The check in ptlrpc_lprocfs_req_history_max_seq_write() was broken
by that patch: the following check always succeeds if @srv_nrqbds_max
is kept at its default value of 0:

val > svc->srv_nrqbds_max/2
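
One way to keep 0 meaning "unlimited" is to apply the bound only when a
limit is set. The sketch below assumes that approach; history_max_ok()
is a hypothetical helper and is not taken from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical validation helper. nrqbds_max == 0 means "no limit",
 * so the upper bound is applied only when a limit is actually set.
 */
static bool history_max_ok(int val, int nrqbds_max)
{
	if (val < 0)
		return false;
	if (nrqbds_max != 0 && val > nrqbds_max / 2)
		return false;
	return true;
}
```

Without the "nrqbds_max != 0" guard, every positive val would exceed
0 / 2 and be rejected while the limit is at its default.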

Change-Id: Ida0796fa500fe595e003accc11d20fdad5e60c03
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31622
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago New tag 2.10.59 2.10.59 v2_10_59 v2_10_59_0
Oleg Drokin [Mon, 12 Mar 2018 15:08:38 +0000 (11:08 -0400)]
New tag 2.10.59

Change-Id: I3d21da3edd4b9851191db9dd0467015787acd5a5
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10794 lfs: make quota work for grace time 06/31606/3
Wang Shilong [Fri, 9 Mar 2018 05:25:07 +0000 (13:25 +0800)]
LU-10794 lfs: make quota work for grace time

The following commit:
LU-10011 utils: refactor lfs quota codes

introduced a regression that makes 'lfs quota -t'
output nothing. Fix this bug and also add
a test case to sanity-quota.sh in case it is broken
again in the future.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I2063552505cf07464d9924f66c29fc2504bc56ce
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31606
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7] 12/31612/2
Bob Glossman [Tue, 6 Mar 2018 22:06:18 +0000 (14:06 -0800)]
LU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7]

Update the RHEL 7.4 kernel to 3.10.0-693.21.1.el7.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ib7d5233d438798e1cdd1c31bb6728f8ea6697959
Reviewed-on: https://review.whamcloud.com/31612
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10750 mdd: declare changelogs only when enabled 77/31477/8
John L. Hammond [Thu, 1 Mar 2018 16:02:09 +0000 (10:02 -0600)]
LU-10750 mdd: declare changelogs only when enabled

In the mdd layer, rename recording_changelog() to
mdd_changelog_enabled() and add the changelog record type as a
parameter. In mdd_changelog_enabled() test to see if the type is
enabled in addition to checking if changelogs are generally enabled
and only look up the ucred if the other tests pass. Add a type
parameter to mdd_declare_changelog_store() so that this information
can be passed to mdd_declare_changelog_store(). In mdd_close() check
if CLOSE changelogs are enabled before opening a transaction and
declaring the record.
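
The per-type test can be pictured with the sketch below. All names in it
(struct changelog_state, enum rec_type, changelog_enabled()) are
hypothetical, and the bitmask representation is an assumption made for
illustration, not the mdd implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record types and state; not the real mdd definitions. */
enum rec_type { REC_CREATE = 0, REC_CLOSE = 1, REC_RENAME = 2 };

struct changelog_state {
	bool     enabled;	/* are changelogs generally on? */
	uint32_t type_mask;	/* bit N set => record type N is recorded */
};

/* Cheap tests first; a costlier lookup would follow only if both pass. */
static bool changelog_enabled(const struct changelog_state *cs,
			      enum rec_type t)
{
	if (!cs->enabled)
		return false;
	return (cs->type_mask & (UINT32_C(1) << t)) != 0;
}
```

A caller would test the specific type before opening a transaction, so
that no work is declared for record types that are switched off.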

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Idd7604de5e97bad72a802cb4b49dae4668b2644a
Reviewed-on: https://review.whamcloud.com/31477
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
2 years ago LU-10465 lov: decrease default stripe size to 1MB 89/31589/2
Jian Yu [Thu, 8 Mar 2018 19:08:59 +0000 (11:08 -0800)]
LU-10465 lov: decrease default stripe size to 1MB

Commit 3f5abc6fa30e7c0256077ccf6a149d1809450465 increased
the default stripe size from 1MB to 4MB. However, this
caused a usability issue in LU-10786 for PFL/DoM files.

This patch changes the default stripe size back to 1MB
until we have a better method of handling DoM components.
Otherwise, it means that DoM files will not be created
easily with default settings.

Change-Id: Ie6b6fe97596ed65abec771b3f37afd950dc821c8
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31589
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago Revert "LU-10419 lfsck: skip dead target" 00/31600/2
Oleg Drokin [Fri, 9 Mar 2018 00:19:51 +0000 (00:19 +0000)]
Revert "LU-10419 lfsck: skip dead target"

This is causing uninterruptible lfsck instances in soak testing documented in LU-10419 by Cliff

This reverts commit 012834c5e7c7be50ff117cee4ac473d7fee4294d.

Change-Id: I119d21c7ce3375140fbbb25a300e65b4c6aa9e73
Reviewed-on: https://review.whamcloud.com/31600
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10722 test: Add version check to sanity-quota test_55 31/31531/2
Wei Liu [Mon, 5 Mar 2018 18:38:43 +0000 (10:38 -0800)]
LU-10722 test: Add version check to sanity-quota test_55

Skip sanity-quota test_55 if server is older than 2.10.58

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Change-Id: Ia8a129298d75fb019699adda07fecd2f4d9eb46a
Reviewed-on: https://review.whamcloud.com/31531
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10705 utils: add "lfs find --blocks" 93/31393/4
Andreas Dilger [Fri, 23 Feb 2018 07:34:22 +0000 (00:34 -0700)]
LU-10705 utils: add "lfs find --blocks"

Add support for "lfs find --blocks|-b <block>" to be able to find
files with the specified number of allocated blocks (in kilobytes or
other specified units). This is distinct from "--size <size>" since
that doesn't properly check the space used for sparse files.
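
The difference between the two measures can be sketched with plain
stat(2) fields. The helpers below are hypothetical and are not the lfs
code; they assume Linux, where st_blocks counts 512-byte units.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Allocated space in KiB; assumes st_blocks is in 512-byte units (Linux). */
static long long allocated_kib(const struct stat *st)
{
	return ((long long)st->st_blocks * 512 + 1023) / 1024;
}

/* Apparent size in KiB, as a size-based match would see it. */
static long long apparent_kib(const struct stat *st)
{
	return ((long long)st->st_size + 1023) / 1024;
}
```

For a sparse file the apparent size can be far larger than the allocated
space, which is why matching on blocks gives a different answer than
matching on size.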

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7d48f919d95242c11ef7d3075ecc3f7e963ebbe5
Reviewed-on: https://review.whamcloud.com/31393
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10596 tests: skip tests require remote server with nodsh 21/31121/6
Elena Gryaznova [Sun, 4 Mar 2018 18:17:38 +0000 (21:17 +0300)]
LU-10596 tests: skip tests require remote server with nodsh

Patch fixes the following tests to be skipped for remote
servers with nodsh set:
sanity 56c, 60aa, 77c, 101g, 160f, 160g, 161d
Patch skips 160f and 160g for old MDS.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity
Cray-bug-id: MRP-4757, LUS-5710
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I44f35129df5bc5c8c6e6ace3e68f3f2d400db86c
Reviewed-on: https://review.whamcloud.com/31121
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved 97/30397/4
Sergey Cheremencev [Wed, 6 Dec 2017 13:52:33 +0000 (16:52 +0300)]
LU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved

osp_precreate_cleanup_orphans could be blocked due to
reserved objects. In such a case it sets the opd_pre_recovering
flag and waits until opd_pre_reserved becomes 0.
Thus we need to wake it up when opd_pre_reserved is reset
to 0.
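
A user-space analogue of this wake-up rule, using a POSIX mutex and
condition variable in place of the kernel wait queue, might look like
the sketch below. The struct and function names are hypothetical.

```c
#include <pthread.h>

/* Hypothetical reservation counter guarded by a mutex and condition variable. */
struct precreate {
	pthread_mutex_t lock;
	pthread_cond_t  waitq;
	int             reserved;
};

/* Drop one reservation; wake waiters when the count reaches zero. */
static void unreserve(struct precreate *p)
{
	pthread_mutex_lock(&p->lock);
	if (--p->reserved == 0)
		pthread_cond_broadcast(&p->waitq);
	pthread_mutex_unlock(&p->lock);
}

/* Block until no reservations remain (the orphan-cleanup side). */
static void wait_unreserved(struct precreate *p)
{
	pthread_mutex_lock(&p->lock);
	while (p->reserved != 0)
		pthread_cond_wait(&p->waitq, &p->lock);
	pthread_mutex_unlock(&p->lock);
}
```

If the decrement path skipped the broadcast, a thread sleeping in
wait_unreserved() would have no reason to re-check the counter and could
wait indefinitely.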

Change-Id: Ib8d4708685c3c9675872577985a4c6897e3ee385
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Cray-bug-id: MRP-3623
Reviewed-on: https://review.whamcloud.com/30397
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9160 ldiskfs: preload block group descriptors 22/25722/7
Artem Blagodarenko [Sat, 18 Feb 2017 09:00:13 +0000 (12:00 +0300)]
LU-9160 ldiskfs: preload block group descriptors

With a 300TB OST, mount was slow, taking 13 minutes.
With this patch applied, mount time dropped to 30s.
Since the patch greatly reduces mount time, backport
it from upstream Linux.

Linux-commit: 85c8f176a6111ecde9c158109989dbd445a0e59a

With the meta_bg option enabled, block group descriptor
read I/O is not sequential and requires optimization.

Seagate-bug-id: MRP-4129
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: Iaa621c11ff88364021887d9f9dcec250dd5fd955
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/25722
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-10723 tests: disable sanity 232b before 2.10.58 87/31487/2
Quentin Bouget [Fri, 2 Mar 2018 08:22:25 +0000 (08:22 +0000)]
LU-10723 tests: disable sanity 232b before 2.10.58

The fix that allows test_232b of sanity.sh to pass was introduced in
lustre 2.10.58 so the test should not be run before this version.

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I7c625e916bfd0d4a614cc9924670bffe4ba3b8b0
Reviewed-on: https://review.whamcloud.com/31487
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs 24/30824/10
Sebastien Buisson [Wed, 10 Jan 2018 14:37:24 +0000 (23:37 +0900)]
LU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs

In file lustre/include/uapi/linux/lustre/lustre_user.h, replace direct
use of FMODE_READ and FMODE_WRITE with MDS_* equivalents.
That will avoid name clashes with the kernel symbols, and avoid
problems if their values ever change.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I07e77d8d025c5ddb3dc4e085738645e20fb77d0c
Reviewed-on: https://review.whamcloud.com/30824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10003 lnet: remove lctl deprecation messages 34/31534/3
John L. Hammond [Mon, 5 Mar 2018 23:11:25 +0000 (17:11 -0600)]
LU-10003 lnet: remove lctl deprecation messages

Defer deprecation of these commands for now.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I09b97bacded9ac65a8c5df3ba47867a6a19fbf7b
Reviewed-on: https://review.whamcloud.com/31534
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
2 years ago LU-10419 lfsck: skip dead target 75/31475/2
Fan Yong [Thu, 1 Mar 2018 06:30:36 +0000 (14:30 +0800)]
LU-10419 lfsck: skip dead target

Do not send LFSCK RPCs to dead targets, to avoid being blocked.
The patch adds a warning message when trying to send an LFSCK RPC
on a non-full connection, which helps to understand why the
LFSCK may be blocked.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0599eb961f1aabd58d0de53fd51f25ca1ec8ff34
Reviewed-on: https://review.whamcloud.com/31475
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10769 osd-zfs: fix deadlock on osd_object::oo_guard 11/31511/4
Fan Yong [Mon, 5 Mar 2018 11:35:02 +0000 (19:35 +0800)]
LU-10769 osd-zfs: fix deadlock on osd_object::oo_guard

There is a race condition inside osd-zfs that may cause a deadlock.
Consider the following scenario:

1) Thread1 calls osd_attr_set() to set flags on the object.
   osd_attr_set() will call osd_xattr_get() while holding
   the read mode semaphore on object::oo_guard.

2) Thread2 calls osd_declare_destroy() to destroy the same
   object. It will down_write() on object::oo_guard, but be
   blocked by Thread1's granted read mode semaphore.

3) The osd_xattr_get() triggered by osd_attr_set() will also
   down_read() on object::oo_guard, but it will be blocked by
   Thread2's pending down_write() request.

Then Thread1 and Thread2 deadlock.
This patch makes osd_attr_set() call the lockless version of
xattr_get, osd_xattr_get_internal(), to avoid such a deadlock.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaac2e414b5f1fd197303bb7ec7d5e2763b6f3e9a
Reviewed-on: https://review.whamcloud.com/31511
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-10681: Disable tiny writes for append 53/31353/8
Patrick Farrell [Sat, 3 Mar 2018 22:59:43 +0000 (16:59 -0600)]
LU-10681: Disable tiny writes for append

Unfortunately, tiny writes do not work correctly with
appending to files.  When appending to a file, we must take
DLM locks to EOF on all stripes, in order to protect file
size so we can append correctly.

If we dirty a page with a normal write then append to it
with a tiny write, these DLM locks are not present, and we
can use an incorrect size if another client writes to a
different stripe, increasing the size without cancelling
the lock which is protecting our dirty page.

We could theoretically check to make sure the required DLM
locks are held, but this would be time consuming.

The simplest solution is to just not allow tiny writes when
appending.

Also add option to disable tiny writes at runtime.
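
The resulting decision can be sketched as a small predicate. This is an
assumption-laden illustration: tiny_write_allowed() is hypothetical, and
treating a tiny write as "smaller than one page" is a simplification of
whatever conditions the client really applies.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: use the tiny-write fast path only when it is
 * enabled at runtime, the file is not opened for append, and the write
 * is smaller than one page (a simplifying assumption).
 */
static bool tiny_write_allowed(int open_flags, size_t count,
			       size_t page_size, bool tiny_write_enabled)
{
	if (!tiny_write_enabled)
		return false;
	if (open_flags & O_APPEND)
		return false;
	return count < page_size;
}
```

Appending writes then always take the normal path, which acquires the
DLM locks needed to protect the file size.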

Cray-bug-id: LUS-5723

Change-Id: Ic9421faa3d0268d907040881e8ba3c894261fd49
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31353
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10520 mkfs: enable extents for big MDT 37/31037/13
Yang Sheng [Fri, 26 Jan 2018 13:35:33 +0000 (21:35 +0800)]
LU-10520 mkfs: enable extents for big MDT

Enable extents when the MDT size is bigger than 16T.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iccd39c48e715a3f084cb5ee803be0541563f5d10
Reviewed-on: https://review.whamcloud.com/31037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
2 years ago LU-10680 mdd: disable changelog garbage collection by default 52/31552/2
John L. Hammond [Tue, 6 Mar 2018 19:25:50 +0000 (13:25 -0600)]
LU-10680 mdd: disable changelog garbage collection by default

Changelog garbage collection has introduced some instability so
disable it by default.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I708198d76af060cb796de89266ee74a968f92ac1
Reviewed-on: https://review.whamcloud.com/31552
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10786 tests: add stripe size to lfs setstripe 69/31569/3
James Nunez [Wed, 7 Mar 2018 16:27:59 +0000 (09:27 -0700)]
LU-10786 tests: add stripe size to lfs setstripe

Since the default stripe size increased from one to four
MB, we need to add the stripe size parameter to calls
to 'lfs setstripe' for composite files when the component
size is less than the file system stripe size. Thus, add
the stripe size parameter to calls to 'lfs setstripe' for
sanity-flr tests 45 and 46 and sanity-pfl test 16.

Test-Parameters: trivial testlist=sanity-flr,sanity-pfl

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic169eaebd922175467f010b159a2b065fb91b3fb
Reviewed-on: https://review.whamcloud.com/31569
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8066 fid: move all files from procfs to debugfs 66/28366/10
James Simmons [Sun, 21 Jan 2018 16:55:10 +0000 (11:55 -0500)]
LU-8066 fid: move all files from procfs to debugfs

Linux-commit: f3aa79fbef7942971825fb2084a88e9527c6b04c

Besides the client port from upstream, also port the server-side
proc entries to debugfs.

Change-Id: I934fc5a39c8c407799abd0d6154240d3a579c93e
Signed-off-by: Dmitry Eremin <dmiter4ever@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28366
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8066 obd: final pieces for sysfs/debugfs support. 08/28108/24
James Simmons [Thu, 22 Feb 2018 17:26:16 +0000 (12:26 -0500)]
LU-8066 obd: final pieces for sysfs/debugfs support.

This patch puts in place the basics needed for debugfs.
It also creates class_setup_tunables so sysfs kobject
creation is handled for both obd_devices and llite. Add a
special LDEBUGFS_FOPS_WR_ONLY since often in this case
i_private is not set so any attempt to call PDE_DATA(inode)
will cause it to crash. Make lprocfs_obd_setup select either
debugfs or procfs but not both.

Handle the special symlinks needed for both debugfs
and sysfs with the server case. For lod we need to
create "lov" and osp we create "osc" for both sysfs
and debugfs. Handle the complex case of when a node
is both a server and client. For debugfs we can take
advantage of d_lookup() and for sysfs kset_find_obj()
to avoid special access to struct obd_type. This also
places the burden on the server lod/osp modules instead
of the client lov/osc modules.

Change-Id: I87090859db4da2300ab9e2aa3c23cb3773276103
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28108
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10551 lod: obd_fid_alloc() could start a nested trans 68/31268/10
Bobi Jam [Mon, 12 Feb 2018 05:44:48 +0000 (13:44 +0800)]
LU-10551 lod: obd_fid_alloc() could start a nested trans

* obd_fid_alloc() could possibly start a nested transaction, which
  would reset the OI cache. So we add an
  osd_thread_info::oti_ins_cache_depth to prevent clearing the OI cache
  in the nested transaction.

* Add more debug messages in osd_idc_find_or_init()/
  osd_idc_find_and_init()
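
The depth-counter idea can be sketched as follows. The names are
hypothetical, and the sketch assumes the cache is cleared when a
transaction stops; the real code may reset it at a different point.

```c
/* Hypothetical per-thread state; the field names are illustrative only. */
struct thread_info {
	int cache_depth;	/* transaction nesting level */
	int cache_used;		/* number of cached entries */
};

static void trans_start(struct thread_info *ti)
{
	ti->cache_depth++;
}

/* Only the outermost transaction stop clears the cache. */
static void trans_stop(struct thread_info *ti)
{
	if (--ti->cache_depth == 0)
		ti->cache_used = 0;
}
```

A nested start/stop pair leaves cache_used untouched, so entries cached
by the outer transaction remain valid until that transaction ends.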

Test-Parameters: alwaysuploadlogs envdefinitions=PTLDEBUG=-1 testlist=sanity-pfl ostfilesystemtype=zfs mdtfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Id75fd1787ffc0f47bbf110d460f23db6c34670da
Reviewed-on: https://review.whamcloud.com/31268
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10607 osd-zfs: skip io stat for OI scrub 80/31180/5
Fan Yong [Wed, 28 Feb 2018 03:36:52 +0000 (11:36 +0800)]
LU-10607 osd-zfs: skip io stat for OI scrub

It is unnecessary to stat io for OI scrub triggered request.
On the other hand, the OI setup logic may read/write the OI
scrub file. At that time, related lproc (including io stat)
for such OSD is not initialized yet. So this patch skips io
stat for OI scrub.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9498c1351c1875ac9aa46eed5189cb61a6d102ac
Reviewed-on: https://review.whamcloud.com/31180
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-5991 obd: fix mount error handing 59/12959/9
Vladimir Saveliev [Thu, 15 Feb 2018 15:40:16 +0000 (18:40 +0300)]
LU-5991 obd: fix mount error handing

lustre_fill_super() allocates lsi and assumes that on failures lsi
will be freed by server_fill_super() or ll_fill_super().
- server_fill_super() does not free lsi when lsi_prepare() fails.
- ll_fill_super() does not free lsi when OBD_ALLOC_PTR(cfg) or
ll_init_sbi() fail.

osd_device_fini() needs osd_index_backup(). Otherwise
struct lustre_index_backup_unit-s leak if server_fill_super() fails
after osd_start().

Cray-bug-id: MRP-2229
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I366dc2b46a504a65b030bcbf687998dd0676f404
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/12959
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10737 misc: Wrong checksum return value 48/31448/2
Qian Yingjin [Wed, 28 Feb 2018 09:22:01 +0000 (17:22 +0800)]
LU-10737 misc: Wrong checksum return value

In the checksum calculation functions tgt_checksum_niobuf and
osc_checksum_bulk, the error return value of
cfs_crypto_hash_init is wrongly taken as the checksum value.
This patch fixes the problem.
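
One way to keep the two values apart is to return the status code and
pass the checksum through an output parameter. The sketch below assumes
that shape; hash_init() and compute_checksum() are hypothetical, and the
byte sum merely stands in for a real hash.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash setup step that can fail. */
static int hash_init(int alg_supported)
{
	return alg_supported ? 0 : -EINVAL;
}

/* Returns 0 or a negative errno; the checksum travels through *cksum only. */
static int compute_checksum(int alg_supported, const uint8_t *buf, size_t len,
			    uint32_t *cksum)
{
	uint32_t sum = 0;
	size_t i;
	int rc = hash_init(alg_supported);

	if (rc != 0)
		return rc;	/* never reported as a checksum value */

	for (i = 0; i < len; i++)
		sum += buf[i];	/* simple byte sum as a placeholder hash */
	*cksum = sum;
	return 0;
}
```

On failure *cksum is left untouched, so a caller cannot mistake an error
code for valid checksum data.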

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I647c402deeab00ec5c6437423b0cab250b42c3e5
Reviewed-on: https://review.whamcloud.com/31448
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-4423 lov: use correct env in lov_io_data_version_end() 18/31418/4
NeilBrown [Wed, 28 Feb 2018 02:50:23 +0000 (21:50 -0500)]
LU-4423 lov: use correct env in lov_io_data_version_end()

lov - the logical object volume manager - is responsible for
striping data across multiple volumes.

So when it is given a request, it creates one or more
sub-requests, one for each target volume.  Each sub_io
request has a sub_env environment which it operates in.

When lov_io_data_version_end() calls lov_io_end_wrapper() to
wait for and close off a sub_io, it passes the wrong
environment.

This causes an LINVRNT() to fail in cl2osc_io(), and may
cause other problems.

This patch changes the call to use ->sub_env, much like
other code in the same file.

Change-Id: Id120929f4189196232d18103007e45ba89195fff
Fixes: fcd45488711a (LU-5683 clio: add CIT_DATA_VERSION)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31418
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10421 echo: use echo layer when finding stripe object 38/31338/3
John L. Hammond [Fri, 16 Feb 2018 18:55:05 +0000 (12:55 -0600)]
LU-10421 echo: use echo layer when finding stripe object

In echo_md_dir_stripe_choose(), find the stripe object using the echo
device rather than the down layer (mdd) device. mdd objects are not
equipped to be top layer objects and should not be found in this way.

Test-Parameters: trivial testlist=mds-survey
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibb396ae64b6d542c64697336d227e06163a0bb39
Reviewed-on: https://review.whamcloud.com/31338
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
2 years agoLU-10662 llite: Add exit for filedata allocation failed 96/31296/3
Ben Evans [Tue, 13 Feb 2018 19:20:18 +0000 (14:20 -0500)]
LU-10662 llite: Add exit for filedata allocation failed

When the filedata allocation fails, we need to exit to
a later point than out_openerr, which calls
deauthorize_statahead and ll_file_data_put, neither of
which is valid. (This leads to a panic.)

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I670d578f01b2731761e3149db36dd8da1551a30a
Cray-bug-id: LUS-1321
Reviewed-on: https://review.whamcloud.com/31296
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10656 ldlm: fix export reference 39/31139/5
Hongchao Zhang [Sun, 28 Jan 2018 19:25:42 +0000 (03:25 +0800)]
LU-10656 ldlm: fix export reference

In ptlrpc_connect_interpret, the export reference could be
leaked if there is an error before the following class_exp_put.

Change-Id: I9ddd82fa1bbf8e17079e9746202be63e6233c052
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31139
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10582 out: can't obtain remote acl xattr 88/31088/5
Andriy Skulysh [Tue, 12 Dec 2017 13:39:14 +0000 (15:39 +0200)]
LU-10582 out: can't obtain remote acl xattr

osp_xattr_get() fails due to a hardcoded
reply size limitation.

With large_xattr enabled ddp_max_ea_size can be
almost 1MB and out_handles fails to send such a big
reply.

Limit the maximum ACL buffer size to XATTR_SIZE_MAX.
Limit ddp_max_ea_size so that the resulting reply
request fits into LNET_MTU.

Cray-bug-id: MRP-4724
Change-Id: I6405330605809911c3f814fe5cb9d476d7ac40ed
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31088
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9934 build: support for gcc7 10/28810/10
Alex Zhuravlev [Fri, 23 Feb 2018 17:50:12 +0000 (12:50 -0500)]
LU-9934 build: support for gcc7

Suppress a few false warnings with a compiler option when
building the lustre kernel modules. Also include a few other
fixes to make lustre buildable on Fedora.

Change-Id: If14d226e5d92ae9ce54e216d032df94d9398654e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/28810
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10465 lov: increase default stripe size to 4MB 51/27151/13
Jian Yu [Tue, 27 Feb 2018 08:28:40 +0000 (00:28 -0800)]
LU-10465 lov: increase default stripe size to 4MB

Increase the default stripe size from 1MB to 4MB
so that widely-striped files can generate full RPCs
without pinning so much memory on the client.

The patch also renames STRIPE_BYTES and STRIPES_PER_OBJ
to DEF_STRIPE_SIZE and DEF_STRIPE_COUNT in cfg/local.sh,
and unsets them to support formatting Lustre filesystem
with default stripe size and count.

Change-Id: I59d1fdb3e30599c125e0e5e800d168921bd69098
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/27151
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9727 mdd: properly call recording_changelog() 56/31456/2
Sebastien Buisson [Wed, 28 Feb 2018 16:18:32 +0000 (01:18 +0900)]
LU-9727 mdd: properly call recording_changelog()

recording_changelog() must be called everywhere in the code instead
of directly checking (mdd->mdd_cl.mc_flags & CLM_ON).

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9ed5aac4871573e6aea94cfd4dc46b95d5df1e4a
Reviewed-on: https://review.whamcloud.com/31456
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL 79/30379/6
James Simmons [Mon, 26 Feb 2018 16:11:02 +0000 (11:11 -0500)]
LU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL

The lustre specific cfs_size_* macros can be easily replaced with
the __ALIGN_KERNEL macro provided by the linux kernel for our
user land code. This brings us closer to building against the
upstream client.
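
For readers unfamiliar with the idiom: __ALIGN_KERNEL(x, a) rounds x up to
the next multiple of a, where a is a power of two. A minimal sketch of the
same arithmetic follows, using a local helper named demo_align (a
hypothetical name, so the example does not depend on which kernel header
provides the macro on a given system). That the removed cfs_size_* macros
performed this kind of fixed-size rounding is an assumption here, not
something the message above spells out.

```c
#include <stddef.h>

/*
 * Round x up to the next multiple of a. The value a must be a power
 * of two; this mirrors the mask-based round-up used by the kernel's
 * alignment macros.
 */
static size_t demo_align(size_t x, size_t a)
{
	size_t mask = a - 1;

	return (x + mask) & ~mask;
}
```

For example, demo_align(13, 8) yields 16, while a value that is already
aligned, such as demo_align(16, 8), is returned unchanged.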

Change-Id: I5cd261807f60296eaac884b66f084c128adc5b01
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30379
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter 79/26879/6
Bobi Jam [Fri, 28 Apr 2017 06:18:47 +0000 (14:18 +0800)]
LU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter

Add a "lfs setstripe --copy=<lustre_src> <lustre_file_or_dir_dst>"
usage to set stripe using stripe info from a source lustre file/dir.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibcd80f98c53bdff5b41ba9b1010fceefd6c9d8b7
Reviewed-on: https://review.whamcloud.com/26879
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
2 years agoLU-8910 osp: Add correct handling of errors to osp_statfs_interpret 67/24167/5
Sergey Cheremencev [Tue, 6 Dec 2016 09:16:05 +0000 (12:16 +0300)]
LU-8910 osp: Add correct handling of errors to osp_statfs_interpret

MDT's statfs info could disagree with OST's info for a very long time.
If osp_statfs_update() is called and extends the timeout 1000*obd_timeout
into the future but then osp_statfs_interpret() hits an error, it
will never reset the timeout.
Now when an osp_update_statfs request fails, osp_statfs_interpret causes
osp_precreate_cleanup_orphans to send a new one after 10 seconds.

Change-Id: Ib282d806ba4932db5c72df34905988f96de99297
Cray-bug-id: MRP-3892
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://review.whamcloud.com/24167
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10212 test: ESTALE read 01/31101/7
Alexander Boyko [Wed, 31 Jan 2018 11:17:42 +0000 (06:17 -0500)]
LU-10212 test: ESTALE read

The patch reproduces the issue where a read RPC comes
to the OST with a lock handle which has the LDLM_FL_DESTROY
flag, and the client then gets the ESTALE error for the read
operation.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: MRP-4604
Change-Id: I0722fc57a61153b25a05bf7aebce5d7f32bbc95b
Reviewed-on: https://review.whamcloud.com/31101
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
2 years agoLU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64] 55/31255/5
Bob Glossman [Fri, 9 Feb 2018 18:26:17 +0000 (10:26 -0800)]
LU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I98f3784eddc05e4faf00091e05b751d78090f66d
Reviewed-on: https://review.whamcloud.com/31255
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10655 tests: eliminate 'ssh exited with exit code 1' 63/31263/5
Vladimir Saveliev [Sat, 10 Feb 2018 10:21:44 +0000 (13:21 +0300)]
LU-10655 tests: eliminate 'ssh exited with exit code 1'

Eliminate meaningless 'ssh exited with exit code 1' issued by stop()
and wait_exit_ST()

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena V. Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-1483
Change-Id: Ie1af3cda0b48b7bf482ea35b84c93e38d0f6c0a9
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/31263
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10716 tests: skip sanity 56xb for old server 16/31416/2
Elena Gryaznova [Mon, 26 Feb 2018 13:15:20 +0000 (16:15 +0300)]
LU-10716 tests: skip sanity 56xb for old server

Patch skips sanity test_56xb for servers which
do not contain LU-6051.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2609
Test-Parameters: trivial testlist=sanity envdefinitions="ONLY=56bx"
Change-Id: I1b03f5e1c144dedef2bdd7b0c46e431d6761eb47
Reviewed-on: https://review.whamcloud.com/31416
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10712 tests: skip conf-sanity 108[a,b] for old server 13/31413/2
Elena Gryaznova [Mon, 26 Feb 2018 00:59:22 +0000 (03:59 +0300)]
LU-10712 tests: skip conf-sanity 108[a,b] for old server

Patch skips test_108a and test_108b for old servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5691
Test-Parameters: trivial testlist=conf-sanity
Change-Id: I7494e15710ac2663a0e01a9d3568fa5bcd590a6a
Reviewed-on: https://review.whamcloud.com/31413
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10684 tests: skip recovery-small 110[h-j] 50/31350/3
Elena Gryaznova [Tue, 20 Feb 2018 15:12:15 +0000 (18:12 +0300)]
LU-10684 tests: skip recovery-small 110[h-j]

Skip test_110h, test_110i, test_110j for old
servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2589
Test-Parameters: trivial testlist=recovery-small envdefinitions=ONLY=110
Change-Id: I9d89fcdc55b5d1d1fd4004d8e09e0297eb4bc595
Reviewed-on: https://review.whamcloud.com/31350
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10672 lnet: pass in only time64_t to lnet_notify 39/31339/3
James Simmons [Wed, 21 Feb 2018 18:22:02 +0000 (13:22 -0500)]
LU-10672 lnet: pass in only time64_t to lnet_notify

With the migration to 64-bit seconds, some calls to lnet_notify
did not get updated to use time64_t. Update those call sites.
Also, for the ioctl IOC_LIBCFS_NOTIFY_ROUTER we pass in the number
of seconds since the epoch. We subtract the current epoch time but
we missed adding in the current number of seconds since boot,
which is what the lnet ping code expects to be used.
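
The clock conversion described above can be sketched as a small helper.
This is only an illustration of the arithmetic: demo_time64_t and
demo_epoch_to_boot are hypothetical names, and the real code obtains the
two "now" values from kernel time APIs that are not shown here.

```c
#include <stdint.h>

/* Hypothetical 64-bit seconds type standing in for time64_t. */
typedef int64_t demo_time64_t;

/*
 * Convert a wall-clock timestamp (seconds since the epoch) into the
 * seconds-since-boot clock: take the offset relative to "now" on the
 * epoch clock, then add "now" on the boot clock.
 */
static demo_time64_t demo_epoch_to_boot(demo_time64_t when_epoch,
					demo_time64_t now_epoch,
					demo_time64_t now_boot)
{
	return when_epoch - now_epoch + now_boot;
}
```

Dropping the final addition leaves only a (usually negative) offset, which
is meaningless to code that compares against seconds since boot.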

Change-Id: I5a92df08cdaf3b747fd17721a92038df05669a81
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31339
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10604 osd: define couple fields as bitfield 97/31097/5
Alex Zhuravlev [Wed, 31 Jan 2018 06:08:02 +0000 (09:08 +0300)]
LU-10604 osd: define couple fields as bitfield

Redefine oo_compat_dot_created and oo_compat_dotdot_created
as bitfields to save 8 bytes per object.
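
A rough sketch of why this saves space is given below. The structs are
hypothetical and much smaller than the real object; the original field
types are not stated above, so plain unsigned int flags are assumed, and
the exact sizes depend on the ABI.

```c
/* Before (assumed layout): each flag occupies a full unsigned int. */
struct demo_obj_plain {
	void *ptr;			/* stands in for the other members */
	unsigned int dot_created;
	unsigned int dotdot_created;
	unsigned int other_flag:1;	/* a pre-existing bitfield */
};

/* After: both flags share the storage unit of the existing bitfield. */
struct demo_obj_packed {
	void *ptr;
	unsigned int other_flag:1,
		     dot_created:1,
		     dotdot_created:1;
};
```

On a typical LP64 ABI, sizeof() reports 24 bytes for the first struct and
16 for the second. Bitfield layout is implementation-defined, so these
figures are not guaranteed on every compiler.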

Change-Id: I92dafc693f1d118debc251d7d064b206e36624f0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31097
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7787 mdd: clean up orphan object handling 47/30547/8
Andreas Dilger [Thu, 14 Dec 2017 21:53:58 +0000 (14:53 -0700)]
LU-7787 mdd: clean up orphan object handling

There was a potential problem in the orphan object naming because
it had an embedded space in the filename before the "operation",
which might cause issues if they are accessed for other reasons.
It turns out that there is no need for the "operation" to be
embedded into the filename, since it was always ORPH_OP_UNLINK.

Use standard DFID formatting for the orphan object names, which
is a bit shorter and more efficient on disk, without the embedded
operation type.

Remove the use of "ORPH_OP_UNLINK" in the code, except in the
compatibility code for handling orphans left over after upgrades
from older Lustre versions.  This can be removed at some point
in the future when there are no longer upgrades from pre-2.11
versions.

Rename the orphan handling functions to start with mdd_orphan_*
for consistency with other MDD functions:
orph_index_init -> mdd_orphan_index_init
orph_index_iterate -> mdd_orphan_index_iterate
orph_index_fini -> mdd_orphan_index_fini
orph_declare_index_insert -> mdd_orphan_declare_insert
orph_key_test_and_del -> mdd_orphan_key_test_and_delete
orph_key_fill -> mdd_orphan_key_fill
orph_key_fill_18 -> mdd_orphan_key_fill_20
__mdd_orphan_add -> mdd_orphan_insert
__mdd_orphan_del -> mdd_orphan_delete
__mdd_orphan_cleanup -> mdd_orphan_cleanup_thread

Remove single-line wrapper functions to clarify actual code:
mdd_orphan_write_lock -> dt_write_lock
mdd_orphan_write_unlock -> dt_write_unlock
mdd_orphan_delete_obj -> dt_delete
mdd_orphan_ref_add -> dt_ref_add
mdd_orphan_ref_del -> dt_ref_del

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ica90cc03c3212103c39cba11c4566584bf9cab07
Reviewed-on: https://review.whamcloud.com/30547
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9325 obd: replace lprocfs_str_to_s64 39/30539/16
James Simmons [Fri, 9 Feb 2018 23:46:39 +0000 (18:46 -0500)]
LU-9325 obd: replace lprocfs_str_to_s64

The original goal of lprocfs_str_to_s64[_with_units] was to allow
passing in values with different unit suffixes, e.g. 64K, to a proc file.
There are a few problems with the implementation that prevent its
direct use with sysfs/debugfs. The first problem is that
lprocfs_str_to_s64() was used in a lot of cases where it doesn't
make sense to use it. Often it was used for passing in bool values,
or a value was retrieved as signed 64 bit and then checked to be in
range of some other unit size. For these cases we can simply move
to kstrtoXXX_from_user(). To handle the case of bool values we
add in support for kstrtobool_from_user().

Replace the lprocfs_rd_uint() and lprocfs_wr_uint() generic callbacks
with a simpler, more direct implementation of ldlm_rw_uint_fops.

There's a slight change in lustre debugfs write semantics: Using kstrtox
causes EINVAL when the written number is followed by other (garbage)
characters, whereas previously the garbage would be ignored and such a
write would succeed.
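
The change in write semantics can be approximated in userspace C. This is
a sketch built on strtoll(), not the kernel implementation:
demo_parse_lenient and demo_parse_strict are hypothetical names, and
strtoll() skips leading whitespace, which the kernel's kstrtox helpers do
not.

```c
#include <errno.h>
#include <stdlib.h>

/* Old-style behaviour: parse the leading number, ignore what follows. */
static int demo_parse_lenient(const char *s, long long *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 0);
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;
	*out = v;
	return 0;
}

/*
 * kstrtox-like behaviour: the whole string must be a number, with at
 * most one trailing newline; anything else is rejected with -EINVAL.
 */
static int demo_parse_strict(const char *s, long long *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 0);
	if (end == s)
		return -EINVAL;
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	*out = v;
	return 0;
}
```

With these helpers the input "64K" parses as 64 under the lenient rules
but fails with -EINVAL under the strict ones, while "64\n" is accepted by
both.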

Linux-commit: 8b23093269c84b0da1201e1949c91d0beb9892ef

Change-Id: I39f0ba3dc72685fe6e29c7077f37ad4e69a20b4a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Mathias Rav <mathiasrav@gmail.com>
Reviewed-on: https://review.whamcloud.com/30539
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10260 hsm: enable max archive_id posix copytool 71/30171/6
Thomas Stibor [Mon, 20 Nov 2017 15:36:57 +0000 (16:36 +0100)]
LU-10260 hsm: enable max archive_id posix copytool

The current maximum archive-id in posix copytool is
limited to id < LL_HSM_MAX_ARCHIVE. However, the Lustre HSM
implementation checks as follows for the maximum:
if (id > LL_HSM_MAX_ARCHIVE) then flag ERROR.
Thus the archive ids are in the
range 0,1,..,32 = LL_HSM_MAX_ARCHIVE, and therefore
32 = LL_HSM_MAX_ARCHIVE should be included.
Note, archive-id = 0 is reserved to specify to listen to
ANY archive id and is used as the default when no archive-id
option is provided.
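
The off-by-one between the two checks can be sketched as follows. The
constant and function names are hypothetical stand-ins; the value 32 is
taken from the description above.

```c
/* Stand-in for LL_HSM_MAX_ARCHIVE, stated above to be 32. */
#define DEMO_HSM_MAX_ARCHIVE 32

/* Old copytool check: rejects the highest valid id. */
static int demo_archive_id_ok_old(unsigned int id)
{
	return id < DEMO_HSM_MAX_ARCHIVE;
}

/* Check consistent with the HSM core, which only rejects id > max. */
static int demo_archive_id_ok(unsigned int id)
{
	return id <= DEMO_HSM_MAX_ARCHIVE;
}
```

Only id 32 is treated differently by the two functions: the old check
refuses it, the inclusive check accepts it, and both still reject 33.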

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: I6289c8c0e7d86b05f1f2d821b7f6b3127e5fa352
Reviewed-on: https://review.whamcloud.com/30171
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10244 osc: add a bit to indicate osc_page in cache tree 96/30096/6
Bobi Jam [Wed, 15 Nov 2017 07:02:30 +0000 (15:02 +0800)]
LU-10244 osc: add a bit to indicate osc_page in cache tree

Add osc_page::ops_intree to indicate whether the osc_page is in the
osc_object's cache tree, so that when the page cannot be inserted into
the cache because of a race, the cleanup code won't try to remove it
from the cache.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ifcfe158d10c23a40c116414c7f4f86b257e1fa76
Reviewed-on: https://review.whamcloud.com/30096
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9669 tests: check required nrs availability on a facet 60/27660/7
Elena Gryaznova [Fri, 16 Feb 2018 16:59:34 +0000 (19:59 +0300)]
LU-9669 tests: check required nrs availability on a facet

sanityn/77[abcdefg], 78 failed with interop testing due to
missing nrs policy related proc entries on the OSS/MGS/MDS node.

The fix is to check for availability of a required nrs on a facet.
The patch removes the version-based checks from the basic NRS policies
regression tests to allow interop testing with
old servers that have the NRS feature backported.

Author: Jadhav Vikram <jadhav.vikram@seagate.com>

Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanityn
Cray-bug-id: LUS-5259
Seagate-bug-id: MRP-3999
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Change-Id: If0eca183ac388d481ddb3b1d39e0c9def5dd0c37
Reviewed-on: https://review.whamcloud.com/27660
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9624 tests: fix pre-DNE test exceptions/llog usage 35/27535/31
Andreas Dilger [Thu, 8 Jun 2017 20:27:50 +0000 (14:27 -0600)]
LU-9624 tests: fix pre-DNE test exceptions/llog usage

Remove some test skips when running with multiple MDTs in DNE mode,
or fix tests to work better with multiple MDTs.  Tests updated are:
recovery-small: 60
sanity: 17hi, 154ab, 160abcde, 161abcd, 162a, 205, 225ab, 254, 256

In particular, sanity.sh test_160, test_161, test_162 ignored test
failures in DNE mode.  Fix test_160* to work with ChangeLogs stored
on multiple MDTs.  This adds test coverage both because we aren't
skipping these tests when running in DNE mode, and because we
are now validating ChangeLogs running on multiple MDTs at once.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity,recovery-small,sanity-hsm
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3fc3ce85b46f34e507c1e28b4c76574a698cab07
Reviewed-on: https://review.whamcloud.com/27535
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8912 nodemap: fix contiguous range support 97/24397/4
Kit Westneat [Thu, 15 Dec 2016 23:45:00 +0000 (07:45 +0800)]
LU-8912 nodemap: fix contiguous range support

This patch fixes the contiguous range check to allow the addition of
multiple "full" ([0-255]) ranges. As part of this change,
is_contiguous and find_min_max are combined as they were always
called together and the logic is fairly similar. This also removes
the multiple range expression support, since it was broken.

Also, sanity-sec.sh test_10c is added to verify this patch.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I3c49a077039327fcbde87196f82db140f67a74d0
Reviewed-on: https://review.whamcloud.com/24397
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8672 tests: Fix error handling in replay-single test_89 74/22974/7
Abrarahmed Momin [Thu, 22 Feb 2018 16:50:16 +0000 (19:50 +0300)]
LU-8672 tests: Fix error handling in replay-single test_89

Update replay-single test_89() to error out on wait_mds_ost_sync and
wait_delete_completed timeout.

Correct error handling in wait_delete_completed_mds and
wait_delete_completed.

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Cray-bug-id: MRP-1680
Test-Parameters: trivial
Change-Id: I54e30221361e73a17ba857cb19b1efcc019b412f
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/22974
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>