Whamcloud - gitweb
fs/lustre-release.git
14 years ago b=16909 Suppress "changing the import ..." warning.
Brian Behlendorf [Fri, 15 Jan 2010 17:33:42 +0000 (09:33 -0800)]
b=16909 Suppress "changing the import ..." warning.

This warning will always be printed when the MDT reconnects to an
OST after the MDT is restarted.  There is nothing wrong here and
more importantly there is nothing the admin should do or care
about so I'm moving the warning to D_HA.

  Lustre: 9099:0:(llog_net.c:175:llog_receptor_accept())
  changing the import ffff810236ad8800 - ffff8102050a9800

14 years ago b=16909 Use INFO/WARN instead of WARN/ERROR for the slow messages.
Brian Behlendorf [Fri, 19 Feb 2010 21:46:32 +0000 (13:46 -0800)]
b=16909 Use INFO/WARN instead of WARN/ERROR for the slow messages.

We should use INFO/WARN instead of WARN/ERROR for the slow messages.
Not only is there no real error here but it fixes an annoying quirk
of the message formatting.  With the old levels you would see the
messages formatted differently based on the time.

  Lustre: lc1-OST0001: slow parent lock 289s due to heavy IO load
  LustreError: 0-0: lc1-OST0001: slow parent lock 324s due to heavy IO load

With the new levels things are more consistent.

  Lustre: lc1-OST0001: slow parent lock 289s due to heavy IO load
  Lustre: lc1-OST0001: slow parent lock 324s due to heavy IO load

14 years ago b=22385 Computing result of unsigned variable may < 0.
yangsheng [Tue, 6 Apr 2010 15:37:21 +0000 (23:37 +0800)]
b=22385 Computing result of unsigned variable may < 0.

i=johann
i=wangdi
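
The commit body does not show the affected code, so the following is only a generic sketch of the bug class named in the subject. The helper names `remaining` and `remaining_wrapping` are hypothetical and do not come from the Lustre tree. With unsigned operands a subtraction never yields a negative value; it wraps, so a later `< 0` check cannot fire.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: how much of `total`
 * is left after `used`, clamped at zero. Comparing before subtracting
 * avoids the unsigned wrap-around.
 */
uint64_t remaining(uint64_t total, uint64_t used)
{
	if (used >= total)
		return 0;
	return total - used;
}

/*
 * The problematic pattern, kept for comparison: when used > total the
 * result wraps to a very large value instead of going negative.
 */
uint64_t remaining_wrapping(uint64_t total, uint64_t used)
{
	return total - used;
}
```

Any check of the form `(a - b) < 0` on unsigned values is always false; compare the operands first or use a signed type that is wide enough.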

14 years ago bump version to 1.8.2.57
Johann Lombardi [Tue, 6 Apr 2010 10:47:22 +0000 (12:47 +0200)]
bump version to 1.8.2.57

14 years ago b=22252 allow multiple instances of the same nid in NID hash
Oleg Drokin [Tue, 6 Apr 2010 09:44:30 +0000 (11:44 +0200)]
b=22252 allow multiple instances of the same nid in NID hash

i=robert
i=johann

Multiple separate clients from the same NID (as with liblustre) are a
legitimate case, so we should allow multiple instances of the same NID in
the NID hash.

14 years ago b=22423 add regression test for reconnect flooding issue
Johann Lombardi [Tue, 6 Apr 2010 09:01:33 +0000 (11:01 +0200)]
b=22423 add regression test for reconnect flooding issue

i=dmitry

14 years ago b=22423 rely on pings to issue reconnects
Johann Lombardi [Tue, 6 Apr 2010 08:56:23 +0000 (10:56 +0200)]
b=22423 rely on pings to issue reconnects

i=nathan
i=dmitry

Don't wake up pinger on reconnect failures and rely on
regular pings to trigger the next reconnection.
Please note that the pinger already uses a smaller interval
if the import is disconnected.

14 years ago b=20615 print more debug info for timedout ZC-req
Liang Zhen [Tue, 6 Apr 2010 08:48:51 +0000 (10:48 +0200)]
b=20615 print more debug info for  timedout ZC-req

i=maxim
i=isaac

1. output more information for timed-out ZC-reqs and partially received connections
2. close the connection for a timed-out ZC-req
3. always send ZC_ACK on a non-blocking connection (BULK_IN)

14 years ago b=22307 remove lock acquisition during holding spinlock
hongchao.zhang [Thu, 1 Apr 2010 12:08:51 +0000 (20:08 +0800)]
b=22307 remove lock acquisition during holding spinlock

In ras_update, "lov_get_info" could be called while increasing the
readahead window. It tries to take the mutex "lov_lock" while the
spin_lock "ras_lock" is held, which causes a system lockup.

i=johann@sun.com
i=tom.wang@sun.com

14 years ago b=22301 lustre.lov error when backing up symlinks with extended attributes
Dmitry Zogin [Tue, 6 Apr 2010 00:08:28 +0000 (20:08 -0400)]
b=22301 lustre.lov error when backing up symlinks with extended attributes

 sanity test_17k created

 o=grev
 i=dmitry.zogin

14 years ago b=19919 Test case for verify pool works well on relative path.
yangsheng [Mon, 5 Apr 2010 16:45:49 +0000 (00:45 +0800)]
b=19919 Test case for verify pool works well on relative path.

i=johann

14 years ago b=22137 kernel oops at replay-single test_61d.
Dmitry Zogin [Fri, 2 Apr 2010 15:28:19 +0000 (11:28 -0400)]
b=22137 kernel oops at replay-single test_61d.

 replay-single.sh test_61d was modified to operate on the MGS when the
 MGS and MDS are separate.

 i=grev

14 years ago b=20278 ASSERTION(cli->cl_avail_grant >= 0) failed
Cliff White [Fri, 26 Mar 2010 05:42:51 +0000 (22:42 -0700)]
b=20278 ASSERTION(cli->cl_avail_grant >= 0) failed

i=tom.wang
i=robert.read

This patch tries to address several issues:
- osc_init_grant(): calculate avail_grant according to recovery status.
- osc_reconnect(): the requested grant should include cl_dirty.
- filter_grant(): besides server reboot, we should also grant the requested
  amount in case of a normal reconnect.
- round the grant amount up instead of down, otherwise the client could still
  end up with dirty > granted.
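
The last point above can be illustrated with a small sketch. The function names and the 4096-byte chunk are assumptions for illustration only; the real change is in the OSC and filter grant code, which is not shown in this log. Rounding down can leave the grant smaller than the dirty amount it is meant to cover, while rounding up cannot.

```c
#include <assert.h>
#include <stdint.h>

/* Round `bytes` down to a multiple of `chunk`: may fall short of `bytes`. */
uint64_t grant_round_down(uint64_t bytes, uint64_t chunk)
{
	assert(chunk > 0);
	return bytes - bytes % chunk;
}

/*
 * Round `bytes` up to a multiple of `chunk`, so the result always
 * covers `bytes`. Overflow near UINT64_MAX is not handled in this sketch.
 */
uint64_t grant_round_up(uint64_t bytes, uint64_t chunk)
{
	assert(chunk > 0);
	uint64_t rem = bytes % chunk;
	return rem == 0 ? bytes : bytes + (chunk - rem);
}
```

With 10000 dirty bytes and a 4096-byte chunk, rounding down yields 8192 (less than the dirty amount), while rounding up yields 12288.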

14 years ago b=20805 Use CNETERR in specific places in the portal's LNET driver
James Simmons [Fri, 2 Apr 2010 21:06:38 +0000 (23:06 +0200)]
b=20805 Use CNETERR in specific places in the portal's LNET driver

i=isaac
i=liang

14 years ago bump version to 1.8.2.56
Johann Lombardi [Fri, 2 Apr 2010 10:52:08 +0000 (12:52 +0200)]
bump version to 1.8.2.56

14 years ago b=22074 Incorrect triggering of synchronous IO by OSC
Dmitry Zogin [Wed, 31 Mar 2010 14:24:13 +0000 (10:24 -0400)]
b=22074 Incorrect triggering of synchronous IO by OSC

 Sanity.sh test_42e was added. Backport from 2.0

14 years ago b=22194 sanity-quota cleanups
Andrew Perepechko [Wed, 31 Mar 2010 14:06:43 +0000 (18:06 +0400)]
b=22194 sanity-quota cleanups

This is the missing part of the original patch (att29160)

i=ZhiYong Tian

14 years ago b=22108 include last created object in precreate slow case
Landen [Tue, 30 Mar 2010 08:27:10 +0000 (16:27 +0800)]
b=22108 include last created object in precreate slow case

i=andreas.dilger
i=inspection
i=landen

14 years ago bump version to 1.8.2.55 v1_8_2_55
Johann Lombardi [Fri, 26 Mar 2010 23:54:51 +0000 (00:54 +0100)]
bump version to 1.8.2.55

14 years ago b=21927 replay-vbr more 1.8 <-> 20 interop changes
Elena Gryaznova [Fri, 26 Mar 2010 17:59:38 +0000 (20:59 +0300)]
b=21927 replay-vbr more 1.8 <-> 20 interop changes

i=Mikhail.Pershin

14 years ago b=20373 don't do rep-ack if not created anything
Oleg Drokin [Thu, 25 Mar 2010 22:40:32 +0000 (18:40 -0400)]
b=20373 don't do rep-ack if not created anything

i=johann
i=Z

mds_open currently always puts a lock into a rep-ack regardless of whether
something was created. This is pointless and only creates needless contention.
The entire idea was to do this only for real creates, as recovery protection.

14 years ago b=22409 Spurious error messages from smp_processor_id() on preemptible kernel
Dmitry Zogin [Thu, 25 Mar 2010 16:53:32 +0000 (12:53 -0400)]
b=22409 Spurious error messages from smp_processor_id() on preemptible kernel

 Disable preemption by grabbing the lock in fs_trace_get_tcd() first.
 The function fs_trace_get_tcd() was moved up.

 o=andreas.dilger
 i=johann
 i=dmitry.zogin
 i=nathan.rutman

14 years ago bump version to 1.8.2.54
Johann Lombardi [Thu, 25 Mar 2010 13:51:03 +0000 (14:51 +0100)]
bump version to 1.8.2.54

14 years ago b=22194 tiny sanity-quota cleanup
Andrew Perepechko [Wed, 24 Mar 2010 16:24:32 +0000 (19:24 +0300)]
b=22194 tiny sanity-quota cleanup

i=ZhiYong Tian

14 years ago b=21500 2.6.31-fc12 patchless client support.
yangsheng [Wed, 24 Mar 2010 16:24:48 +0000 (00:24 +0800)]
b=21500 2.6.31-fc12 patchless client support.

i=brian
i=rahul
i=adilger

14 years ago b=17258 give the BUILD_TESTS love to ldiskfs as well
Brian J. Murrell [Wed, 24 Mar 2010 16:04:57 +0000 (12:04 -0400)]
b=17258 give the BUILD_TESTS love to ldiskfs as well

Because ldiskfs re-uses so (too?) much of the lustre auto* goop we need
to stub the BUILD_TESTS assignment into its autoMakefile.am, even though
it's completely unused/unneeded there.

i=wangyb
i=yangsheng

14 years ago b=22181 interval_erase() fix
Vitaly Fertman [Wed, 24 Mar 2010 15:06:11 +0000 (18:06 +0300)]
b=22181 interval_erase() fix

i=green
i=johann

interval_erase() calls update_maxhigh() properly when child == NULL

14 years ago b=21945 Adding WIRE_ATTR attribute to LNET types
Maxim Patlasov [Wed, 24 Mar 2010 12:05:25 +0000 (15:05 +0300)]
b=21945 Adding WIRE_ATTR attribute to LNET types

i=liang

LST nodes on different platforms might not communicate well because some
LNET structures that traverse the network lack the WIRE_ATTR attribute.
The patch fixes the problem by adding WIRE_ATTR where needed.

14 years ago b=22069 replace server_major_version with connect_flags for quota utils interoperability
Fan Yong [Wed, 24 Mar 2010 03:38:42 +0000 (11:38 +0800)]
b=22069 replace server_major_version with connect_flags for quota utils interoperability

replace server_major_version with connect_flags for quota utils interoperability.

i=johann
i=landen

14 years ago b=22233 do_div arguments not cross-platform compatible
Wang Di [Tue, 23 Mar 2010 12:53:07 +0000 (08:53 -0400)]
b=22233 do_div arguments not cross-platform compatible

o=Christopher Morrone
i=adilger
i=wangdi

14 years ago b=22177 fix error message in mds_mfd_close()
Johann Lombardi [Tue, 23 Mar 2010 15:00:13 +0000 (16:00 +0100)]
b=22177 fix error message in mds_mfd_close()

i=adilger

Fix error messages in mds_mfd_close() since it is now
legitimate to have i_nlink = 1 for dirs in /PENDING.

14 years ago Revert "b=20433 decrease the usage of memory on clients."
Johann Lombardi [Mon, 22 Mar 2010 22:14:45 +0000 (23:14 +0100)]
Revert "b=20433 decrease the usage of memory on clients."

Suspected to cause bug 22307, so revert temporarily.

This reverts commit 841fbac6378df39e357342d86d9380e6676c1faf.

14 years ago b=22307 rate limit dlm debug message in ll_inode_from_lock()
Johann Lombardi [Mon, 22 Mar 2010 22:01:55 +0000 (23:01 +0100)]
b=22307 rate limit dlm debug message in ll_inode_from_lock()

i=oleg
i=dmitry

14 years ago b=22306 more sanity-quota 18 <-> 20 interop changes
Elena Gryaznova [Fri, 19 Mar 2010 19:27:44 +0000 (22:27 +0300)]
b=22306 more sanity-quota 18 <-> 20 interop changes

i=Yong.Fan
i=Johann.Lombardi

skip test_15, test_16
test_2, test_25 mds/mdt device fix

14 years ago b=22386 quota_save_version() fix: remove wildcard from conf_param
Elena Gryaznova [Fri, 19 Mar 2010 16:02:24 +0000 (19:02 +0300)]
b=22386 quota_save_version() fix: remove wildcard from conf_param

i=Andrew.Perepechko

test_22 18 <-> 20 interop: skip quota v1 testing

14 years ago b=22327 "lfs df" does not print stats for all mountpoints
Dmitry Zogin [Thu, 18 Mar 2010 18:48:05 +0000 (14:48 -0400)]
b=22327 "lfs df" does not print stats for all mountpoints

 Added sanityN.sh test_24b
 i=andreas.dilger

14 years ago b=22327 "lfs df" does not print stats for all mountpoints
Dmitry Zogin [Tue, 16 Mar 2010 18:57:20 +0000 (14:57 -0400)]
b=22327 "lfs df" does not print stats for all mountpoints

  Print all mounted lustre filesystems with "lfs df"

 o=adilger
 i=simmonsja
 i=dmitry.zogin

14 years ago Updated version for build 03. v1_8_2_53
Terry Rutledge [Fri, 19 Mar 2010 15:43:45 +0000 (09:43 -0600)]
Updated version for build 03.

14 years ago b=21957 debug_mb not correctly initialized on newer kernels (2.6.31)
Rahul Deshmukh [Fri, 19 Mar 2010 09:05:29 +0000 (14:35 +0530)]
b=21957 debug_mb not correctly initialized on newer kernels (2.6.31)

i=adilger
i=rread

Fixed the debug_mb initialization problem for kernel 2.6.31

14 years ago b=19919 support relative path in llapi_search_fsname()
yangsheng [Fri, 19 Mar 2010 06:49:24 +0000 (14:49 +0800)]
b=19919 support relative path in llapi_search_fsname()

i=adilger
i=johann

Use realpath() to provide absolute pathname.
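
A minimal sketch of the approach named above, assuming a hypothetical wrapper `to_absolute_path()` that is not part of liblustreapi. It only shows how POSIX `realpath()` turns a relative path into an absolute, canonical one before any further lookup.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: resolve `path` (possibly relative) to an
 * absolute, canonical path. Returns a malloc()ed string that the
 * caller must free(), or NULL on error with errno set by realpath().
 */
char *to_absolute_path(const char *path)
{
	assert(path != NULL);
	/* A NULL buffer asks realpath() to allocate the result (POSIX.1-2008). */
	return realpath(path, NULL);
}
```

The resolved string can then be compared against mount points, which are always recorded as absolute paths.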

14 years ago b=21486 fix for truncated reply buffer
Liang Zhen [Fri, 19 Mar 2010 08:26:09 +0000 (01:26 -0700)]
b=21486 fix for truncated reply buffer

i=eeb
i=ericm

The reply buffer could be referenced by reply_in_callback after being released.

14 years ago b=21251 Add lustre/tests/ha.sh
Li Wei [Fri, 19 Mar 2010 02:17:34 +0000 (10:17 +0800)]
b=21251 Add lustre/tests/ha.sh

This is a simple failover test script that works with configurations
that are controlled by a CRM and have multiple targets per server.

i=robert.read
i=grev

14 years ago b=21991 sanity test_53, test_67* interop 18 <-> 20 fix
Elena Gryaznova [Thu, 18 Mar 2010 20:28:51 +0000 (23:28 +0300)]
b=21991 sanity test_53, test_67* interop 18 <-> 20 fix

i=Andreas.Dilger

14 years ago b=22180 fix the incorrect MDSDEV check
Elena Gryaznova [Wed, 17 Mar 2010 11:59:49 +0000 (14:59 +0300)]
b=22180 fix the incorrect MDSDEV check

i=Nathan.Rutman

new t-f is_blkdev ()
check MDSDEV on mds instead of local client
test_17, test_18 changes for config mgs and mds are not combined

14 years ago b=22194 Add quiet -q option to lfs quota
Andrew Perepechko [Tue, 16 Mar 2010 17:02:44 +0000 (20:02 +0300)]
b=22194 Add quiet -q option to lfs quota

o=Joseph Herring (LLNL)
i=Johann Lombardi
i=ZhiYong Tian

14 years ago b=21619 hash MEs on RDMA portal
Liang Zhen [Tue, 16 Mar 2010 20:14:58 +0000 (13:14 -0700)]
b=21619 hash MEs on RDMA portal

i=isaac
i=maxim

The RDMA portal can have a very long ME list on the client side, and the
long list searches trigger soft lockups. Hashing the MEs on the RDMA portal
resolves this problem.
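
The idea can be sketched as follows. This is not the LNET implementation: the types, the bucket count and the function names are all hypothetical, and the sketch only handles exact match bits, whereas real match entries may also carry ignore bits. It shows how hashing entries into buckets replaces one long list walk with a walk of a single short chain.

```c
#include <stddef.h>
#include <stdint.h>

#define ME_HASH_BUCKETS 64u	/* hypothetical size; must be a power of two */

struct match_entry {
	uint64_t match_bits;
	struct match_entry *next;	/* chain within one bucket */
};

struct me_table {
	struct match_entry *bucket[ME_HASH_BUCKETS];
};

/* Fold the high half into the low half, then mask to a bucket index. */
unsigned int me_hash(uint64_t match_bits)
{
	return (unsigned int)((match_bits ^ (match_bits >> 32)) &
			      (ME_HASH_BUCKETS - 1));
}

/* Insert at the head of the bucket chain: O(1). */
void me_insert(struct me_table *t, struct match_entry *me)
{
	unsigned int i = me_hash(me->match_bits);

	me->next = t->bucket[i];
	t->bucket[i] = me;
}

/* Search only the one bucket that can hold `match_bits`. */
struct match_entry *me_find(const struct me_table *t, uint64_t match_bits)
{
	struct match_entry *me;

	for (me = t->bucket[me_hash(match_bits)]; me != NULL; me = me->next)
		if (me->match_bits == match_bits)
			return me;
	return NULL;
}
```

With N entries spread evenly, a lookup visits roughly N / ME_HASH_BUCKETS entries instead of up to N.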

14 years ago b=21259 udev rule to set /dev/obd perms 666
Christopher J. Morrone [Tue, 16 Mar 2010 15:12:19 +0000 (11:12 -0400)]
b=21259 udev rule to set /dev/obd perms 666

Provide Udev rules file for Lustre, so that /dev/obd permissions are now 666.

o=Christopher Morrone
i=johann
i=dmitry.zogin

14 years ago b=22301 lustre.lov error when backing up symlinks with extended attributes
Dmitry Zogin [Tue, 16 Mar 2010 14:59:39 +0000 (10:59 -0400)]
b=22301 lustre.lov error when backing up symlinks with extended attributes

 Improved logic in ll_listxattr()

 i=tom.wang
 i=dmitry.zogin

14 years ago b=21991 interop 18 <-> 20 changes
Elena Gryaznova [Tue, 16 Mar 2010 14:49:30 +0000 (17:49 +0300)]
b=21991 interop 18 <-> 20 changes

i=Andreas.Dilger

14 years ago b=22187 Test case for setfattr without a value parameter.
yangsheng [Tue, 16 Mar 2010 14:03:57 +0000 (22:03 +0800)]
b=22187 Test case for setfattr without a value parameter.

i=johann

14 years ago b=22187 properly handle null value for setattr -n lustre.lov
yangsheng [Tue, 16 Mar 2010 13:58:48 +0000 (21:58 +0800)]
b=22187 properly handle null value for setattr -n lustre.lov

i=adilger
i=johann

Running "setfattr -n trusted.lov ." causes a NULL dereference
in ll_setxattr() because it does not check whether "value" is NULL.
This command now resets to the default striping when
executed against a directory.
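
A minimal sketch of the guard described above. The struct and function names are hypothetical stand-ins and this is not the body of ll_setxattr(); it only shows checking for a NULL value before touching it and treating that case as a reset to the default.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a striping setting. */
struct stripe_conf {
	unsigned int stripe_count;
	int is_default;
};

/*
 * Hypothetical handler: a NULL `value` means "no value supplied" and
 * resets the setting to its default instead of being dereferenced.
 * Returns 0 on success or a negative errno value.
 */
int set_stripe_attr(struct stripe_conf *conf, const void *value, size_t size)
{
	if (conf == NULL)
		return -EINVAL;

	if (value == NULL) {
		conf->stripe_count = 0;
		conf->is_default = 1;
		return 0;
	}

	if (size != sizeof(conf->stripe_count))
		return -EINVAL;

	memcpy(&conf->stripe_count, value, size);
	conf->is_default = 0;
	return 0;
}
```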

14 years ago b=21815 tiny fix for 1.8<->2.0 interop testing
Johann Lombardi [Tue, 16 Mar 2010 13:41:35 +0000 (14:41 +0100)]
b=21815 tiny fix for 1.8<->2.0 interop testing

i=grev

14 years ago b=22319 skip statahead for NFSCLIENT
Elena Gryaznova [Tue, 16 Mar 2010 13:25:01 +0000 (16:25 +0300)]
b=22319 skip statahead for NFSCLIENT

i=Johann.Lombardi

14 years ago b=22352 Kernel update for SLES9 2.6.5-7.322.
yangsheng [Tue, 16 Mar 2010 13:03:38 +0000 (21:03 +0800)]
b=22352 Kernel update for SLES9 2.6.5-7.322.

i=johann

14 years ago b=22194 lfs quota output cleanup
Andrew Perepechko [Mon, 15 Mar 2010 11:55:38 +0000 (14:55 +0300)]
b=22194 lfs quota output cleanup

Suppress standard output in error cases

o=Joseph Herring  (LLNL)
i=Johann Lombardi
i=Andrew Perepechko

14 years ago b=22235 llapi_uuid_match() prints bogus error message on upgraded filesystem
Dmitry Zogin [Fri, 12 Mar 2010 22:26:57 +0000 (17:26 -0500)]
b=22235 llapi_uuid_match() prints bogus error message on upgraded filesystem

 1. Increase the "lfs df" column width to handle TB sized devices cleanly
 2. Allow matching OST names without trailing _UUID
 3. Allow negating the "--obd" option to "lfs find"
 4. Remove duplicate code in mntdf() iterating over MDTs/OSTs. Handle errors

 i=nathan.rutman
 i=dmitry.zogin

14 years ago b=22306 t-f: interop 18 <-> 20 ENABLE_QUOTA changes
Elena Gryaznova [Fri, 12 Mar 2010 17:33:06 +0000 (20:33 +0300)]
b=22306 t-f: interop 18 <-> 20 ENABLE_QUOTA changes

i=Yong.Fan

14 years ago b=21927 test_80 interop 18 <-> 20 fix
Elena Gryaznova [Fri, 12 Mar 2010 16:44:06 +0000 (19:44 +0300)]
b=21927 test_80 interop 18 <-> 20 fix

i=Mikhail.Pershin

14 years ago b=21927 test_61 interop 18 <-> 20 fix
Elena Gryaznova [Fri, 12 Mar 2010 16:34:13 +0000 (19:34 +0300)]
b=21927 test_61 interop 18 <-> 20 fix

i=Mikhail.Pershin

interop 18 <-> 20 changes
improve awk pattern to get objid correctly

14 years ago b=21991 interop 18 <-> 20 fix: skip obdecho test
Elena Gryaznova [Fri, 12 Mar 2010 16:20:39 +0000 (19:20 +0300)]
b=21991 interop 18 <-> 20 fix: skip obdecho test

i=Andreas.Dilger

14 years ago b=21815 move test to replay-single
Johann Lombardi [Tue, 16 Mar 2010 13:20:40 +0000 (14:20 +0100)]
b=21815 move test to replay-single

14 years ago bump version to 1.8.2.52 v1_8_2_52
Johann Lombardi [Fri, 12 Mar 2010 14:47:12 +0000 (15:47 +0100)]
bump version to 1.8.2.52

14 years ago b=22241 call sync instead of fsync on local cancel to reduce stack usage
Johann Lombardi [Fri, 12 Mar 2010 14:28:41 +0000 (15:28 +0100)]
b=22241 call sync instead of fsync on local cancel to reduce stack usage

i=oleg
i=andreas

sync_on_lock_cancel is needed for recovery when async journal is enabled,
but we actually just need to make sure that metadata blocks have hit the
journal, so doing a fs sync should be enough and should consume less
stack (just create an empty handle and commit it).

14 years ago b=21686 simplify client disconnect code on server side
Johann Lombardi [Fri, 12 Mar 2010 14:17:49 +0000 (15:17 +0100)]
b=21686 simplify client disconnect code on server side

o=liang
i=johann
i=shadow

attach 25564
This patch was reverted because we were chasing some regression.
It is now safe to re-apply.

14 years ago b=20837 incomplete test output for ost-pools
Manoj Joseph [Thu, 11 Mar 2010 22:39:56 +0000 (15:39 -0700)]
b=20837 incomplete test output for ost-pools

i=nathan.rutman
i=grev

Instead of creating many 10M files to fill the OST, create 9 files
of size OST_SIZE/10 each.

14 years ago b=19917 Repeated atomic allocation failures
Dmitry Zogin [Thu, 11 Mar 2010 17:20:31 +0000 (12:20 -0500)]
b=19917 Repeated atomic allocation failures

 Comment change.

 o=he.huang
 i=dmitry.zogin

14 years ago b=22035 workaround patch
hongchao.zhang [Mon, 8 Mar 2010 10:15:12 +0000 (18:15 +0800)]
b=22035 workaround patch

disable the per-thread data (current->journal_info)
containing the lock info during I/O to work around
the issue for the short term

i=hongchao.zhang@sun.com

14 years ago b=21927 recovery-small, replay-single 18 <-> 20 interop fix
Elena Gryaznova [Wed, 10 Mar 2010 19:12:26 +0000 (22:12 +0300)]
b=21927 recovery-small, replay-single 18 <-> 20 interop fix

i=Mikhail.Pershin

14 years ago b=21927 replay-vbr 18 <-> 20 interop fix
Elena Gryaznova [Wed, 10 Mar 2010 19:03:54 +0000 (22:03 +0300)]
b=21927 replay-vbr 18 <-> 20 interop fix

i=Mikhail.Pershin

14 years ago b=22194 Print a dash in empty lfs quota grace columns
Andrew Perepechko [Wed, 10 Mar 2010 18:59:50 +0000 (21:59 +0300)]
b=22194 Print a dash in empty lfs quota grace columns

Polish lfs quota output for easier processing with awk/sed

o=Christopher Morrone (LLNL)
i=Andrew Perepechko
i=ZhiYong Tian

14 years ago b=21927 t-f: use the global variables for facets mount points
Elena Gryaznova [Wed, 10 Mar 2010 18:49:36 +0000 (21:49 +0300)]
b=21927 t-f: use the global variables for facets mount points

i=Andreas.Dilger

14 years ago b=21938 rq_invalid_rqset should be a bitfield
Johann Lombardi [Wed, 10 Mar 2010 22:50:44 +0000 (23:50 +0100)]
b=21938 rq_invalid_rqset should be a bitfield

14 years ago b=21815 Test case for clear stale nid-stats hash.
yangsheng [Wed, 10 Mar 2010 16:44:32 +0000 (00:44 +0800)]
b=21815 Test case for clear stale nid-stats hash.

i=johann
i=wang.yibin

14 years ago b=19933 control DCACHE_LUSTRE_INVALID flag with MDS_INODELOCK_LOOKUP lock
Fan Yong [Wed, 10 Mar 2010 16:11:51 +0000 (00:11 +0800)]
b=19933 control DCACHE_LUSTRE_INVALID flag with MDS_INODELOCK_LOOKUP lock

"DCACHE_LUSTRE_INVALID" is controlled by the "MDS_INODELOCK_LOOKUP" lock,
which corresponds to "IT_LOOKUP"; do not skip invalidation for other intents.

i=robert.read
i=johann

14 years ago b=20997 Cannot send after transport shutdown
Dmitry Zogin [Wed, 10 Mar 2010 15:13:45 +0000 (10:13 -0500)]
b=20997 Cannot send after transport shutdown

 Clear imp_vbr_failed flag upon eviction

 i=robert.read
 i=alexander.zarochentsev

14 years ago b=21938 use req->rq_set itself during recovery
hongchao.zhang [Sun, 7 Mar 2010 06:52:55 +0000 (14:52 +0800)]
b=21938 use req->rq_set itself during recovery

During recovery, use req->rq_set itself to replay the request
instead of ptlrpcd_recovery_pc.

i=tappro@sun.com
i=johann@sun.com

14 years ago b=22069 introduce server major version for b1_8 and b2_0 quota utils interoperability
Fan Yong [Wed, 10 Mar 2010 02:51:44 +0000 (10:51 +0800)]
b=22069 introduce server major version for b1_8 and b2_0 quota utils interoperability

Introduce server major version for b1_8 and b2_0 quota utils interoperability.

i=andrew.perepechko
i=robert.read

14 years ago b=21983 Use CFS_ALLOC_IO instead of _STD in llap_from_page_with_lockh
Dmitry Zogin [Tue, 9 Mar 2010 14:48:23 +0000 (09:48 -0500)]
b=21983 Use CFS_ALLOC_IO instead of _STD in llap_from_page_with_lockh

During an ll_readahead under ll_readpage, we have seen the
OBD_SLAB_ALLOC hang under ldlm_pools_shrink when trying to lock
a page that is already locked by the readahead code.

Using CFS_ALLOC_IO instead of CFS_ALLOC_STD will prevent
ldlm_pools_shrink from actually freeing slab, so the call path
that blocks indefinitely can never happen.

 i=adilger
 i=dmitry.zogin
 i=johann

14 years ago b=22177 inc nlink by 2 instead of 1 in mds_orphan_add_link()
Johann Lombardi [Fri, 5 Mar 2010 22:17:53 +0000 (23:17 +0100)]
b=22177 inc nlink by 2 instead of 1 in mds_orphan_add_link()

i=adilger
i=dmitry

Fix regression introduced by 19640.
ext3_inc_count() can reset nlink to 1 when the directory is indexed and
inode->i_nlink == 2. Work around the problem by incrementing nlink by 2
instead of 1.

14 years ago b=17591 sanity-benchmark s/MOUNT/DIR/ cleanup
Elena Gryaznova [Fri, 5 Mar 2010 20:07:53 +0000 (23:07 +0300)]
b=17591 sanity-benchmark s/MOUNT/DIR/ cleanup

i=Robert.Read

14 years ago b=22169 t-f:start_client_loads () wait the background threads to start
Elena Gryaznova [Fri, 5 Mar 2010 19:54:25 +0000 (22:54 +0300)]
b=22169 t-f:start_client_loads () wait the background threads to start

i=Robert.Read

14 years ago b=22169 t-f cleanup: new do_nodev (), do_nodesv () functions
Elena Gryaznova [Fri, 5 Mar 2010 19:51:38 +0000 (22:51 +0300)]
b=22169 t-f cleanup: new do_nodev (), do_nodesv () functions

i=Robert.Read

14 years ago b=22095 MDS operations hang when issued with lfs setstripe on a degraded OST
Dmitry Zogin [Thu, 4 Mar 2010 18:03:14 +0000 (13:03 -0500)]
b=22095 MDS operations hang when issued with lfs setstripe on a degraded OST

 Change the locking order in mds_lookup()

 o=jfilizetti@sms-fed.com
 i=johann
 i=adilger

14 years ago b=21900 ost-pools test_25: FAIL
Dmitry Zogin [Thu, 4 Mar 2010 16:16:05 +0000 (11:16 -0500)]
b=21900 ost-pools test_25: FAIL

 Make ost-pools test_25 more robust

 i=manoj.joseph

14 years ago b=22127 lustre 1.8.2 lfs permissions Patch corrects cfs_curproc_euid() logic.
Dmitry Zogin [Thu, 4 Mar 2010 02:59:50 +0000 (21:59 -0500)]
b=22127 lustre 1.8.2 lfs permissions Patch corrects cfs_curproc_euid() logic.

 o=bschubert@ddn.com
 i=oleg.drokin
 i=johann

14 years ago b=21066 ost-pools test_14 should not assert that files are from a specific OST
Manoj Joseph [Wed, 3 Mar 2010 22:32:21 +0000 (15:32 -0700)]
b=21066 ost-pools test_14 should not assert that files are from a specific OST

Round-robin allocation test should not assert that files are allocated
in strict round-robin fashion.

i=nathan.rutman
i=grev

14 years ago b=17258 fix error with make rpms after configure --disable-tests
Brian J. Murrell [Wed, 3 Mar 2010 16:51:40 +0000 (11:51 -0500)]
b=17258 fix error with make rpms after configure --disable-tests

If one configures lustre with "--disable-tests", a subsequent "make rpms"
will fail because it still tries to package the lustre-tests RPM.
Fixing this provided the opportunity to fix another wart: the configure
arguments used to be subst'ed into lustre.spec. Now they are
passed as a value with "--define 'configure_args ...'" when calling rpmbuild.

14 years ago b=21726 stop waiting for next replay transno if shutdown
hongchao.zhang [Sun, 28 Feb 2010 23:30:04 +0000 (07:30 +0800)]
b=21726 stop waiting for next replay transno if shutdown

If the system is shutting down, wake up the service thread that is blocked
waiting for the next replay transno during recovery, so that all the
references held by queued requests can be dropped and the device
can be stopped.

i=hongchao.zhang@sun.com
i=tappro@sun.com

14 years ago b=20101 lfs getstripe -d test for sanity 27w
yangsheng [Tue, 2 Mar 2010 15:32:26 +0000 (23:32 +0800)]
b=20101 lfs getstripe -d test for sanity 27w

i=adilger
i=robert

14 years ago b=19873 sanity: Memory leaks detected, FAILed to clean up
Dmitry Zogin [Tue, 2 Mar 2010 13:58:41 +0000 (08:58 -0500)]
b=19873 sanity: Memory leaks detected, FAILed to clean up

 Patch backport from bz 20650, attachment 26416 - introduce lprocfs counter on IRQs
 The lc_sum_irq counter is used to calculate memory freed on the interrupt.

 i=adilger
 i=andrew.perepechko

14 years ago bump version to 1.8.2.51
Johann Lombardi [Mon, 1 Mar 2010 23:03:59 +0000 (00:03 +0100)]
bump version to 1.8.2.51

14 years ago b=17197 fix typo for OBD_CALC_STRIPE_RPC_END_ALIGN fix typo in OBD_CALC_STRIPE_RPC_EN...
Wang Di [Mon, 1 Mar 2010 16:56:20 +0000 (11:56 -0500)]
b=17197 fix typo for OBD_CALC_STRIPE_RPC_END_ALIGN

Fix typo in OBD_CALC_STRIPE_RPC_END_ALIGN and do not align to 1M for stride readahead.

i=ericm
i=johann

14 years ago b=21816 return approximate block/inode usage when OSTs are down
Andrew Perepechko [Mon, 1 Mar 2010 16:27:44 +0000 (19:27 +0300)]
b=21816 return approximate block/inode usage when OSTs are down

Really return approximate block/inode usage when OSTs are down.
The old version erroneously skipped oqctl copying on error which
prevented this from working properly.

i=Johann Lombardi
i=ZhiYong Tian

14 years ago b=20989 lov_merge_lvb()) ASSERTION(spin_is_locked(&lsm->lsm_lock)) failed
Dmitry Zogin [Mon, 1 Mar 2010 13:39:47 +0000 (08:39 -0500)]
b=20989 lov_merge_lvb()) ASSERTION(spin_is_locked(&lsm->lsm_lock)) failed

 Protect lli->lli_smd pointer updates with lli->lli_lock.

 o=oleg.drokin
 i=johann
 i=dmitry.zogin

14 years ago b=21815 Avoid operating lustre-hash internal structures directly.
yangsheng [Mon, 1 Mar 2010 13:24:33 +0000 (21:24 +0800)]
b=21815 Avoid operating lustre-hash internal structures directly.

i=johann
i=nathan

14 years ago b=22097 mount.lustre fails to pass some options to mount()
Johann Lombardi [Fri, 26 Feb 2010 21:38:02 +0000 (22:38 +0100)]
b=22097 mount.lustre fails to pass some options to mount()

i=yangsheng
i=dmitry

14 years ago b=18649 set wait_recovery_complete() MAX value to max recovery time estimated
Elena Gryaznova [Fri, 26 Feb 2010 19:12:28 +0000 (22:12 +0300)]
b=18649 set wait_recovery_complete() MAX value to max recovery time estimated

i=Mikhail.Pershin

14 years ago b=21992 sanity-quota interop: proc path fix for 2.0 servers
Elena Gryaznova [Fri, 26 Feb 2010 16:19:30 +0000 (19:19 +0300)]
b=21992 sanity-quota interop: proc path fix for 2.0 servers

i=Johann.Lombardi

14 years ago b=21255 parallel-scale statahead test fix
Elena Gryaznova [Fri, 26 Feb 2010 16:06:28 +0000 (19:06 +0300)]
b=21255 parallel-scale statahead test fix

i=Vladimir.Saveliev
i=Andrew.Perepechko

use mpi for create/delete files instead of createmany and rm

14 years ago b=21380 make dist seems to exclude the "darwin" bits
Brian J. Murrell [Thu, 25 Feb 2010 17:59:50 +0000 (12:59 -0500)]
b=21380 make dist seems to exclude the "darwin" bits

Include all of the darwin bits in the distribution tarball created with
make dist.

i=adilger