Whamcloud - gitweb
cliffw [Thu, 13 Oct 2005 22:12:10 +0000 (22:12 +0000)]
b=9508
r=adilger@clusterfs.com
Reverted 2.4 kernels
cliffw [Thu, 13 Oct 2005 21:53:49 +0000 (21:53 +0000)]
b=9508
r=adilger@clusterfs.com
When both gcc33 and gcc32 are available, we should use gcc33
nikita [Thu, 13 Oct 2005 14:39:20 +0000 (14:39 +0000)]
move misplaced comment.
ericm [Thu, 13 Oct 2005 05:23:42 +0000 (05:23 +0000)]
branch: b1_4
land b1_4_xattr: support manipulating user extended attributes.
nikita [Wed, 12 Oct 2005 20:46:16 +0000 (20:46 +0000)]
cleanup llap_from_page():
- add explicit LLAP_FROM_REMOVEPAGE
- make llap_from_page() static
- fix inverted condition in llap_from_page().
b=5047
r=adilger
nathan [Wed, 12 Oct 2005 20:45:30 +0000 (20:45 +0000)]
Branch b1_4
b=9445
nikita [Wed, 12 Oct 2005 19:02:39 +0000 (19:02 +0000)]
remove unneeded conditional compilation wrappers.
nic [Wed, 12 Oct 2005 18:11:47 +0000 (18:11 +0000)]
b=7047
p=adilger
r=nic
fix typo that was preventing lconf --abort_recovery from working
nic [Wed, 12 Oct 2005 17:58:10 +0000 (17:58 +0000)]
b=7047
p=adilger
r=nic
fix typo that was preventing lconf --abort_recovery from working
nikita [Wed, 12 Oct 2005 10:59:38 +0000 (10:59 +0000)]
typo fix.
nikita [Wed, 12 Oct 2005 10:55:00 +0000 (10:55 +0000)]
Add locking to provide consistency between kms and lsm.
b=5047
r=nikita
r=adilger
nic [Mon, 10 Oct 2005 23:59:13 +0000 (23:59 +0000)]
make sure we actually build the drivers...
nic [Mon, 10 Oct 2005 22:16:12 +0000 (22:16 +0000)]
add qsnet patch for 2.6-rhel4
nic [Mon, 10 Oct 2005 20:22:44 +0000 (20:22 +0000)]
update to latest update from Suse.
nikita [Mon, 10 Oct 2005 06:52:34 +0000 (06:52 +0000)]
check returned value
nikita [Mon, 10 Oct 2005 06:36:45 +0000 (06:36 +0000)]
liblustre/tests/sanity.c: add test 51 to test for regression in
ldlm_cli_enqueue() introduced by 7311 fix.
nikita [Sun, 9 Oct 2005 20:32:40 +0000 (20:32 +0000)]
Fix wrong assertion added to ldlm_cli_enqueue() by patch from bug 7311. Also
fix a few outdated references in comments.
b=7311
r=adilger
nathan [Mon, 3 Oct 2005 19:31:27 +0000 (19:31 +0000)]
b=3289
r=adilger
report recovery time remaining in /proc/.../recovery
adilger [Sat, 1 Oct 2005 06:09:53 +0000 (06:09 +0000)]
Branch b1_4
Description: if a client is started while the MDS is down, mount hangs in ptlrpc_queue_wait
Details : Having an LWI_INTR() wait event (interruptible, but no timeout)
will wait indefinitely in ptlrpc_queue_wait->l_wait_event() after
ptlrpc_import_delayed_req() because we didn't check if the
request was interrupted, and we also didn't break out of the
event loop if there was no timeout.
__l_wait_event() changes match those recently made in HEAD.
b=7184
r=devesh
nathan [Fri, 30 Sep 2005 22:40:52 +0000 (22:40 +0000)]
b=9445
r=adilger
remove mds and client cleanup logs
adilger [Fri, 30 Sep 2005 18:19:13 +0000 (18:19 +0000)]
Branch b1_4
Update build version to 1.4.5.7.
nikita [Fri, 30 Sep 2005 10:54:45 +0000 (10:54 +0000)]
reorder changelog entries: new entries go to the end of current release list
nikita [Fri, 30 Sep 2005 10:51:05 +0000 (10:51 +0000)]
remove assertion that does not buy us much, while potentially hampering interoperability
adilger [Fri, 30 Sep 2005 10:37:05 +0000 (10:37 +0000)]
Branch b1_4
Shouldn't have been committed.
b=7342
adilger [Fri, 30 Sep 2005 10:26:10 +0000 (10:26 +0000)]
Branch b1_4
Description: bind OST threads to NUMA nodes to improve performance
Details : all OST threads are uniformly bound to CPUs on a single NUMA
node and do their allocations there to localize memory access
b=7342
adilger [Fri, 30 Sep 2005 10:10:20 +0000 (10:10 +0000)]
Branch b1_4
Use actual page size instead of hard-coded 4096 bytes.
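
A minimal userspace sketch of the idea in the entry above, assuming a POSIX system: query the page size at run time rather than assuming 4096. The helper names runtime_page_size() and page_align_down() are hypothetical and are not taken from the Lustre tree (kernel code would use PAGE_SIZE instead).

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: ask the system for its page size instead of
 * hard-coding 4096. */
static unsigned long runtime_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);

    assert(sz > 0);
    return (unsigned long)sz;
}

/* Hypothetical helper: round an offset down to the start of its page. */
static unsigned long page_align_down(unsigned long offset)
{
    unsigned long sz = runtime_page_size();

    return offset - offset % sz;
}
```

On a machine with 64KB pages, a hard-coded 4096 would compute alignments that fall in the middle of a real page, which is the kind of mismatch this sketch avoids.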
nathan [Fri, 30 Sep 2005 00:34:22 +0000 (00:34 +0000)]
Branch b1_4
fix for nonexistent modules.conf
adilger [Thu, 29 Sep 2005 23:43:28 +0000 (23:43 +0000)]
Branch b1_4
Description: lconf did not handle in-kernel recovery with LDAP properly
Details : lconf/LustreDB get_refs() is searching the wrong namespace
b=6163
adilger [Thu, 29 Sep 2005 23:22:27 +0000 (23:22 +0000)]
Branch b1_4
Remove left-over function (replaced by ldlm_glimpse_ast() in latest patch).
b=7311
adilger [Thu, 29 Sep 2005 21:29:30 +0000 (21:29 +0000)]
Branch b1_4
Description: unable to set striping with a starting offset beyond OST 160
Details : llapi_create_file() incorrectly limited the starting stripe
index to the maximum single-file stripe count.
b=9440
r=behlendo
nathan [Thu, 29 Sep 2005 15:34:38 +0000 (15:34 +0000)]
b=9428
r=adilger
at initial connect, try all failover servers quickly
nikita [Thu, 29 Sep 2005 13:20:56 +0000 (13:20 +0000)]
oops... revert chunk committed by mistake (some local debugging stuff).
nikita [Thu, 29 Sep 2005 12:58:24 +0000 (12:58 +0000)]
Latest OST-side locking with connection flags.
b=7311
r=adilger
adilger [Thu, 29 Sep 2005 06:37:35 +0000 (06:37 +0000)]
Branch b1_4
Description: server may evict liblustre clients accessing contended locks
Details : if a client is granted a lock or receives a lock completion AST
with a blocking AST pending it would not reply to the AST for
LDLM_FL_CANCEL_ON_BLOCK locks causing the server to time out on
the AST (it only cancels when sending an explicit blocking AST).
If enough such locks were processed it would cause clients to
be evicted. It now replies to such ASTs and cancels when done.
b=9352, b=7313
r=nikita, bogl
adilger [Wed, 28 Sep 2005 18:26:07 +0000 (18:26 +0000)]
Branch b1_4
Remove last vestiges of "mgmt" and "mgmt_cli", which were never used. The
lustre/mgmt directory has been an empty shell for a long time.
The new mountconfig code uses "mgc" and "mgs".
adilger [Tue, 27 Sep 2005 23:46:59 +0000 (23:46 +0000)]
Branch b1_4
Description: MDS may oops in groups_free()
Details : in rare race conditions a newly allocated group_info struct is
freed again, and this can be NULL. The 2.4 compatibility code
for groups_free() checked for a NULL pointer, but 2.6 did not.
b=7273
adilger [Tue, 27 Sep 2005 23:42:47 +0000 (23:42 +0000)]
Branch b1_4
Lustre fixes for compiling against a 2.6.12 kernel from Bull.
b=6864
adilger [Tue, 27 Sep 2005 22:16:07 +0000 (22:16 +0000)]
Branch b1_4
Lustre fixes for compiling against a 2.6.12 kernel from Bull.
b=6864
adilger [Tue, 27 Sep 2005 20:14:54 +0000 (20:14 +0000)]
Branch b1_4
Fix stripe test program to properly handle filesystems with default stripe
count = -1 (which should result in a full-OST striping).
b=9359
adilger [Tue, 27 Sep 2005 19:41:39 +0000 (19:41 +0000)]
Branch b1_4
Add liblustre_wait_event() calls before entering all liblustre API functions
to ensure that pending ASTs from LDLM_FL_CANCEL_ON_BLOCK locks are handled
before we do any local lock matching. Also add liblustre_wait_event() calls
just before exiting Lustre code to handle any remaining items before returning
to the uninterruptible client code.
b=9352, b=7313
r=green
adilger [Tue, 27 Sep 2005 00:11:34 +0000 (00:11 +0000)]
Branch b1_4
Don't try to walk directory default EA with "lfs find --obd ...".
Tested by HP.
b=9382
adilger [Mon, 26 Sep 2005 23:09:42 +0000 (23:09 +0000)]
Branch b1_4
Put LDLM_FL definitions in numerical order to avoid potential duplication
of values.
b=7313
eeb [Mon, 26 Sep 2005 08:30:15 +0000 (08:30 +0000)]
* GM zeroconf mount fixes
eeb [Mon, 26 Sep 2005 08:10:06 +0000 (08:10 +0000)]
* Backed out previous commit; it included the wrong files
eeb [Mon, 26 Sep 2005 08:01:22 +0000 (08:01 +0000)]
* Added vibnal arp patch from 8206
cliffw [Fri, 23 Sep 2005 18:18:44 +0000 (18:18 +0000)]
GM nal for Sandia
adilger [Fri, 23 Sep 2005 17:20:26 +0000 (17:20 +0000)]
Branch b1_4
Update build version for 1.4.5.5 tag
adilger [Fri, 23 Sep 2005 09:34:07 +0000 (09:34 +0000)]
Branch b1_4
Create liblustre test files with O_LARGEFILE so they can grow > 2GB.
Clean up the t23 test file.
b=9339
adilger [Fri, 23 Sep 2005 09:02:49 +0000 (09:02 +0000)]
Branch b1_4
Description: improve by-nid export eviction on the MDS and OST
Details : allow multiple exports with the same NID to be evicted at one
time without re-searching the exports list.
b=7304
r=green, tested at Sandia
adilger [Wed, 21 Sep 2005 17:45:05 +0000 (17:45 +0000)]
Branch b1_4
Update usage and docs for lfs setstripe -d.
adilger [Wed, 21 Sep 2005 07:55:37 +0000 (07:55 +0000)]
Branch b1_4
b=8322
r=nathan
Description: OST or MDS may oops in ping_evictor_main()
Details : ping_evictor_main() drops obd_dev_lock if deleting a stale
export but doesn't restart at beginning of obd_exports_timed
list afterward.
The list_for_each_safe() macro is only safe for the removal of the current
entry and not safe if some other entry (in particular the next one)
is removed. As class_fail_export() will immediately result in the export
being removed from the obd_exports_timed list (via class_unlink_export())
we are OK to restart processing at the start of the list each time.
The extra pet_lock around pet_exp references in code are not strictly
necessary, but rather precautionary and for consistency when accessing
pet_exp.
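
The restart-from-head pattern described in the entry above can be sketched as follows. This is a simplified illustration on a hand-rolled singly linked list, not the kernel list API and not the actual ping_evictor_main() code; struct node, unlink_node() and evict_stale() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on a timed list. */
struct node {
    struct node *next;
    int stale;              /* non-zero if the entry should be evicted */
};

/* Unlink `victim` from the list headed at *head. */
static void unlink_node(struct node **head, struct node *victim)
{
    struct node **pp = head;

    while (*pp != NULL && *pp != victim)
        pp = &(*pp)->next;
    assert(*pp == victim);
    *pp = victim->next;
    victim->next = NULL;
}

/* Evict every stale entry and return how many were removed.
 * After each removal the walk restarts at the head, because in the
 * scenario described above the lock is dropped during eviction and a
 * saved "next" pointer could refer to an entry that is already gone. */
static int evict_stale(struct node **head)
{
    int evicted = 0;
    struct node *n;

restart:
    for (n = *head; n != NULL; n = n->next) {
        if (n->stale) {
            unlink_node(head, n);
            evicted++;
            goto restart;
        }
    }
    return evicted;
}
```

Restarting costs a rescan per eviction, but each pass removes one entry, so the loop always terminates.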
adilger [Wed, 21 Sep 2005 07:16:08 +0000 (07:16 +0000)]
Branch b1_4
Description: Creating more than 1000 files for a single job may cause a load
imbalance on the OSTs if there are also a large number of OSTs.
Details : qos_prep_create() uses an OST index reseed value that is an
even multiple of the number of available OSTs so that if the
reseed happens in the middle of the object allocation it will
still utilize the OSTs as uniformly as possible.
b=8330
r=behlendorf (tested on BG/L)
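
A rough sketch of the arithmetic behind the entry above, assuming "even multiple" means an exact multiple of the OST count. The function reseed_interval() and its parameters are hypothetical and do not reproduce qos_prep_create().

```c
#include <assert.h>

/* Hypothetical helper: pick a reseed interval that is an exact multiple
 * of the OST count, so a reseed never lands partway through a pass over
 * the OSTs. `base` is the nominal interval (for example 1000 creates). */
static unsigned reseed_interval(unsigned base, unsigned ost_count)
{
    assert(ost_count > 0);
    if (base < ost_count)
        return ost_count;           /* at least one full pass */
    return base - base % ost_count; /* round down to a multiple */
}
```

With a nominal interval of 1000 and 160 OSTs, this yields 960, which is six full passes over the OSTs.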
adilger [Wed, 21 Sep 2005 07:11:53 +0000 (07:11 +0000)]
Branch b1_4
Put 3min timelimit on random-read sanity test for slow systems (ala UML).
b=6252
adilger [Wed, 21 Sep 2005 06:22:46 +0000 (06:22 +0000)]
Branch b1_4
Improve error message if fsfilt_ext3_write_record fails to start a transaction.
b=8317 (debugging of)
r=phil
adilger [Tue, 20 Sep 2005 23:55:30 +0000 (23:55 +0000)]
Branch b1_4
Add definition for pgoff_t, which isn't defined for 2.4 kernels.
b=6252
nikita [Tue, 20 Sep 2005 13:24:54 +0000 (13:24 +0000)]
Land changes to the read-ahead algorithm improving its behavior for random
reads:
- always try to read ahead at least the file region that will be read by the
read(2) call.
- try to detect random reads, and avoid excessive read-ahead in that case.
b=6252
r=adilger
nikita [Mon, 19 Sep 2005 14:09:16 +0000 (14:09 +0000)]
A script to run kernel builds as a benchmark.
adilger [Thu, 15 Sep 2005 19:08:53 +0000 (19:08 +0000)]
Branch b1_4
Fix liblustre sanity unaligned write test. Was previously not really testing
the 1TB write offset because of 32-bit lseek offset truncation.
b=7279
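
The truncation mentioned in the entry above can be illustrated with a small sketch. It only models what happens when a 64-bit offset is squeezed through a 32-bit value; truncate_to_32() is a hypothetical helper, not liblustre or libsysio code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: model a 64-bit file offset passing through a
 * 32-bit variable. Only the low 32 bits survive. */
static int64_t truncate_to_32(int64_t offset)
{
    assert(offset >= 0);
    return (int64_t)(uint32_t)offset;
}
```

A 1TB offset is 2^40, whose low 32 bits are all zero, so a test that seeks there through a 32-bit offset ends up writing at the start of the file instead.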
nikita [Thu, 15 Sep 2005 08:19:02 +0000 (08:19 +0000)]
use local variable
adilger [Wed, 14 Sep 2005 21:45:49 +0000 (21:45 +0000)]
Branch b1_4
Add support for F_{GET,SET}LK{,W}64 to fcntl because the t23() use of
_FILE_OFFSET_BITS=64 caused these macros to be changed in the header.
b=7279
adilger [Wed, 14 Sep 2005 21:18:37 +0000 (21:18 +0000)]
Branch b1_4
Use 64-bit variable for libsysio lseek64() internal return value.
Update liblustre sanity.c to use 64-bit IO functions where needed.
b=7279
adilger [Wed, 14 Sep 2005 20:46:52 +0000 (20:46 +0000)]
Branch b1_4
Move ChangeLog comment to end of 1.4.6 release notes.
Fix CFS_PAGE_MASK for extent start in ost_get_extent_lock().
Add LDLM_FL_CBPENDING for all granted liblustre extent locks.
Add OBD_CONNECT_SRVLOCK for OST only.
Add misc comments from final patch.
b=7311
r=nikita (original patch)
nikita [Wed, 14 Sep 2005 10:42:52 +0000 (10:42 +0000)]
version of 7311 fix, tested by Cray, with a few minor modifications from a
newer, not-yet-tested version:
- comments;
- uninitialized .l_extent in lustre_build_lock_params();
- extent lock boundaries are better to be page aligned in ost_brw_lock_get().
b=7311
r=adilger
adilger [Wed, 14 Sep 2005 08:56:29 +0000 (08:56 +0000)]
Branch b1_4
Update liblustre 2GB lseek test to also do SEEK_SET 0 afterward (per bug).
I'm unable to reproduce a problem with this test on 2.6.9.
b=7279
green [Tue, 13 Sep 2005 16:53:20 +0000 (16:53 +0000)]
Branch: b1_4
b=7313
r=adilger
Allow locks flagged in a certain way (used by liblustre) to be cancelled without
waiting for reply from client.
This is a prototype version of the code. We land it so that it will get into
the next code drop to Cray.
adilger [Tue, 13 Sep 2005 16:13:35 +0000 (16:13 +0000)]
Branch b1_4
Remove lower limit on ldlm_timeout value for liblustre. Tested at Sandia.
b=7201
phil [Tue, 13 Sep 2005 06:23:13 +0000 (06:23 +0000)]
b=5781
r=adilger (original patch)
Severity : minor
Frequency : rare (only HPUX clients mounting unsupported re-exported NFS vol)
Bugzilla : 5781
Description: an HPUX NFS client would get -EACCES when ftruncate()ing a newly
created file with mode 000
Details : the Linux NFS server relies on an MDS_OPEN_OWNEROVERRIDE hack to
allow an ftruncate() as a non-root user to a file with mode 000.
Lustre now respects this flag to disable mode checks when
truncating a file owned by the user
adilger [Mon, 12 Sep 2005 22:47:07 +0000 (22:47 +0000)]
Branch b1_4
Compatibility for 2.6.12 kernel.
b=6864
adilger [Mon, 12 Sep 2005 22:20:51 +0000 (22:20 +0000)]
Branch b1_4
Don't enable full debug for liblustre testing.
adilger [Mon, 12 Sep 2005 22:13:45 +0000 (22:13 +0000)]
Branch b1_4
Always hold obd_dev_lock when manipulating obd_exports_timed or
exp_obd_chain_timed lists. Found in code inspection, but not a cause of
bug 8322 (crash in ping evictor) as liblustre is not involved.
b=8322
nikita [Mon, 12 Sep 2005 15:48:08 +0000 (15:48 +0000)]
add explicit atoll() declaration to keep gcc happy.
nikita [Sat, 10 Sep 2005 13:18:11 +0000 (13:18 +0000)]
- add command line option to random-reads to make multiple consecutive calls
to read(2);
- add random-reads to the Makefile;
- add random-reads-based test to sanity.sh (test_101) to test how read-ahead
algorithm handles seekful work-loads.
adilger [Sat, 10 Sep 2005 09:09:48 +0000 (09:09 +0000)]
Branch: b1_4
Don't allocate zero'd memory so we write non-zero'd data to disk in test 23.
b=7279
adilger [Sat, 10 Sep 2005 08:09:58 +0000 (08:09 +0000)]
Branch: b1_4
Add lseek() tests (t23) to liblustre sanity.c for bug 7279.
Test can be run without arguments if LIBLUSTRE_MOUNT_TARGET is set.
Allow running only a single test from sanity via "-o {number}" like sanity.sh.
b=7279
adilger [Sat, 10 Sep 2005 06:31:36 +0000 (06:31 +0000)]
Branch b1_4
Added debugging for bad LOV EA detection.
Quiet spurious error for generation mismatch.
r=phil (original patch)
adilger [Fri, 9 Sep 2005 20:21:07 +0000 (20:21 +0000)]
Branch b1_4
Description: doing an ls when liblustre clients are running is slow
Details : sending a glimpse AST to a liblustre client waits for every AST
to time out, as liblustre clients will not respond while they
are processing. Since they cannot cache data anyways we refresh
the OST lock LVB from disk instead.
b=7198
r=phil, green (original patch)
adilger [Fri, 9 Sep 2005 18:32:26 +0000 (18:32 +0000)]
Branch b1_4
Remove obsolete test script (this is covered by replay-single.sh anyways).
adilger [Fri, 9 Sep 2005 18:18:48 +0000 (18:18 +0000)]
Branch b1_4
Description: specifying an (invalid) directory default stripe_size of -1
would reset the directory default striping
Details : stripe_size -1 was used internally to signal directory stripe
removal, now use "all default" to signal dir stripe removal
as a directory striping of "all default" is not useful
b=7328
r=green
adilger [Fri, 9 Sep 2005 16:15:48 +0000 (16:15 +0000)]
Branch b1_4
Description: Tuning the MDC DLM LRU size to zero triggers client LASSERT
Details : llu_lookup_finish_locks() tries to set lock data on a lock
after it has been released, only do this for referenced locks.
Tested by bogl.
b=7201 (b=7350)
r=green
adilger [Thu, 8 Sep 2005 18:16:18 +0000 (18:16 +0000)]
Branch b1_4
Description: liblustre could not open files whose last component is a symlink
Details : sysio_path_walk() would incorrectly pass the open intent to
intermediate path components.
b=6363
r=oleg, lee, devesh
adilger [Thu, 8 Sep 2005 07:49:38 +0000 (07:49 +0000)]
Branch b1_4
Description: Fix for potential infinite loop processing records in an llog.
Details : If an llog record is corrupted/zeroed, it is possible to loop
forever in llog_process(). Validate the llog record length
and skip the remainder of the block on an invalid value.
b=7359
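
The length check described in the entry above can be sketched as below. The record layout here (a bare 32-bit length at the start of each record) is an assumption made for illustration, and count_records() is a hypothetical function, not llog_process().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: each record starts with a 32-bit total length. */
#define REC_HDR_SIZE sizeof(uint32_t)

/* Count the valid records in one block. A zeroed or corrupt length would
 * otherwise leave `off` unchanged and loop forever, so an invalid length
 * ends processing of the block and the remainder is skipped. */
static size_t count_records(const unsigned char *block, size_t block_size)
{
    size_t off = 0, count = 0;

    assert(block != NULL);
    while (off + REC_HDR_SIZE <= block_size) {
        uint32_t len;

        memcpy(&len, block + off, sizeof(len));
        if (len < REC_HDR_SIZE || len > block_size - off)
            break;          /* invalid length: skip rest of block */
        count++;
        off += len;
    }
    return count;
}
```

The two conditions cover both failure modes: a length too small to advance past the header, and a length that would run past the end of the block.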
adilger [Thu, 8 Sep 2005 07:27:59 +0000 (07:27 +0000)]
Branch b1_4
Quiet some overly noisy debug messages.
phil [Wed, 7 Sep 2005 05:45:05 +0000 (05:45 +0000)]
b=8320
r=phil (HP's patch)
Severity : minor
Frequency : rare
Bugzilla : 8320
Description: lconf incorrectly determined whether two IP networks could talk
Details : In some more complicated routing and multiple-network
configurations, lconf will avoid trying to make a network
connection to a disjoint part of the IP space. It was doing the
math incorrectly for one set of cases.
adilger [Tue, 6 Sep 2005 19:52:55 +0000 (19:52 +0000)]
Branch b1_4
Add more debugging to LASSERT.
b=5359
r=phil (original patch)
adilger [Fri, 2 Sep 2005 21:13:05 +0000 (21:13 +0000)]
Branch b1_4
Remove final vestiges of "groups_upcall". Verified that lmc with:
MDSOPT="--group_upcall=$PWD/../utils/l_getgroups" sh llmount.sh
will create a .xml file with group_upcall stanza and lconf configures
this properly. Previously only tested with "lconf --group_upcall=...".
b=9259
adilger [Fri, 2 Sep 2005 16:57:34 +0000 (16:57 +0000)]
Branch b1_4
Add documentation for the supplementary group upcall in lmc docs.
Fix minor inconsistency between lmc and lconf usage.
b=9259
phil [Fri, 2 Sep 2005 15:56:08 +0000 (15:56 +0000)]
b=7278
reference the bug number in the comment that was added
adilger [Thu, 1 Sep 2005 22:49:25 +0000 (22:49 +0000)]
Branch b1_4
Remove ialloc patch from fc3 kernel series, it should be (and is) in ldiskfs.
r=nathan
adilger [Thu, 1 Sep 2005 18:09:27 +0000 (18:09 +0000)]
Branch b1_4
Description: 2.6 OST async journal commit and locking fix to improve performance
Details : The filter_direct_io()+filter_commitrw_write() journal commits for
2.6 kernels are now async as they already were in 2.4 kernels so
that they can commit concurrently with the network bulk transfer.
For block-allocated files the filter allocation semaphore is held
to avoid filesystem fragmentation during allocation. BKL lock
removed for 2.6 xattr operations where it is no longer needed.
b=7116
r=alex, tested at HP
nathan [Wed, 31 Aug 2005 23:24:27 +0000 (23:24 +0000)]
Branch b1_4
b=none
r=adilger
Add/fix error messages for failing to mount
nkj [Wed, 31 Aug 2005 21:25:46 +0000 (21:25 +0000)]
committed patch submitted to bug 5649, which checks the return code from 'losetup'.
adilger [Wed, 31 Aug 2005 08:59:46 +0000 (08:59 +0000)]
Land b_release_1_4_5 onto b1_4 (20050830_1747)
jacob [Tue, 30 Aug 2005 17:48:30 +0000 (17:48 +0000)]
handle running from numbered RC scripts, and exit if no configuration is present
adilger [Mon, 29 Aug 2005 19:34:56 +0000 (19:34 +0000)]
Branch b1_4
Disable test 27 (fail LOV while using OSCs) as it is constantly failing
since we enabled failover OSTs by default.
b=7288
adilger [Fri, 26 Aug 2005 22:40:31 +0000 (22:40 +0000)]
Branch b1_4
Description: Running on many-way SMP OSTs can trigger oops in llcd_send()
Details : A race between allocating a new llcd and re-getting the llcd_lock
in llcd_grab() allowed another thread to get the newly-allocated
llcd. Re-check that the list has an llcd in it before proceeding.
Make the llcd size small enough that it fits into a single page
when we are sending/receiving it.
b=7407
nikita [Wed, 24 Aug 2005 19:27:08 +0000 (19:27 +0000)]
add new testing proglet random-reads.c to benchmark bug 6252 fix.
random-reads.c randomly reads chunks of given size from the given file. See
"random-reads -h" for (ridiculously incomplete) help.
adilger [Wed, 24 Aug 2005 17:45:43 +0000 (17:45 +0000)]
Branch b1_4
Add dump_on_timeout support for client eviction.
adilger [Wed, 24 Aug 2005 05:24:14 +0000 (05:24 +0000)]
Branch b1_4
Fix "service lustre status" on OST_only hosts.
b=7396
adilger [Mon, 22 Aug 2005 10:46:22 +0000 (10:46 +0000)]
Branch b1_4
Remove request for log files, we have some now.
b=5195
adilger [Mon, 22 Aug 2005 10:31:35 +0000 (10:31 +0000)]
Branch b1_4
Fix patch names.