Whamcloud - gitweb
amrutjoshi [Mon, 24 Mar 2003 15:52:19 +0000 (15:52 +0000)]
Fixed the patch for intent_release. After this fix lock refs are in order now.
Most of the metadata ops are working.
pschwan [Sat, 22 Mar 2003 20:18:29 +0000 (20:18 +0000)]
unlinkmany a-la createmany and statmany
rread [Fri, 21 Mar 2003 22:57:24 +0000 (22:57 +0000)]
- don't need to unregister the recovd, since it's gone
- class_destroy_import() is doing an import_put, so
I removed an extra put from class_obd_cleanup().
amrutjoshi [Fri, 21 Mar 2003 20:41:58 +0000 (20:41 +0000)]
Refreshed patch with intent_release calls
amrutjoshi [Fri, 21 Mar 2003 13:28:52 +0000 (13:28 +0000)]
Fixing IT_GETATTR intents and many small changes.
braam [Fri, 21 Mar 2003 03:59:05 +0000 (03:59 +0000)]
- MDS/client open code changes, to open directories on the MDS.
- no more slab validation on the MDS
braam [Fri, 21 Mar 2003 03:56:20 +0000 (03:56 +0000)]
These are the kernel patches for Peter's open directory fixes. Old
code can probably work with this (we'll allow it in through a small
change in obd_class.c)
adilger [Thu, 20 Mar 2003 11:02:16 +0000 (11:02 +0000)]
I believe that this will fix the remaining threaded unlink issues, although
I wasn't able to get a chance to test it...
Basically, uncomment extN-delete_thread.diff in extN/Makefile.am and give
it a whirl under dbench and/or runtests or whatever, and if it passes we
are golden.
rread [Wed, 19 Mar 2003 20:51:37 +0000 (20:51 +0000)]
- move the xml/ldap handling class from lconf into an external module
- load_ldap.sh - loads a lustre xml config into ldap
- lactive - updates failover targets with new active devices
amrutjoshi [Sat, 15 Mar 2003 16:20:08 +0000 (16:20 +0000)]
Patches for 2.5.63. This won't work until lustre is patched for 2.5.
thantry [Sat, 15 Mar 2003 02:36:06 +0000 (02:36 +0000)]
Bugzilla 895
adilger [Sat, 15 Mar 2003 01:32:50 +0000 (01:32 +0000)]
Delete thread patch. First, tried to "fake out" the VFS by twiddling bits in
the inode to keep it around after it should have been destroyed, but no dice.
Then, I tried to allocate a "mock inode" and copy over the existing inode to
that and use it only for the unlink code. Sadly, copying list_head,
semaphore, etc does not work, so you have to end up re-initializing the whole
thing anyways, and it would just break on 2.5 anyways.
Finally, I did the "right" thing - read the same inode into a new struct
inode with iget(), and then flag that inode for "real" destruction and
have the delete thread just do an iput. Very simple, very easy.[*]
I also split the orphan list handling out of the superblock lock into
its own lock, so that we don't get stuck behind the delete thread (which
holds it for long periods doing truncates) when we are trying to add new
inodes to the truncate list.
This code passes basic acceptance testing under UML, but I'm not checking
in the Makefile.am changes that activate it until I give it a shot with
dbench 20 or "rm -r directory_full_of_large_files" or so on DEV. Other
people testing it is of course welcome (just add extN-delete_thread.diff
and ext3-orphan_lock.diff to the end of EXTNP).
[*] It reminds me of a story I heard once, about an engineer who had
retired, but was on retainer for his old company in case they needed
him for consulting. Sure enough, the company's complex oil refinery
was not working properly, and after the company engineers couldn't
figure out what was wrong they called the retiree for assistance.
The retiree walked around the refinery, asking questions, looking at
valves and gauges, etc., until finally he asked for a hammer, gave a
pipe a swift blow, and told them to fire up the plant again. Sure
enough, all was working properly again, and the company was happy.
Until they got the invoice - $25,000. In an outrage, they called the
retiree up and asked how he could charge $25,000 for just hitting a
pipe with a hammer. In reply, the engineer said "Hitting the pipe
with the hammer was only $10, the other $24,990 was for knowing where
to hit it."
adilger [Fri, 14 Mar 2003 21:12:07 +0000 (21:12 +0000)]
Backport of bugfix from 2.5/2.4.21-pre5 which calculates the correct value
for the number of blocks to reserve for a truncate. We were asking for 8x
as many blocks as we needed, although for large files this was capped at
EXT3_MAX_TRANS_BLOCKS anyways, so not much harm done.
Conditionally applied so when it appears in our upstream kernels we will
not die (when I commit corresponding changes to Makefile.am).
shaver [Wed, 12 Mar 2003 12:51:15 +0000 (12:51 +0000)]
- b=959: add connection-switching to import recovery. (Not yet tested: one of
my MDS nodes isn't coming up right now.)
adilger [Wed, 12 Mar 2003 00:58:12 +0000 (00:58 +0000)]
Script to grab asm output and look for heavy stack (ab)users.
Could be improved a bit by having it take a list of files as args and
running objdump and then preceding each line with the filename, so we can
run it on modules more easily than having a shell script call a perl script...
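The improvement suggested above (take a list of files, run objdump on each, and prefix every hit with the filename) could look roughly like the sketch below. It is not the checked-in script: the function names, the 256-byte default threshold, and the assumption that the stack frame is set up with an x86 `sub $0x...,%esp` instruction are all illustrative.

```python
import re
import subprocess

# Matches objdump function headers such as "00000040 <big_fn>:"
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
# Matches x86 stack-frame setup such as "sub    $0x208,%esp" (AT&T syntax)
SUB_SP_RE = re.compile(r"\bsub[lq]?\s+\$0x([0-9a-f]+),\s*%[er]sp\b")


def stack_users(disasm, threshold=256):
    """Return (function, bytes) pairs whose frame is >= threshold, largest first."""
    current = None
    found = []
    for line in disasm.splitlines():
        header = FUNC_RE.match(line)
        if header:
            current = header.group(1)
            continue
        frame = SUB_SP_RE.search(line)
        if frame and current is not None:
            size = int(frame.group(1), 16)
            if size >= threshold:
                found.append((current, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


def scan_files(paths, threshold=256):
    """Disassemble each file with `objdump -d` and prefix each hit with its filename."""
    report = []
    for path in paths:
        disasm = subprocess.run(
            ["objdump", "-d", path], capture_output=True, text=True, check=True
        ).stdout
        for func, size in stack_users(disasm, threshold):
            report.append(f"{path}: {func}: {size} bytes")
    return report
```

Keeping the parser separate from the objdump invocation means it can be checked against canned disassembly text without any object files on hand.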
nfshp [Tue, 11 Mar 2003 14:45:06 +0000 (14:45 +0000)]
add an extra param to mmap; otherwise it always returns an error.
pengzhao [Tue, 11 Mar 2003 07:17:08 +0000 (07:17 +0000)]
Bug 828 is fixed by Peng Zhao, approved by Andreas.
runas.c allows the root to "runas" another user to do things.
shaver [Mon, 10 Mar 2003 07:22:15 +0000 (07:22 +0000)]
Remove liblustre-build-breaking l_wait_event define, and get rid of wait_event
while I'm at it.
shaver [Sun, 9 Mar 2003 21:01:41 +0000 (21:01 +0000)]
- b=977: C-z (and other non-fatal signals) send us into an infinite loop if
we wait for recovery. We now use signal blocking to prevent signal
delivery until we're interested in some or all signals.
- b=988: umount -f hangs when MDS is dead
- b=722: Lustre kernel threads cause load average to skyrocket. (Still get
a little boost from the socknal threads, but it's much better.)
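The signal-blocking approach described for b=977 can be illustrated with a user-space analogue. This is only a sketch of the general technique (block, wait, then decide which signals to accept); it is not the kernel code from this commit, and the function name and the choice of signals are assumptions made for the example.

```python
import signal


def run_with_signals_blocked(work, blocked=(signal.SIGTSTP, signal.SIGUSR1)):
    """Run work() while `blocked` signals are held pending, then restore the mask.

    Returns work()'s result and the subset of `blocked` that arrived meanwhile.
    """
    # Block delivery; signals raised from here on stay pending.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, blocked)
    try:
        result = work()
        # Inspect what arrived while we were not interested in it.
        pending = signal.sigpending() & set(blocked)
    finally:
        # Restoring the old mask delivers anything still pending.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return result, pending
```

Because a blocked signal is deferred rather than discarded, the waiter is never interrupted mid-wait, yet nothing is lost: the caller can look at the pending set and react once the wait is over.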
nfshp [Sun, 9 Mar 2003 07:56:20 +0000 (07:56 +0000)]
oops, fix my last checkin.
nfshp [Sun, 9 Mar 2003 07:53:03 +0000 (07:53 +0000)]
fix IS_ERR macro: return 0/NULL is sign of success.
adilger [Sat, 8 Mar 2003 15:59:30 +0000 (15:59 +0000)]
Completely untested (but compiled) port of noread-creates patch to
2.5.current. I suspect it is OK, since the ext3 code hasn't changed much,
but needs testing.
shaver [Sat, 8 Mar 2003 02:09:49 +0000 (02:09 +0000)]
- Remove the now-unnecessary connection chaining bits from imports and exports.
- In related news, always drop the exports' and imports' connection refs when
they're destroyed.
- class_connect was leaking its "caller" (compare: "handle lifetime") ref to
the new export. We're going to try without the leak for a while, see how
it goes.
- Bulk descs now have import and export pointers, so that they can play nicely
with recovery.
- Clean, informative console diagnostics for recovery cases. (Hi, Terry!)
- Do the ldlm-hook i_m_g/i_m_p based on emptiness of the conn_list, since it's
far too late by the time the __exit routines run.
- Put an ugly-but-serviceable UUID in the server's connection's remote_uuid.
- Remember that we can already have an export in req->rq_export upon entry to
target_handle_connect, if the client and server are the same node, and
therefore share a handle table.
- Get rid of class_signal_connection_failure, which was vestigial at best.
- Welcome back to the world of connection-sharing.
- Call the upcall like:
/path/to/upcall $FAILED_IMPORT_UUID $FAILED_OBD_UUID $CONN_UUID
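For anyone writing an upcall against the calling convention above, a handler might unpack its arguments along these lines. The three positional arguments and their order come from the commit message; the class and function names are hypothetical, and what a real upcall does with the UUIDs is not specified here.

```python
from typing import NamedTuple, Sequence


class UpcallArgs(NamedTuple):
    failed_import_uuid: str
    failed_obd_uuid: str
    conn_uuid: str


def parse_upcall_argv(argv: Sequence[str]) -> UpcallArgs:
    """Unpack argv as: upcall FAILED_IMPORT_UUID FAILED_OBD_UUID CONN_UUID."""
    if len(argv) != 4:
        raise ValueError(
            "usage: upcall FAILED_IMPORT_UUID FAILED_OBD_UUID CONN_UUID"
        )
    return UpcallArgs(*argv[1:])
```

A script would typically call `parse_upcall_argv(sys.argv)` and exit non-zero on `ValueError`.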
shaver [Fri, 7 Mar 2003 20:16:05 +0000 (20:16 +0000)]
- b=958: recovery per export/import, not per connection
- no more recovd thread:
- import failure and upcall-execution is done by the failing thread
- export failure and client eviction is done by the thread that's going
to be waiting for it to complete anyway
- import reconnection runs synchronously in the context of the lctl
invocation, which means that we can now return errors to the upcall
- no more fancy recovery state machine
- no more RPCDEV service
- b=954: zero-impact OBD_PING opcode for MDS and OST
- Grand Unified Theory of Client Recovery
- mdc_recover/osc_recovery unified in ptlrpc_recover_import
rread [Tue, 4 Mar 2003 21:42:30 +0000 (21:42 +0000)]
moved procbridge.h to include/portals
rread [Tue, 4 Mar 2003 10:00:10 +0000 (10:00 +0000)]
fix make rpms, at least for me.
- new configure option:
--enable-efence: turns on -lefence support for liblustre
This option is OFF by default, so rpms will build on a machine
that doesn't have efence-devel.
- add liblustre to DIST_SUBDIRS
rread [Tue, 4 Mar 2003 00:38:11 +0000 (00:38 +0000)]
b=941: zab's fix for ia64 compiles. untested, but it does compile
shorthair [Sun, 2 Mar 2003 04:36:47 +0000 (04:36 +0000)]
one small modification for linking libtest
braam [Sat, 1 Mar 2003 22:45:07 +0000 (22:45 +0000)]
- fixes for builds of liblustre (leave out kernel include path)
- get ptlrpc working in the library
zab [Sat, 1 Mar 2003 00:32:51 +0000 (00:32 +0000)]
- push common ldlm_handle_foo_ setup paths into their caller so it can send
the reply before descending into the callbacks
- avoid doing 0 len ll_brws in writepage by unlocking pages outside i_size
braam [Fri, 28 Feb 2003 18:43:05 +0000 (18:43 +0000)]
- minor build fixes for liblustre
adilger [Thu, 27 Feb 2003 23:44:52 +0000 (23:44 +0000)]
Add patch for handling largefile growth bug.
Remove bogus warning case from Makefile for patch application, we already
do a bunch of conditional patch application stuff so use that.
braam [Wed, 26 Feb 2003 23:35:57 +0000 (23:35 +0000)]
- miniature buildfixes for liblustre.
shaver [Tue, 25 Feb 2003 17:55:17 +0000 (17:55 +0000)]
Fixes to recovery-cleanup.sh. We clean up pretty well, but the interrupted-open
case trips an assertion (bug 912).
shaver [Tue, 25 Feb 2003 16:40:50 +0000 (16:40 +0000)]
Quick test for client cleanup while in recovery.
zab [Tue, 25 Feb 2003 01:08:41 +0000 (01:08 +0000)]
- bring b_devel changes into b_io in preparation for file size fixes
shaver [Mon, 24 Feb 2003 21:48:20 +0000 (21:48 +0000)]
- store the FID for an open file in the request, so that replay can open by
FID, in case the file has been deleted, etc.
- reconstruct replies for close (untested), link (tested), unlink (tested, but
will leak OST objects until open-unlink works).
- quickie test program for link(2)
shorthair [Sat, 22 Feb 2003 15:07:55 +0000 (15:07 +0000)]
some fixes for compilation with linux2.5.59
adilger [Sat, 22 Feb 2003 08:53:08 +0000 (08:53 +0000)]
Add return value to init_timer() to quiet compiler warning.
braam [Sat, 22 Feb 2003 03:15:57 +0000 (03:15 +0000)]
- to keep things tidy on the screen
rread [Fri, 21 Feb 2003 18:24:42 +0000 (18:24 +0000)]
fill in some sample code in libtest.c to demonstrate how to use
the ioctl dumps created by lctl.
adilger [Fri, 21 Feb 2003 11:44:56 +0000 (11:44 +0000)]
Add a new test which compiles portals and lustre in a lustre mount.
braam [Fri, 21 Feb 2003 11:05:45 +0000 (11:05 +0000)]
- another day of good progress with liblustre. We now have all the
modules initialized, ready to start playing with lctl dumpfiles.
rread [Thu, 20 Feb 2003 20:42:34 +0000 (20:42 +0000)]
Landing b_malt onto b_devel
- move uuid and handles to lustre, with config changes
- b=204,667: fix router config so network interfaces can be
created in any order.
- Make sure all the ioctl calls in obd.c are packed.
rread [Thu, 20 Feb 2003 09:42:19 +0000 (09:42 +0000)]
- fix the uuid ioctls to work on the lustre side
- add IOC_PACK() to all the ioctls in lctl
- fix a format string in super.c
- rename IOCINIT to IOC_INIT, to match other macros
- looks ready to land on b_devel
rread [Thu, 20 Feb 2003 07:25:24 +0000 (07:25 +0000)]
update b_malt
- includes multinet diffs, which are needed on malt
rread [Thu, 20 Feb 2003 06:52:56 +0000 (06:52 +0000)]
- completing the move of UUID registration from portals to lustre.
Need to merge with multinet to finish this.
shaver [Wed, 19 Feb 2003 22:01:25 +0000 (22:01 +0000)]
Untested implementation of reconstruct_getattr_name.
Test program for single invocation of GETATTR_NAME.
thantry [Wed, 19 Feb 2003 19:58:03 +0000 (19:58 +0000)]
Deleting the dummy file.
thantry [Wed, 19 Feb 2003 19:53:21 +0000 (19:53 +0000)]
This is a test file. We have been moved to behind a firewall, and we want
to ensure that we can perform regular CVS operations. Please excuse.
braam [Wed, 19 Feb 2003 04:30:18 +0000 (04:30 +0000)]
new files with APIs we'd initially put in portals
braam [Wed, 19 Feb 2003 04:29:19 +0000 (04:29 +0000)]
- add new files moved over from portals
pschwan [Tue, 18 Feb 2003 05:45:11 +0000 (05:45 +0000)]
a-m-double for 2 mounts
braam [Mon, 17 Feb 2003 10:23:25 +0000 (10:23 +0000)]
- the bulk of the build fixes for liblustre. Note
that all user level programs now need to include
<liblustre.h> (as is done in the utils and test
directories).
zab [Sat, 15 Feb 2003 21:45:46 +0000 (21:45 +0000)]
bring b_io up to the latest write caching code. fsx and rundbench 1 pass in a
96M all-in-one UML.
- prepare_write is throttled by finding dirty pages on the super block's
dirty inodes and writing them to the network
- commit_write marks the page dirty and updates i_size
- writepage blocks writing the page and other dirty pages to the network
- sort the pages within an obd_brw batch so that block allocation isn't hosed
on the OST
- don't change s_dirty's position on the list during writeback, that seems to
be the job of writepage's callers
- don't try and mess with page's list membership after obd_brw completes,
filemap_fdata{sync,wait} take care of that
- put a hack in obdo_to_inode that tricks ll_file_size into preferring the
local i_size when there are cached pages on the inode
- add license blurb and editor instructions
- get rid of the management of vm lru pages and liod thread, prepare_write
throttling and kupdate serve the same task (hopefully)
- remove unused ll_flush_inode_pages
- throw in an OSC-side "count > 0" assert to match a similar OST assert that
I couldn't reproduce
- writeback will try to batch PTL_MD_MAX_IOV pages on the wire
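The two batching points in the list above (sort pages within a batch, and cap each batch at PTL_MD_MAX_IOV pages) can be sketched as follows. This is a simplified illustration, not the b_io code: pages are reduced to integer indices, the function name is made up, and the cap is passed in as `max_iov` because the actual value of PTL_MD_MAX_IOV is not given here.

```python
def build_brw_batches(page_indices, max_iov):
    """Group dirty page indices into ascending batches of at most max_iov pages."""
    if max_iov < 1:
        raise ValueError("max_iov must be at least 1")
    # Sort first so each batch covers pages in ascending file-offset order.
    ordered = sorted(page_indices)
    return [ordered[i:i + max_iov] for i in range(0, len(ordered), max_iov)]
```

For example, `build_brw_batches([7, 2, 9, 3, 8], 2)` yields `[[2, 3], [7, 8], [9]]`, so the receiver sees offsets in ascending order within and across batches.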
zab [Sat, 15 Feb 2003 20:21:59 +0000 (20:21 +0000)]
- rebase b_io against HEAD in preparation for the latest write cache code
adilger [Tue, 11 Feb 2003 23:43:27 +0000 (23:43 +0000)]
Add patch (already in 2.4.21-pre and 2.5.current) to quiet extN ino_t warnings.
braam [Sun, 9 Feb 2003 09:11:00 +0000 (09:11 +0000)]
- add a fix to execute binaries to the kernel code (sorry folks, new kernel!)
ONLY did rh-2.4.18-18
- add test 30 to sanity.sh to test the same
- check in pc and series file for 2.5
pschwan [Fri, 7 Feb 2003 16:15:15 +0000 (16:15 +0000)]
Fix sys_link in vfs_intent_hp.patch to match vfs_intent-2.4.18.patch
eeb [Fri, 7 Feb 2003 15:59:15 +0000 (15:59 +0000)]
* Added a barrier test program in utils 'obdbarrier'
* Changed lock cleanup code in echo_client to use obd_cancel_unused(),
since the LOV's lock callbacks happen for each stripe, passing the
stripe locks rather than the single lock whose handle was returned by
obd_enqueue()
* moved obdio/obdbarrier common procedures into obdiolib.c
adilger [Fri, 7 Feb 2003 09:48:29 +0000 (09:48 +0000)]
Remove CONFIG_DEV_RDONLY so Mike doesn't lose his sanity.
adilger [Thu, 6 Feb 2003 23:25:55 +0000 (23:25 +0000)]
Update for l10 kernel version.
adilger [Wed, 5 Feb 2003 22:37:18 +0000 (22:37 +0000)]
Fixups to sanity.sh and createtest.c so it passes when run as a non-root
user (to keep Terry happy ;-).
We're not doing extensive permission testing here - leave that to POSIX.
adilger [Wed, 5 Feb 2003 21:55:30 +0000 (21:55 +0000)]
Fix for bug 695, and a regression test to go with it (get rid of hard-coded
pathnames in sanity.sh while I'm at it ;-).
These are only client-side checks (the MDS does its own checking for
both mds_open() and mds_reint_create() so that we don't create a bogus
file type).
adilger [Wed, 5 Feb 2003 21:45:43 +0000 (21:45 +0000)]
Regression test for bug 695.
eeb [Wed, 5 Feb 2003 17:45:18 +0000 (17:45 +0000)]
* Added lock enqueue/cancel ioctl interface to echo_client, with proper
teardown on (possibly unexpected) client exit.
* added '-l' flag to utils/obdio, so that it locks the extent it writes,
then reads back.
* took 'filter' prefix off /proc stats interface in filter and copied it
into the echo OBD; obdstats takes obd type name parameter as well as
optional repeat interval.
adilger [Mon, 3 Feb 2003 19:21:17 +0000 (19:21 +0000)]
Fix typo for patch series.
adilger [Mon, 3 Feb 2003 18:31:10 +0000 (18:31 +0000)]
Update kernel patches for hp kernel.
shaver [Mon, 3 Feb 2003 18:25:50 +0000 (18:25 +0000)]
Fix recovery underpinnings:
- Store the client's UUID (from last_rcvd) in exp_client_uuid, so
target_handle_connect can match it up.
- Don't obd_connect if target_handle_reconnect matched us up with an existing
export.
- Don't invalidate everything so eagerly: only if we can't recover and it
wasn't a partition-reconnect.
- Clear the in-recovery flag before we send clients their delayed replies.
- Add "force" argument to lctl cleanup, and remove it from detach.
- Use new lmc API for recovery-small.sh (more test cleanup coming).
braam [Sat, 1 Feb 2003 21:58:03 +0000 (21:58 +0000)]
- temporary debugging stuff - this will slow you down.
adilger [Sat, 1 Feb 2003 07:42:42 +0000 (07:42 +0000)]
Add invalidate-show patch to kernel.
Move docs to a more noticeable place.
pschwan [Fri, 31 Jan 2003 21:37:21 +0000 (21:37 +0000)]
Merge b_intent into b_md:
- New kernel patch (version 9)
- DLM hooks to revalidate locked data, once the lock is granted (604)
- Further MDS reorganization, particularly of the open and o_creat paths
pschwan [Fri, 31 Jan 2003 21:15:10 +0000 (21:15 +0000)]
- merge b_md into b_intent
- rename invalidate-show.diff, at andreas's request
shaver [Fri, 31 Jan 2003 19:06:33 +0000 (19:06 +0000)]
- Clean up, but don't error out if some parts fail.
- ONLY for easier controlled-reproduction.
- Error codes count.
braam [Thu, 30 Jan 2003 22:27:04 +0000 (22:27 +0000)]
- error handling fixes from Andreas
pschwan [Thu, 30 Jan 2003 21:09:09 +0000 (21:09 +0000)]
an open(O_LOV_CREATE_DELAY) test
pschwan [Thu, 30 Jan 2003 20:27:07 +0000 (20:27 +0000)]
- return 0, not ENOENT, from a failed lookup in mds_open
braam [Thu, 30 Jan 2003 19:40:16 +0000 (19:40 +0000)]
- fill inodes if files exist
shaver [Thu, 30 Jan 2003 19:24:58 +0000 (19:24 +0000)]
New recovery regression -- hard-coded node names, probably other badness,
but it's finding bugs now, so I'm checking it in.
braam [Thu, 30 Jan 2003 07:18:38 +0000 (07:18 +0000)]
- more error handling
braam [Thu, 30 Jan 2003 05:17:01 +0000 (05:17 +0000)]
- add iod rmap patches to b_intent
braam [Thu, 30 Jan 2003 05:07:54 +0000 (05:07 +0000)]
- fix typo
adilger [Wed, 29 Jan 2003 22:13:06 +0000 (22:13 +0000)]
Add patch to be more verbose about which inodes are busy at unmount time
(for "VFS: Busy inodes" message). Not added to any patchsets yet (I leave
it up to Peter to decide if he wants it applied to our kernels by default).
eeb [Wed, 29 Jan 2003 21:17:33 +0000 (21:17 +0000)]
* added /proc stats to obdfilter; should be general purpose and in lprocfs
* added utils/obdstat for monitoring stats
* added utils/obdio.c for exercising servers via echo_client.
* echo_client does auto close when user proc exits
braam [Wed, 29 Jan 2003 19:51:34 +0000 (19:51 +0000)]
- fix for O_EXCL case in mds_open
- updates to cray plan
- remove unnecessary error handling from
mdc_completion_callback
pschwan [Wed, 29 Jan 2003 15:47:52 +0000 (15:47 +0000)]
- don't call mds_pack_md for non-regular files in mds_validate_dentry (doh)
- add an mcreate/open test to sanity
b_intent gets back to sanity #27 again
braam [Wed, 29 Jan 2003 06:39:51 +0000 (06:39 +0000)]
- fill in ea in mds open when file exists.
adilger [Wed, 29 Jan 2003 01:08:24 +0000 (01:08 +0000)]
Add the sync-on-unmount fix, and another fix which avoids accessing freed
inodes upon ENOSPC or other errors (from AKPM).
The Makefile should be smart enough to detect if they need to be applied.
adilger [Tue, 28 Jan 2003 06:35:12 +0000 (06:35 +0000)]
Updates to vfs_intent_hp.patch for PNNL (bug 727).
This may or may not work, as I haven't had a chance to test, but it is 95%.
braam [Tue, 28 Jan 2003 05:26:51 +0000 (05:26 +0000)]
- don't clobber flags
braam [Tue, 28 Jan 2003 05:02:22 +0000 (05:02 +0000)]
- lib lustre planning document
- mds server document
- return detailed status in body->flags
pschwan [Tue, 28 Jan 2003 04:17:24 +0000 (04:17 +0000)]
don't dereference NULL or freed mfd in mds_open
pschwan [Tue, 28 Jan 2003 04:11:16 +0000 (04:11 +0000)]
- do an mntget() before dentry_open()
pschwan [Mon, 27 Jan 2003 22:50:29 +0000 (22:50 +0000)]
- remove unused mfd_clienthandle
- fix double semaphore up() in mds_open
pschwan [Mon, 27 Jan 2003 21:24:48 +0000 (21:24 +0000)]
- added ldlm_lock_decref_and_cancel, to eliminate _that_ particular race
- don't send an extra buffer to the MDS if the EA length is zero
- andreas's fix to initialize rc to 0 in mds_open
- removed some tabs
- in mds_reint_setattr, don't set the MD if the client didn't send one
- in mds_reint_create, ericm's fix for the POSIX nametoolong bug
- fix mds_open prototype in mds_reint.c
braam [Fri, 24 Jan 2003 19:22:23 +0000 (19:22 +0000)]
- probable fix for mds refcount problem. touch /mnt/lustre/f cleans up now.
braam [Thu, 23 Jan 2003 22:04:25 +0000 (22:04 +0000)]
another bug fix.
braam [Thu, 23 Jan 2003 22:00:48 +0000 (22:00 +0000)]
- some bug fixes
braam [Thu, 23 Jan 2003 18:39:20 +0000 (18:39 +0000)]
- new file
adilger [Wed, 22 Jan 2003 13:47:16 +0000 (13:47 +0000)]
Fix one cause of bug 430 (hardlinks) that showed up with simul, but this was
a recently introduced bug and not the real source of problems. Back to the
drawing board in terms of finding a repeatable testcase. The only thing
learned from this and the previous extN orphan assertion (bug 670) is that
they were both caused by dentry refcount problems, and the files related
to that dentry refcount were unlinked.