Whamcloud - gitweb
adilger [Tue, 16 Jul 2002 17:20:41 +0000 (17:20 +0000)]
Fix cleanup in llite/super.c
rread [Tue, 16 Jul 2002 16:32:01 +0000 (16:32 +0000)]
- improve error handling for command line use
adilger [Tue, 16 Jul 2002 11:52:07 +0000 (11:52 +0000)]
Fix problem with multiple connections to the same local device.
The problem was that we stored the remote export addr+cookie in the
device array. When we disconnected any connection, it would drop the
remote export from the most recent connection.
What we do now is store the remote export addr+cookie in the local
export/import struct and always use that (the old mdc_connh and osc_connh
are gone). We could do some further cleanup of the API at this point,
but I will leave it until there is some discussion on the matter. I was
able to successfully do multiple simultaneous connect and disconnect steps,
as well as run multi-thread test_getattrs with the change. I'm just doing
a full runregression-net test.
I'm still not 100% confident that this is what Peter wants, so I tagged
the pre-commit tree with "before_rconn" to make it easy to generate
a diff (I will also tag the post-commit with "after_rconn").
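The change described above can be sketched in a few lines of C. This is only an illustration, assuming that each local connection gets its own slot for the remote addr+cookie pair; the names `remote_handle`, `device_sketch`, `sketch_connect` and `sketch_disconnect` are hypothetical and are not the actual Lustre structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the remote export addr+cookie pair. */
struct remote_handle {
    uint64_t addr;    /* remote export address */
    uint64_t cookie;  /* remote export cookie */
};

#define MAX_CONNS 4

/* One slot per local connection, instead of a single per-device field
 * that every new connection would overwrite. */
struct device_sketch {
    struct remote_handle conn[MAX_CONNS];
    int in_use[MAX_CONNS];
};

/* Record a new connection; returns the slot index, or -1 when full. */
static int sketch_connect(struct device_sketch *dev,
                          uint64_t addr, uint64_t cookie)
{
    assert(dev != NULL);
    for (int i = 0; i < MAX_CONNS; i++) {
        if (!dev->in_use[i]) {
            dev->conn[i].addr = addr;
            dev->conn[i].cookie = cookie;
            dev->in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Drop exactly one connection; other slots are left untouched.
 * Returns 0 on success, -1 if the slot is invalid or already free. */
static int sketch_disconnect(struct device_sketch *dev, int slot)
{
    assert(dev != NULL);
    if (slot < 0 || slot >= MAX_CONNS || !dev->in_use[slot])
        return -1;
    dev->in_use[slot] = 0;
    return 0;
}
```

With per-connection storage, disconnecting the first connection leaves the second connection's cookie intact, which is the behaviour the old per-device storage could not guarantee.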
rread [Tue, 16 Jul 2002 01:53:37 +0000 (01:53 +0000)]
- remove obdfs/Makefile
- get kernel version from version.h, fix uml kversion bug
rread [Tue, 16 Jul 2002 01:52:16 +0000 (01:52 +0000)]
- configuring an obd seems to work
rread [Tue, 16 Jul 2002 01:49:31 +0000 (01:49 +0000)]
- enable command line flags for lctl
- device <name> now works
pschwan [Mon, 15 Jul 2002 20:55:57 +0000 (20:55 +0000)]
Fixed brw_finish to kunmap the pointers in the bulk descriptor instead of the
callback data. The cb_data had a pointer to stack data which might not be
valid anymore, and instead of allocating a new array to hold it, we can just
walk the bulk pages.
pschwan [Mon, 15 Jul 2002 20:32:19 +0000 (20:32 +0000)]
The ioctl handler for test_brw was corrupting kernel memory and giving an
empty array to osc_brw_read, which cheerfully tried to kmap(0). Fixed.
rread [Mon, 15 Jul 2002 19:23:33 +0000 (19:23 +0000)]
- change id to ref
- change ref attributes to just uuidref
- changing service_ref to more specific references where appropriate.
pschwan [Mon, 15 Jul 2002 18:04:24 +0000 (18:04 +0000)]
- Fix for the 16kb page directory handling (thanks Andreas, it appears to
work perfectly)
- Fix for Eric's page_array compilation error
- Increase event queue sizes to 1024
- Re-enable bits of the runfailure-ost test
braam [Mon, 15 Jul 2002 15:40:45 +0000 (15:40 +0000)]
- add Andreas's suggestion for the LLNL Oops's to the patch
braam [Sun, 14 Jul 2002 17:27:12 +0000 (17:27 +0000)]
- see previous message
braam [Sun, 14 Jul 2002 17:26:49 +0000 (17:26 +0000)]
- lookup2: drop a lock in an unlikely error case; clarify interrupt
handling in lookup2
- osc_request.c: bring connection level to full after connect
- mds_reint.c: in unlink, don't cancel locks that you don't have for
negative dentries (bug fix).
braam [Sun, 14 Jul 2002 00:13:54 +0000 (00:13 +0000)]
This commit contains probably 92% of the striping infrastructure
we need initially. The most pervasive change is the introduction
of "lov_stripe_md" throughout the code.
In addition, several small bugs were nailed in the locking --
more are outstanding.
The setup scripts are not yet capable of running this code.
Kernel patches were updated to include LOOKUP (to let runtests.sh
work).
pschwan [Sat, 13 Jul 2002 23:42:21 +0000 (23:42 +0000)]
Don't load /etc/lustre/lustre.cfg if config files were passed on the command
line.
shaver [Sat, 13 Jul 2002 23:38:45 +0000 (23:38 +0000)]
- Use refcounting to control lifetime of bulk descriptors.
- Add some LDLM timeout #warnings for future reference.
- Make all bulk-desc cleanup happen asynchronously, for cleaner interrupt (and
later timeout) handling.
- Have OSC wait for brw_write reply before beginning bulk operations, for
symmetry with brw_read and general niceness.
- Fix reply-for-freed-request check to use proper XID width.
- Use l_killable_pending in ptlrpc_check_bulk_sent, for consistency. (Also in the
new ptlrpc_check_bulk_received sibling.)
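As a rough illustration of refcount-controlled lifetime, here is a minimal single-threaded sketch. The type `desc_sketch` and the helpers `desc_new`, `desc_get` and `desc_put` are hypothetical names, not the real bulk descriptor API, and the `freed_flag` field exists only so the release can be observed; real kernel code would also need atomic counters or locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor; the real bulk descriptor has other fields. */
struct desc_sketch {
    int refcount;
    int *freed_flag; /* set to 1 on release, for demonstration only */
};

/* Allocate a descriptor holding one reference for the caller. */
static struct desc_sketch *desc_new(int *freed_flag)
{
    struct desc_sketch *d = malloc(sizeof(*d));
    if (d == NULL)
        return NULL;
    d->refcount = 1;
    d->freed_flag = freed_flag;
    return d;
}

/* Take an extra reference, e.g. before handing the descriptor
 * to an asynchronous completion path. */
static void desc_get(struct desc_sketch *d)
{
    assert(d->refcount > 0);
    d->refcount++;
}

/* Drop a reference; the last put frees the descriptor.
 * The caller must not touch d after its own put. */
static void desc_put(struct desc_sketch *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0) {
        if (d->freed_flag != NULL)
            *d->freed_flag = 1;
        free(d);
    }
}
```

The descriptor survives as long as any holder still has a reference, so an interrupted waiter and a later completion callback can each drop their reference independently.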
pschwan [Sat, 13 Jul 2002 19:10:22 +0000 (19:10 +0000)]
osc_connect needs to go with level LUSTRE_CONN_NEW. Fixed.
pschwan [Sat, 13 Jul 2002 19:03:23 +0000 (19:03 +0000)]
- Added match_or_enqueue helper function
- fixed userspace build problem in lustre_lib.h
- added intent-based lookup code
- fixed intent-based setattr
- added mds_fid2locked_dentry and mds_name2locked_dentry helpers
- don't crash in ptlrpc_reply, just warn of API violation
- update create.pl to open instead of mcreate
pschwan [Sat, 13 Jul 2002 18:51:02 +0000 (18:51 +0000)]
Fixed warning introduced yesterday
adilger [Fri, 12 Jul 2002 20:44:28 +0000 (20:44 +0000)]
Don't try to auto-load obd module when we are just in the process of
registering it.
adilger [Fri, 12 Jul 2002 20:43:44 +0000 (20:43 +0000)]
Use unique OST fault-injection code instead of duplicate MDS code.
adilger [Fri, 12 Jul 2002 20:38:06 +0000 (20:38 +0000)]
list_mods() should not return an error if we are not waiting.
pschwan [Fri, 12 Jul 2002 16:48:13 +0000 (16:48 +0000)]
- Added some temporary LDLM_LOCK_PUT/LDLM_LOCK_GET macros to aid in debugging
- Fixed a couple more places where we dereference 'lock' after ldlm_lock_put()
- Fixed a showstopper in ldlm_handle2lock that was causing serious problems.
- The DLM in UML with 3 mountpoints now survives pretty much any load that
create.pl can throw at it
pschwan [Fri, 12 Jul 2002 16:23:03 +0000 (16:23 +0000)]
- Fixed up some OST handler functions to always return a nonzero error when
they don't set request->rq_repmsg (this should fix Eric's RPC crash)
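The convention in this fix can be sketched as follows, assuming a handler either sets the reply and returns 0, or leaves it unset and returns a nonzero error. `request_sketch`, `handler_sketch` and `dispatch_sketch` are hypothetical names for illustration; only the idea of checking the return code before touching the reply comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct request_sketch {
    const char *repmsg; /* reply buffer; NULL until a handler sets it */
};

/* Hypothetical handler: on success it sets repmsg and returns 0;
 * on failure it leaves repmsg NULL and returns a negative errno. */
static int handler_sketch(struct request_sketch *req, int have_buffer)
{
    if (!have_buffer)
        return -ENOMEM; /* no reply was set, so the result must be nonzero */
    req->repmsg = "ok";
    return 0;
}

/* Caller: only looks at repmsg when the handler returned 0.
 * Returns 1 if a normal reply would be sent, 0 for an error reply. */
static int dispatch_sketch(struct request_sketch *req, int have_buffer)
{
    int rc = handler_sketch(req, have_buffer);
    if (rc != 0)
        return 0;
    assert(req->repmsg != NULL);
    return 1;
}
```

If a handler returned 0 without setting the reply, the caller would go on to use a NULL buffer, which is the kind of crash this convention is meant to avoid.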
rread [Fri, 12 Jul 2002 09:21:30 +0000 (09:21 +0000)]
- add lustre initscript
rread [Fri, 12 Jul 2002 08:53:35 +0000 (08:53 +0000)]
- change version of HEAD to 0.5
rread [Fri, 12 Jul 2002 08:53:03 +0000 (08:53 +0000)]
- fix typos
rread [Fri, 12 Jul 2002 07:01:47 +0000 (07:01 +0000)]
- add module aliases to /modules.conf
rread [Fri, 12 Jul 2002 07:01:01 +0000 (07:01 +0000)]
- fix rpm build
rread [Fri, 12 Jul 2002 06:59:55 +0000 (06:59 +0000)]
- fixes from santa fe
rread [Fri, 12 Jul 2002 03:02:27 +0000 (03:02 +0000)]
- run config in production
adilger [Thu, 11 Jul 2002 22:55:00 +0000 (22:55 +0000)]
Remove hard-coded mountpoint from runtests script.
adilger [Thu, 11 Jul 2002 22:53:34 +0000 (22:53 +0000)]
Fix statfs for new ptlrpc interface in head.
adilger [Thu, 11 Jul 2002 22:52:24 +0000 (22:52 +0000)]
Remove sgid #warning. This is already done in the filesystem and the MDS.
It does not really make sense to do it in the MDC.
pschwan [Thu, 11 Jul 2002 22:43:00 +0000 (22:43 +0000)]
I broke llcleanup earlier. This fixes it.
pschwan [Thu, 11 Jul 2002 15:37:29 +0000 (15:37 +0000)]
- The server side of the DLM wasn't always handling the invalid handle case
very gracefully. Since there are races that make invalid handles not hugely
uncommon, that's fixed now.
- Even on the client side, we could get cancelled between ldlm_lock_decref() and
ldlm_lock_cancel(), thus passing an invalid handle into cancel(). This is
handled now.
- Fixed another leaked request.
- Fixed an IT_SETATTR bug, even though we don't currently exercise that code
path.
- Fixed common.sh for the multiple mount cleanup case
- llcleanup.sh was unloading things in the wrong order (MDS is pinned by LDLM
and needs to be unloaded later in the process). Fixed.
shaver [Wed, 10 Jul 2002 20:31:23 +0000 (20:31 +0000)]
mdc_connect needs to use a level of LUSTRE_CONN_NEW.
(I broke mounting on the trunk before, this fixes it.)
shaver [Mon, 8 Jul 2002 19:23:02 +0000 (19:23 +0000)]
Default to an rq_level of LUSTRE_CONN_FULL.
Use l_killable_pending in place of explicit lists of tests.
rread [Sat, 6 Jul 2002 23:00:46 +0000 (23:00 +0000)]
- updates for rpm
adilger [Fri, 5 Jul 2002 23:26:23 +0000 (23:26 +0000)]
Change TCP port number.
adilger [Fri, 5 Jul 2002 23:21:40 +0000 (23:21 +0000)]
Change TCP acceptor port number to be 2432, an unused Coda port. You need
to update all of your <myname>.cfg files (or XML if you use it) to use the
new port number if you want to use tcpdump, and to maintain consistency
across scripts.
uid34591 [Fri, 5 Jul 2002 22:50:42 +0000 (22:50 +0000)]
Move the on-wire structs/definitions into lustre_idl.h, for the CVS head.
uid34591 [Fri, 5 Jul 2002 22:39:03 +0000 (22:39 +0000)]
Fix "tags" target if portals is a symlink
uid34591 [Fri, 5 Jul 2002 21:53:27 +0000 (21:53 +0000)]
Include headers explicitly that were previously included from lustre_net.h.
uid34591 [Fri, 5 Jul 2002 21:23:03 +0000 (21:23 +0000)]
Fix osc Makefile to link ll_pack.c (module osc.o should not depend on ost.o).
Remove old cruft.
uid23919 [Fri, 5 Jul 2002 21:20:11 +0000 (21:20 +0000)]
- Minor connection change to pass uuid's in.
pschwan [Fri, 5 Jul 2002 20:55:57 +0000 (20:55 +0000)]
Removed the deprecated forced-localhost DLM bits; they're confusing the
portals routing code and making me sad.
braam [Fri, 5 Jul 2002 20:53:05 +0000 (20:53 +0000)]
- use UUIDs for mounting
rread [Fri, 5 Jul 2002 20:05:27 +0000 (20:05 +0000)]
- change PACKAGE to lustre
- update spec
braam [Fri, 5 Jul 2002 19:23:33 +0000 (19:23 +0000)]
- fix documentation updates. The master document is now called
lustre.pdf. Still need the CVS Tag or HEAD-version if not
present....
- Minor edits to references etc to get compilation going.
pschwan [Fri, 5 Jul 2002 18:56:59 +0000 (18:56 +0000)]
Items of note:
* Fixes rename
* Allows for symlinks larger than LL_INLINESZ (of course, intent-based symlink
code isn't written yet)
Items of lesser note, although their importance is not diminished by the mere
inclusion in this seemingly unimportant list:
- We were ignoring reint errors in ldlm_intent_policy. No more!
- ll_dir_readpage error "handling" led directly to a panic. Hopefully that's
fixed now.
braam [Fri, 5 Jul 2002 18:48:01 +0000 (18:48 +0000)]
- Make now correctly rebuilds
- The version was given an "author tag" to make it visible
shaver [Fri, 5 Jul 2002 18:03:59 +0000 (18:03 +0000)]
Prevent C-c and C-z from locking us up, and make most of our waits
uninterruptible. We'll move to a more robust system shortly, but
this will make for a usable testing environment in the interim.
adilger [Fri, 5 Jul 2002 10:38:40 +0000 (10:38 +0000)]
Change module data to use "info@clusterfs.com" email per Peter.
adilger [Fri, 5 Jul 2002 10:17:21 +0000 (10:17 +0000)]
Add statfs fixups to head.
adilger [Fri, 5 Jul 2002 08:02:12 +0000 (08:02 +0000)]
Add closing quote so that obdecho compiles.
adilger [Fri, 5 Jul 2002 07:41:44 +0000 (07:41 +0000)]
Add statfs support for proper blocks count to head.
Includes fix for type == 0 (truncate) problem.
Removes ost_get_info call - it is not used and causes confusion.
adilger [Fri, 5 Jul 2002 07:37:46 +0000 (07:37 +0000)]
Pass 64-bit object numbers wherever possible, instead of truncating to 32-bit.
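A small sketch of why the truncation matters: two distinct 64-bit object numbers can become indistinguishable once narrowed to 32 bits. The helper names `objid_truncated` and `objids_collide_in_32bit` are made up for this example and do not appear in the Lustre source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical narrowing helper: what a 32-bit field would keep. */
static uint32_t objid_truncated(uint64_t objid)
{
    return (uint32_t)objid; /* drops the upper 32 bits */
}

/* Returns 1 if two distinct 64-bit ids would collide in a 32-bit field. */
static int objids_collide_in_32bit(uint64_t a, uint64_t b)
{
    return a != b && objid_truncated(a) == objid_truncated(b);
}
```

For example, object numbers 0x100000001 and 1 differ as 64-bit values but both narrow to 1, so any interface that passes a 32-bit id would confuse them.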
shaver [Fri, 5 Jul 2002 01:35:04 +0000 (01:35 +0000)]
tidy runfailure-ost a touch
rread [Fri, 5 Jul 2002 01:17:41 +0000 (01:17 +0000)]
think i got it this time
rread [Fri, 5 Jul 2002 01:15:37 +0000 (01:15 +0000)]
-again
rread [Fri, 5 Jul 2002 01:15:10 +0000 (01:15 +0000)]
- RCS version experiment
braam [Thu, 4 Jul 2002 23:26:32 +0000 (23:26 +0000)]
- fixes an extra intent_release in open_namei
rread [Thu, 4 Jul 2002 22:26:55 +0000 (22:26 +0000)]
- install setup scripts and config
rread [Thu, 4 Jul 2002 21:34:05 +0000 (21:34 +0000)]
- install to /usr/sbin
rread [Thu, 4 Jul 2002 21:23:19 +0000 (21:23 +0000)]
- updated to new dtd
- network now in <node>
- new service_id added to profile for network
- this will probably not work with obdctl anymore, new tools on the way
pschwan [Thu, 4 Jul 2002 19:58:38 +0000 (19:58 +0000)]
- Quiet the FIXMEs from CERRORs to CDEBUGs, because they were reducing LLNL
consoles to a crawl
- more DLM debug infrastructure, mostly preventative.
- check for the forgotten ptlrpc_abort() case in reply_in_callback
- cleanup the intent-test and runfailure-mds
shaver [Thu, 4 Jul 2002 14:45:38 +0000 (14:45 +0000)]
Dump the lock before LBUG()ing in ldlm_lock_destroy, for easier debugging of
mismatched refcount issues.
rread [Thu, 4 Jul 2002 01:54:01 +0000 (01:54 +0000)]
- add copyright, license, and attribution
rread [Thu, 4 Jul 2002 01:37:09 +0000 (01:37 +0000)]
- fix lookup bug
rread [Thu, 4 Jul 2002 01:00:50 +0000 (01:00 +0000)]
initial version
brian [Thu, 4 Jul 2002 00:12:31 +0000 (00:12 +0000)]
- Added lctl.c which is basically ptlctl/obdctl/debugctl all
frankensteined together. It should be possible to configure
all of portals/lustre through it.
- network.c contains the portals bits
device.c contains the device bits
debug.c contains the debug bits
- Some functionality has been consolidated when it was redundant,
there's still more of that to do!
- The thread handling that was in obdctl is broken for a little bit.
- All the original *ctl tools are unaffected.
- Several commands have been slightly renamed due to name collisions,
for instance the obdctl list command is now device_list, so as
not to conflict with debug_list or route_list. It'll be pretty
obvious.
behlendo [Thu, 4 Jul 2002 00:12:30 +0000 (00:12 +0000)]
- Added lctl.c which is basically ptlctl/obdctl/debugctl all
frankensteined together. It should be possible to configure
all of portals/lustre through it.
- network.c contains the portals bits
device.c contains the device bits
debug.c contains the debug bits
- Some functionality has been consolidated when it was redundant,
there's still more of that to do!
- The thread handling that was in obdctl is broken for a little bit.
- All the original *ctl tools are unaffected.
- Several commands have been slightly renamed due to name collisions,
for instance the obdctl list command is now device_list, so as
not to conflict with debug_list or route_list. It'll be pretty
obvious.
braam [Wed, 3 Jul 2002 23:19:32 +0000 (23:19 +0000)]
- first additions for rename.
- NOTE: it appears the intent regression is broken at present (freeing
a ref'd lock)
adilger [Wed, 3 Jul 2002 20:46:02 +0000 (20:46 +0000)]
Fix ordering of module unload in head.
braam [Wed, 3 Jul 2002 18:44:15 +0000 (18:44 +0000)]
clean up old patches
braam [Wed, 3 Jul 2002 18:41:36 +0000 (18:41 +0000)]
patch for Jim's 2.4.18-chaos5
braam [Wed, 3 Jul 2002 18:39:04 +0000 (18:39 +0000)]
Add note for 0.4.2
braam [Wed, 3 Jul 2002 18:29:20 +0000 (18:29 +0000)]
- update the patch with yesterday's fix
braam [Tue, 2 Jul 2002 23:17:36 +0000 (23:17 +0000)]
Minor fixes to get setup to work again. Fix some brainos in obdclass to
provide robustness.
rread [Tue, 2 Jul 2002 20:27:25 +0000 (20:27 +0000)]
- initial version
braam [Tue, 2 Jul 2002 08:41:34 +0000 (08:41 +0000)]
- add missing unlocks.
rread [Tue, 2 Jul 2002 02:41:17 +0000 (02:41 +0000)]
- create device
- make note of need to modify /etc/modules.conf
rread [Tue, 2 Jul 2002 01:36:41 +0000 (01:36 +0000)]
- experiment: attempt to add the cvs version to formatted documentation.
Added a rule to create 'lustre.lyx' which is a copy of master.lyx,
modified to include the cvs version.
Normally, "make lustre.pdf" should produce a pdf with the proper version on
the cover page, but on my system the whole Date tag is not showing up. Not sure why.
rread [Tue, 2 Jul 2002 01:32:42 +0000 (01:32 +0000)]
- remove unneeded parameter to obd_create/destroy
behlendo [Mon, 1 Jul 2002 23:45:32 +0000 (23:45 +0000)]
-Minor fixes to ensure the DTD is valid. net-local.xml passes validation
with the provided validation tool.
>xmllint --valid --noout net-local.xml
braam [Mon, 1 Jul 2002 23:24:26 +0000 (23:24 +0000)]
- other half of the buffer fix.
braam [Mon, 1 Jul 2002 22:55:42 +0000 (22:55 +0000)]
- fix for variable size buffer handling.
pschwan [Mon, 1 Jul 2002 21:23:34 +0000 (21:23 +0000)]
- more LDLM refcount locking infrastructure
- fixed the error handling paths of unlink and rmdir, so that unlinking a
nonexistent file returns an error
- fixed the DLM bug that caused the perl script to hang (there's a new bug now
that it triggers under heavier load that causes it to crash)
- fixed a second crashing bug that occurred if a blocked lock gets completed
and then cancelled all before the server-side enqueue finishes (amazing but
true)
- osc_open() now returns the actual rc instead of 0
- fixed a race condition in ptlrpc/service.c related to thread shutdown (nice
catch, Mike)
shaver [Mon, 1 Jul 2002 19:34:20 +0000 (19:34 +0000)]
Fixed some path-slash badness in common.sh, and added embryonic runfailure-ost.
pschwan [Mon, 1 Jul 2002 15:26:52 +0000 (15:26 +0000)]
A Perl script to hammer concurrently at multiple mount points
braam [Mon, 1 Jul 2002 07:30:40 +0000 (07:30 +0000)]
- much of the striping configuration management and setup makes up
this patch
- the mds has a new lovconfig command to tell it the UUID's and
default striping pattern of the targets it needs to use (these are
the UUID's of the OSC's typically).
- To make this scalable I changed some of the memory management in the
class ioctl handling
- The LOV device has a trivial attach method and setup only tells it
what MDC to use to get its information, by giving it the MDC-UUID.
- As discussed before, the MDS really provides the persistent storage
for the LOV, what little it needs. So during the obd_connect call
for the object storage target (which is made from read_super) the
storage target learns how it is striped and then connects to all the
targets.
- We are in need of better configuration scripts for this stuff and
tomorrow we will push the XML configurations a little further.
- Updated the documentation
- Began to cleanup the /proc/lustre/ stuff -- have some neat ideas
about that and SNMP now.
pschwan [Mon, 1 Jul 2002 06:35:51 +0000 (06:35 +0000)]
- updated LDLM_DEBUG to give more refcount info
- made fixme a macro, so that it shows us where it's called from
- fixed a DLM deadlock (unbalanced l_lock)
- fixed the refcount bug in ldlm_lock_decref
- fixed the refcount bug in ldlm_cli_enqueue in the failed/aborted case
- the lock slab cleans up now!
- fixed the ``connection foo has refcount -61'' bug
- found, but have not yet fixed, a subtle ctrl-c-during-aborted-ldlm-enqueue
bug that can be triggered if you abort the hanging Perl test at _just_ the right
time.
- fixed request leaks in osc_connect and mdc_connect
- we create an import in osc_connect but never use or free it -- I didn't
remove this code, assuming it was going to be used soon?
braam [Sun, 30 Jun 2002 20:34:25 +0000 (20:34 +0000)]
- same for mdc
braam [Sun, 30 Jun 2002 20:33:59 +0000 (20:33 +0000)]
fix a blunder in osc_connect that frees the connection before
returning success.
braam [Sun, 30 Jun 2002 08:03:40 +0000 (08:03 +0000)]
Fix a few compiler warnings.
braam [Sun, 30 Jun 2002 07:39:40 +0000 (07:39 +0000)]
- except for fixing the segfault in the documentation build, this is
merely a cleanup checkin. Functions in the obdclass system were
given sensible names, old unused headers were removed, and stuff
that isn't used widely was sometimes moved from generic headers to
specific subsystems.
pschwan [Sun, 30 Jun 2002 03:22:06 +0000 (03:22 +0000)]
- Fixes memory and refcount leaks in the DLM
- Fixes a long-standing request leak
- The lock slab is not yet cleaned up correctly in some cases, but I'm on it
braam [Sun, 30 Jun 2002 03:19:02 +0000 (03:19 +0000)]
- small issues
- magic and versions into Lustre messages
- obdo cleanups, remove unused fields.
- small updates to object protocol documentation.
braam [Sat, 29 Jun 2002 13:10:22 +0000 (13:10 +0000)]
File I/O fix: move the lock name space to the filter itself.