X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=blobdiff_plain;f=lustre-iokit%2Fobdfilter-survey%2FREADME;h=cea897af5f9f1185fc825d1a9d8f8dea6f52cf2e;hp=304672356db06144b1e6356b6e657426573a55f7;hb=024aea8216330cb77e89d43dcbb758614e58ac37;hpb=d03c475e63cac8beeb20613d2e42881ae417eeb2

diff --git a/lustre-iokit/obdfilter-survey/README b/lustre-iokit/obdfilter-survey/README
index 3046723..cea897a 100644
--- a/lustre-iokit/obdfilter-survey/README
+++ b/lustre-iokit/obdfilter-survey/README
@@ -1,45 +1,77 @@
-Requirements
-------------
+Overview
+--------
-. lustre OSS up and running
+This survey script does sequential I/O with varying numbers of threads and
+objects (files) by using lctl::test_brw to drive the echo_client connected
+to local or remote obdfilter instances, or remote obdecho instances.
+It can be used to characterise the performance of the following lustre
+components.
-Overview
---------
+1. The Stripe F/S.
+
+ Here the script directly exercises one or more instances of obdfilter.
+ They may be running on 1 or more nodes, e.g. when they are all attached
+ to the same multi-ported disk subsystem.
+
+ You need to tell the script all the names of the obdfilter instances.
+ These should be up and running already. If some are on different
+ nodes, you need to specify their hostnames too (e.g. node1:ost1).
-This survey may be used to characterise the performance of a lustre OSS.
-It can exercise the OSS either locally or remotely via the network.
+ All the obdfilter instances are driven directly. The script
+ automatically loads the obdecho module if required and creates one
+ instance of echo_client for each obdfilter instance.
-The script uses lctl::test_brw to drive the echo_client doing sequential
-I/O with varying numbers of threads and objects. One instance of lctl is
-spawned for each OST.
+2. The Network.
+ Here the script drives one or more instances of obdecho via instances of
+ echo_client running on 1 or more nodes.
+ You need to tell the script all the names of the echo_client instances.
+ These should already be up and running. If some are on different nodes,
+ you need to specify their hostnames too (e.g. node1:ECHO_node1).
+
+3. The Stripe F/S over the Network.
+
+ Here the script drives one or more instances of obdfilter via instances
+ of echo_client running on 1 or more nodes.
+
+ As with (2), you need to tell the script all the names of the
+ echo_client instances, which should already be up and running.
+
+Note that the script is _NOT_ scalable to 100s of nodes since it is only
+intended to measure individual servers, not the scalability of the system
+as a whole.
+
+
 Running
 -------
-The script must be customised according to the particular device under test
-and where it should keep its working files. Customisation variables are
+The script must be customised according to the components under test and
+where it should keep its working files. Customisation variables are
 described clearly at the start of the script.
-When the script runs, it creates a number of working files and a pair of
-result files. All files start with the prefix given by ${rslt}.
-
-${rslt}_.summary      same as stdout
-${rslt}_.detail_tmp*  tmp files
-${rslt}_.detail       collected tmp files for post-mortem
+If you are driving obdfilter instances directly, set the shell array
+variable 'ost_names' to the names of the obdfilter instances and leave
+'client_names' undefined.
-The script iterates over the given numbers of threads and objects
-performing all the specified tests and checking that all test processes
-completed successfully.
+If you are driving obdfilter or obdecho instances over the network, you
+must instantiate the echo_clients yourself using lmc/lconf. Set the shell
+array variable 'client_names' to the names of the echo_client instances and
+leave 'ost_names' undefined.
+You can optionally prefix any name in 'ost_names' or 'client_names' with
+the hostname that it is running on (e.g. remote_node:ost4) if your
+obdfilters or echo_clients are running on more than one node. In this
+case, you need to ensure...
-Local OSS
----------
+(a) 'custom_remote_shell()' works on your cluster
+(b) all pathnames you specify in the script are mounted on the node you
+ start the survey from and all the remote nodes.
-To test a local OSS, setup 'ost_names' with the names of each OST. If you
-are unsure, do 'lctl device_list' and looks for obdfilter instanced e.g...
+Use 'lctl device_list' to verify the obdfilter/echo_client instance names
+e.g...
 [root@ns9 root]# lctl device_list
 0 UP confobd conf_ost3 OSD_ost3_ns9_UUID 1
@@ -48,19 +80,27 @@ are unsure, do 'lctl device_list' and looks for obdfilter instanced e.g...
 3 AT confobd conf_ost12 OSD_ost12_ns9_UUID 1
 [root@ns9 root]#
-Here device number 1 is an obdfilter instance called 'ost3'.
+...here device 1 is an instance of obdfilter called 'ost3'. To exercise it
+directly, add 'ns9:ost3' to 'ost_names'. If the script is only to be run
+on node 'ns9' you could simply add 'ost3' to 'ost_names'.
-The script configures an instance of echo_client for each name in ost_names
-and tears it down on normal completion. Note that it does NOT clean up
-properly (i.e. manual cleanup is required) if it is not allowed to run to
-completion.
+When the script runs, it creates a number of working files and a pair of
+result files. All files start with the prefix given by ${rslt}.
+${rslt}.summary      same as stdout
+${rslt}.script_*     per-host test script files
+${rslt}.detail_tmp*  per-ost result files
+${rslt}.detail       collected result files for post-mortem
-Remote OSS
-----------
+The script iterates over the given numbers of threads and objects
+performing all the specified tests and checking that all test processes
+completed successfully.
-To test OSS performance over the network, you need to create a lustre
-configuration that creates echo_client instances for each OST.
+Note that the script does NOT clean up properly if it is aborted or if it
+encounters an unrecoverable error. In this case, manual cleanup may be
+required, possibly including killing any running instances of 'lctl' (local
+or remote), removing echo_client instances created by the script and
+unloading obdecho.
 Script output
@@ -92,7 +132,10 @@ Visualising Results
 I've found it most useful to import the summary data (it's fixed width)
 into Excel (or any graphing package) and graph bandwidth v. # threads for
-varying numbers of concurrent regions. This shows how the device performs
-with varying queue depth. If the series (varying numbers of concurrent
-regions) all seem to land on top of each other, it shows the device is
-phased by seeks at the given record size.
+varying numbers of concurrent regions. This shows how the OSS performs for
+a given number of concurrently accessed objects (i.e. files) with varying
+numbers of I/Os in flight.
+
+It is also extremely useful to record average disk I/O sizes during each
+test. These numbers help find pathologies in the file system block
+allocator and the block device elevator.
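
The new 'Running' text asks for either 'ost_names' or 'client_names' to be set
as a shell array, with an optional "host:" prefix on each entry. The sketch
below shows roughly what such settings could look like. It is an illustration
only: the instance names are the ones used as examples in the README, and
'split_name' is a hypothetical helper written for this note, not a function of
the survey script.

```shell
#!/bin/bash
# Illustrative customisation sketch; names are examples, not defaults.

# Driving obdfilter instances directly: list them in ost_names and
# leave client_names undefined. A "host:" prefix names the node the
# instance runs on.
ost_names=(ns9:ost3 ns9:ost12)

# Driving echo_client instances over the network instead: set
# client_names and leave ost_names undefined.
# client_names=(node1:ECHO_node1)

# split_name is a hypothetical helper (NOT part of the survey script).
# It prints "<host> <instance>" for a name with an optional host prefix,
# falling back to the local node name when no prefix is given.
split_name() {
    local name=$1
    if [[ $name == *:* ]]; then
        echo "${name%%:*} ${name#*:}"
    else
        echo "$(uname -n) $name"
    fi
}

for n in "${ost_names[@]}"; do
    split_name "$n"
done
```

With the values above the loop prints "ns9 ost3" and "ns9 ost12". An
unprefixed name such as 'ost3' is attributed to the node the sketch runs on,
mirroring the README's remark that the prefix can be dropped when the script
only runs on that node.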