lustre-iokit/obdfilter-survey/README.obdfilter-survey

Overview
--------

This survey script does sequential I/O with varying numbers of threads and
objects (files) by using lctl to drive the echo_client connected
to local or remote obdfilter instances, or remote obdecho instances.

It can be used to characterise the performance of the following Lustre
components.

1. The Object Storage Targets.

   Here the script directly exercises one or more instances of obdfilter.
   They may be running on 1 or more nodes, e.g. when they are all attached
   to the same multi-ported disk subsystem.

   You need to tell the script all the names of the obdfilter instances.
   These should be up and running already.  If some are on different
   nodes, you need to specify their hostnames too (e.g. node1:ost1).
   --OR--
   You just need to pass the parameter case=disk to the script. The script
   will automatically detect the local obdfilter instances.

   All the obdfilter instances are driven directly.  The script
   automatically loads the obdecho module if required and creates one
   instance of echo_client for each obdfilter instance.

2. The Network.

   Here the script drives one or more instances of the obdecho server via
   instances of echo_client running on 1 or more nodes.

   You just need to pass the parameters case=network and
   server_nid="<name/nid_of_server>" to the script. The script will do the
   required setup for the network case.

3. The Striped File System over the Network.

   Here the script drives one or more instances of obdfilter via instances
   of echo_client running on 1 or more nodes.

   You need to tell the script all the names of the OSCs, which should be
   up and running.
   --OR--
   You just need to pass the parameter case=netdisk to the script. The script
   will use all of the local OSCs.

Note that the script is _NOT_ scalable to 100s of nodes since it is only
intended to measure individual servers, not the scalability of the system
as a whole.

Running
-------

The script must be customised according to the components under test and
where it should keep its working files.  The customization variables are
described in the customization variables section of the script.

To run against a local disk:
----------------------------
- Create a Lustre configuration using your normal methods

1. Automated run:
Set up the Lustre filesystem with the required OSTs. Make sure that the
obdecho.ko module is present. Then invoke the obdfilter-survey script with
the parameter case=disk.
e.g. : $ nobjhi=2 thrhi=2 size=1024 case=disk sh obdfilter-survey

--OR--

2. Manual run:
- You do not need to specify an MDS or LOV
- List all OSTs that you wish to test
- On all OSS machines:
  Remember, write tests are destructive! This test should be run prior to
  startup of your actual Lustre filesystem. If that is the case, you will not
  need to reformat to restart Lustre. However, if the test is terminated
  before completion, you may have to remove objects from the disk.

- Determine the obdfilter instance names on all the OSS nodes, column 4
of 'lctl dl'.  For example:

# pdsh -w oss[01-02] lctl dl |grep obdfilter |sort
oss01:   0 UP obdfilter oss01-sdb oss01-sdb_UUID 3
oss01:   2 UP obdfilter oss01-sdd oss01-sdd_UUID 3
oss02:   0 UP obdfilter oss02-sdi oss02-sdi_UUID 3
...

Here the obdfilter instance names are oss01-sdb, oss01-sdd, oss02-sdi.

Since you are driving obdfilter instances directly, set the shell
variable 'OSTS' to the names of the obdfilter instances, each prefixed
with its hostname, and leave 'ECHO_CLIENTS' undefined.
Example:

OSTS='oss01:oss01-sdb oss01:oss01-sdd oss02:oss02-sdi' \
   ./obdfilter-survey

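The host:instance list can also be derived from the 'lctl dl' output
instead of being typed by hand. The following is a minimal sketch and is
not part of obdfilter-survey: build_osts is a hypothetical helper name, and
it assumes pdsh-prefixed lines in exactly the format shown in the example
above (hostname and colon first, instance name in the following columns).

```shell
# build_osts: read pdsh-prefixed 'lctl dl' lines on stdin and print a
# space-separated "host:instance" list on stdout.
# Hypothetical helper; assumes lines such as
#   oss01:   0 UP obdfilter oss01-sdb oss01-sdb_UUID 3
build_osts() {
    awk '$4 == "obdfilter" {
             host = $1
             sub(/:$/, "", host)      # strip the trailing colon added by pdsh
             printf "%s%s:%s", sep, host, $5
             sep = " "
         }
         END { if (sep != "") printf "\n" }'
}
```

Under those assumptions it could be used like this:

OSTS=$(pdsh -w oss[01-02] lctl dl | sort | build_osts) ./obdfilter-survey
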

To run against a network:
-------------------------
For the second case, i.e. obdfilter-survey over the network, the following
setup is required:
- Install all Lustre modules, including obdecho.
- Start lctl and check the device list. The device list must be empty.
- Passwordless login between the client and server machines is suggested,
  to avoid having to type passwords.
1. Automated run:
   To run obdfilter-survey against the network you just need to pass the
   parameters case=network and server_nid="<name/nid_of_server>" to the script.

e.g. $ nobjhi=2 thrhi=2 size=1024 server_nid="<name/nid_of_server>" \
       case=network sh obdfilter-survey

On the server side you can see the stats at:
        /proc/fs/lustre/obdecho/<ost_testfs>/stats
where 'ost_testfs' is the obdecho server created by the script.

NOTE: In the network test only the automated run is supported.

To run against network-disk:
----------------------------
- Create a Lustre configuration using your normal methods

1. Automated run:
Set up the Lustre filesystem with the required OSTs. Make sure that the
obdecho.ko module is present. Then invoke the obdfilter-survey script with
the parameter case=netdisk.
e.g. : $ nobjhi=2 thrhi=2 size=1024 case=netdisk sh obdfilter-survey

2. Manual run:
While running manually you need to tell the script all the names of the
echo_client instances, which should already be up and running.
e.g. $ nobjhi=2 thrhi=2 size=1024 ECHO_CLIENTS="ECHO_<osc_name> ..." \
       sh obdfilter-survey

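The three automated modes described above differ only in the value of
'case' and in whether 'server_nid' is needed. A wrapper that launches the
survey might check the combination first. The following is a minimal
sketch: check_case is a hypothetical helper, not something the script
provides, and it encodes only the rules stated in this README.

```shell
# check_case CASE [SERVER_NID]
# Hypothetical pre-flight check for an automated run. Returns 0 when the
# combination matches this README, otherwise prints a message and returns 1.
check_case() {
    case "${1:-}" in
        disk|netdisk)
            return 0 ;;
        network)
            # The network case needs the name or NID of the obdecho server.
            if [ -z "${2:-}" ]; then
                echo "case=network requires server_nid" >&2
                return 1
            fi
            return 0 ;;
        *)
            echo "unknown case '${1:-}' (expected disk, network or netdisk)" >&2
            return 1 ;;
    esac
}
```

For example, 'check_case network "$server_nid" || exit 1' placed before the
obdfilter-survey invocation would stop a network run that has no server set.
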

Output files:
-------------

When the script runs, it creates a number of working files and a pair of
result files.  All files start with the prefix given by ${rslt}.

${rslt}.summary           same as stdout
${rslt}.script_*          per-host test script files
${rslt}.detail_tmp*       per-ost result files
${rslt}.detail            collected result files for post-mortem

The script iterates over the given numbers of threads and objects
performing all the specified tests and checking that all test processes
completed successfully.

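To get a feel for how many test points such an iteration produces, the
combinations can be listed ahead of a run. The following is only a sketch
under stated assumptions: list_points is a hypothetical helper, the lower
bounds are assumed parameters (this README only shows the upper bounds
nobjhi and thrhi), and the doubling step is an assumption. The script
itself may step differently or skip some combinations.

```shell
# list_points NOBJLO NOBJHI THRLO THRHI
# Print one "objects threads" pair per line, doubling each count from its
# low bound up to its high bound. The doubling step is an assumption.
list_points() {
    # Low bounds must be at least 1, otherwise doubling never advances.
    [ "$1" -ge 1 ] && [ "$3" -ge 1 ] || return 1
    nobj=$1
    while [ "$nobj" -le "$2" ]; do
        thr=$3
        while [ "$thr" -le "$4" ]; do
            echo "$nobj $thr"
            thr=$((thr * 2))
        done
        nobj=$((nobj * 2))
    done
}
```

With these assumptions, 'list_points 1 2 1 2' lists four pairs, and
'list_points 1 2 1 2 | wc -l' gives the number of summary lines to expect
per test.
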
Note that the script may not clean up properly if it is aborted or if it
encounters an unrecoverable error.  In this case, manual cleanup may be
required, possibly including killing any running instances of 'lctl' (local
or remote), removing echo_client instances created by the script and
unloading obdecho.


Script output
-------------

The summary file and stdout contain lines like...

ost 8 sz 67108864K rsz 1024 obj    8 thr    8 write  613.54 [ 64.00, 82.00]

ost 8           is the total number of OSTs under test.
sz 67108864K    is the total amount of data read or written (in KB).
rsz 1024        is the record size (size of each echo_client I/O, in KB).
obj    8        is the total number of objects over all OSTs.
thr    8        is the total number of threads over all OSTs and objects.
write           is the test name.  If more tests have been specified they
                all appear on the same line.
613.54          is the aggregate bandwidth over all OSTs, measured by
                dividing the total number of MB by the elapsed time.
[ 64.00, 82.00] are the minimum and maximum instantaneous bandwidths seen on
                any individual OST.

Note that although the numbers of threads and objects are specified per-OST
in the customization section of the script, results are reported aggregated
over all OSTs.

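The field layout described above can be turned into CSV with standard
tools. The following is a minimal sketch, not the included plot-obdfilter
script: summary_to_csv is a hypothetical helper, and it assumes that every
test name on a line is followed by exactly "bandwidth [min, max]". Lines
where a test reports anything else are not handled.

```shell
# summary_to_csv: read summary lines on stdin, print one CSV row per test.
# Columns: ost,sz,rsz,obj,thr,test,bandwidth,min,max
# Hypothetical helper; assumes each test is followed by "bw [min, max]".
summary_to_csv() {
    # Replace the brackets and comma around min/max with spaces so that
    # the values become plain whitespace-separated fields.
    sed 's/[][,]/ /g' |
    awk '$1 == "ost" && $3 == "sz" {
             # Fields 11 onwards come in groups of four: name bw min max.
             for (i = 11; i + 3 <= NF; i += 4)
                 printf "%s,%s,%s,%s,%s,%s,%s,%s,%s\n",
                        $2, $4, $6, $8, $10, $i, $(i+1), $(i+2), $(i+3)
         }'
}
```

A possible invocation is 'summary_to_csv < ${rslt}.summary > results.csv',
where results.csv is an arbitrary output name.
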

Visualising Results
-------------------

I've found it most useful to import the summary data (it's fixed width)
into gnuplot, Excel (or any graphing package) and graph bandwidth vs.
number of threads for varying numbers of concurrent regions.  This shows how
the OSS performs for a given number of concurrently accessed objects
(i.e. files) with varying numbers of I/Os in flight.

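One way to prepare the summary data for such a graph is to pivot it into a
table with one row per thread count and one column per object count. The
following is a minimal sketch under the same format assumptions as the
sample summary line in "Script output"; bw_by_threads is a hypothetical
helper and is not the included plot-obdfilter script.

```shell
# bw_by_threads TEST: read summary lines on stdin and print the aggregate
# bandwidth of TEST as a table: one row per thread count, one column per
# object count. Combinations that were not measured are printed as "-".
# Hypothetical helper; assumes each test is followed by "bw [min, max]".
bw_by_threads() {
    sed 's/[][,]/ /g' |
    awk -v name="$1" '
        $1 == "ost" {
            obj = $8; thr = $10
            # Remember object and thread counts in order of first appearance.
            if (!(obj in seen_obj)) { seen_obj[obj] = 1; objs[++no] = obj }
            if (!(thr in seen_thr)) { seen_thr[thr] = 1; thrs[++nt] = thr }
            for (i = 11; i + 3 <= NF; i += 4)
                if ($i == name) bw[thr, obj] = $(i+1)
        }
        END {
            printf "# thr"
            for (o = 1; o <= no; o++) printf " obj=%s", objs[o]
            printf "\n"
            for (t = 1; t <= nt; t++) {
                printf "%s", thrs[t]
                for (o = 1; o <= no; o++) {
                    key = thrs[t] SUBSEP objs[o]
                    printf " %s", ((key in bw) ? bw[key] : "-")
                }
                printf "\n"
            }
        }'
}
```

For example, 'bw_by_threads write < ${rslt}.summary > write.dat' produces a
file (write.dat is an arbitrary name) whose first column is the thread
count and whose remaining columns can each be plotted as one curve.
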
It is also extremely useful to record average disk I/O sizes during each
test.  These numbers help find pathologies in the file system block
allocator and the block device elevator.

The included plot-obdfilter script is an example of processing the output
files to a .csv format and plotting a graph using gnuplot.