   You need to tell the script the names of all the obdfilter instances.
   These should be up and running already. If some are on different
   nodes, you need to specify their hostnames too (e.g. node1:ost1).
   --OR--
   You just need to pass the parameter case=disk to the script. The script
   will automatically detect the local obdfilter instances.

   All the obdfilter instances are driven directly.

[...]

   Here the script drives one or more instances of the obdecho server via
   instances of echo_client running on 1 or more nodes.

   You just need to pass the parameters case=network and
   targets="<hostname or IP of the server>" to the script. The script will
   do the required setup for the network case.

3. The Stripe F/S over the Network.

   Here the script drives one or more instances of obdfilter via instances
   of echo_client running on 1 or more nodes.

[...]

Note that the script is _NOT_ scalable to 100s of nodes since it is only
intended to measure individual servers, not the scalability of the system
as a whole.

Running
-------
The script must be customised according to the components under test and
where it should keep its working files. The customization variables are
described in the "Customization variables" section of the script. Please
see the maximum supported value ranges for the customization variables in
the script.

To run against a local disk:
----------------------------

[...]

e.g. : $ nobjhi=2 thrhi=2 size=1024 case=disk sh obdfilter-survey
--OR--
2. Manual run:
- You do not need to specify an MDS or LOV.
- List all OSTs that you wish to test.
- On all OSS machines: Remember, write tests are destructive! This test
  should be run prior to startup of your actual Lustre filesystem. If that
  is the case, you will not need to reformat to restart Lustre - however,
  if the test is terminated before completion, you may have to remove
  objects from the disk.
- Determine the obdfilter instance names on all the clients, column 4 of
  'lctl dl'. For example:

  oss01: 0 UP obdfilter oss01-sdb oss01-sdb_UUID 3
  oss01: 1 UP obdfilter oss01-sdd oss01-sdd_UUID 3
  oss02: 0 UP obdfilter oss02-sdi oss02-sdi_UUID 3

Here the obdfilter instance names are oss01-sdb, oss01-sdd and oss02-sdi.

Since you are driving obdfilter instances directly, set the shell array
variable 'targets' to the names of the obdfilter instances.

Example:

  targets="oss01:oss01-sdb oss01:oss01-sdd oss02:oss02-sdi"

[...]

For the second case, i.e. obdfilter-survey over the network, the following
setup is to be done:
- Install all Lustre modules, including obdecho.
- Start lctl and check the device list; the device list must be empty.
- It is suggested that you set up passwordless entry (e.g. ssh keys) between
  the client and the server machines, to avoid having to type passwords.
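For example, this setup might be sanity-checked with something like the
sketch below. This is only a sketch: the hostname 'oss01' is a placeholder
for your own server node, not something taken from the script.

  # make sure the obdecho module is available (its dependencies load too)
  $ modprobe obdecho
  # the device list printed by lctl must be empty on client and server
  $ lctl dl
  # ssh from the client to the server should not prompt for a password
  $ ssh oss01 hostname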
1. Automated run:
   To run obdfilter-survey over the network you just need to pass the
   parameters case=network and targets="<hostname or IP of the server>"
   to the script.

   e.g. $ nobjhi=2 thrhi=2 size=1024 targets="<hostname or IP of the server>" \
          case=network sh obdfilter-survey

On the server side you can see the stats with the following command:
   lctl get_param obdecho.*.stats

NOTE: In the network test, only the automated run is supported.

[...]

e.g. : $ nobjhi=2 thrhi=2 size=1024 case=netdisk sh obdfilter-survey

While running manually you need to tell the script the names of all the
echo_client instances, which should already be up and running.
e.g. $ nobjhi=2 thrhi=2 size=1024 targets="<echo_client instance name> ..." \
       sh obdfilter-survey

Output files:

[...]

The summary file and stdout contain lines like...

ost 8 sz 67108864K rsz 1024K obj 8 thr 8 write 613.54 [ 64.00, 82.00]

ost 8 is the total number of OSTs under test.
sz 67108864K is the total amount of data read or written (the trailing
   suffix gives the unit, here KB).
rsz 1024K is the record size, i.e. the size of each echo_client I/O
   (again with the unit suffix).
obj 8 is the total number of objects over all OSTs.
thr 8 is the total number of threads over all OSTs and objects.
write is the test name. If more tests have been specified they
   all appear on the same line.
613.54 is the aggregate bandwidth over all OSTs, measured by dividing
   the total number of MB by the elapsed time.
[64.00, 82.00] are the minimum and maximum instantaneous bandwidths seen on
   any individual OST.

Note that although the numbers of threads and objects are specified per-OST
in the customization section of the script, results are reported aggregated
over all OSTs.

Visualising Results
-------------------

I've found it most useful to import the summary data (it's fixed width)
into gnuplot, Excel (or any graphing package) and graph bandwidth vs. the
number of threads for varying numbers of concurrent regions. This shows how
the OSS performs for a given number of concurrently accessed objects
(i.e. files) with varying numbers of I/Os in flight.

It is also extremely useful to record average disk I/O sizes during each
test. These numbers help find pathologies in the file system block
allocator and the block device elevator.

The included iokit-plot-obdfilter script is an example of processing the
output files to a .csv format and plotting a graph using gnuplot.
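For illustration, a minimal hand-rolled sketch in the same spirit (this is
not the bundled iokit-plot-obdfilter script; 'results.summary' and
'write.csv' are only placeholder names for your own summary file and the
generated .csv) might look like:

  # pull the thread count and the aggregate write bandwidth out of the
  # summary file and write them as comma-separated values
  $ awk '/ write / { for (i = 1; i < NF; i++) {
                       if ($i == "thr")   thr = $(i+1)
                       if ($i == "write") bw  = $(i+1)
                     }
                     print thr "," bw }' results.summary > write.csv

  # plot aggregate write bandwidth against the number of threads
  $ gnuplot -p -e 'set datafile separator ","; plot "write.csv" using 1:2 with linespoints title "write MB/s"'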