It's SAMPLE time with vscsiStats
Posted by: hbr in vscsiStats, script, python, esx on Feb 9, 2011
vscsiStats is a new VMware tool that gives insight into disk I/Os, latency and the like. It’s a very useful tool, but it dumps its data as histograms. Using these in Excel is quite a pain because Excel can handle histograms (with add-ins) but only one at a time. If you want to graph a whole LUN, or some other lower-level part of the storage, over time, you need to reformat the data. Luckily ESX, like most Linux versions nowadays, comes with a Python interpreter. Now, I know a lot of different scripting languages, but Python wasn’t one of them. So I decided to write a small Python script to do the reformatting for me, which lets me draw nice 3D graphs of block size over time.
Python is kind of like the dotNet for Unix. It’s pretty straightforward, very powerful and can connect to almost anything. It also runs on Windows, where it can talk to COM and DCOM objects and even extend them. It’s a very versatile language and pretty easy to master. Once I had figured out how to run external programs and capture their output, what the differences between dictionaries, lists and arrays are, and how to handle command-line arguments, this is the script that I came up with:
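#!/usr/bin/python
from subprocess import Popen, PIPE
from datetime import datetime
from threading import Timer
import sys
import time, re, shlex

max = delay = 0

# Run an external command and return everything it wrote to stdout as text.
def call(cmd):
    return Popen(shlex.split(cmd), stdout=PIPE, universal_newlines=True).communicate()[0]

# Take one sample: print the ioLength histograms as CSV rows, then reset the counters.
def callme():
    # Map each worldGroupID to the VM display name.
    inp = call("/usr/lib/vmware/bin/vscsiStats -l").split('\n')
    names = {}
    for ln in inp:
        if len(ln) < 2: continue
        if ln[0] != ' ':
            val = re.split(r'[:,]', ln)
            names[int(val[1])] = val[3][1:-2]

    inp = call("/usr/lib/vmware/bin/vscsiStats -p ioLength -c").split('\n')
    found = False   # True while we are inside the bucket rows of a histogram
    wid = 0         # worldGroupID of the current histogram
    did = 0         # disk handleID of the current histogram
    x = {}          # bucket limit -> count for the current histogram
    act = ''        # 'R' or 'W'; empty for the combined histogram, which is skipped
    stamp = datetime.now()

    for row in inp:
        if len(row) < 1: break
        if found and act != '':
            if row[0].isdigit():
                # Bucket row: "<count>,<bucket limit>"
                p = int(row.split(',')[1])
                if x.get(p-1):
                    # Fold bucket p-1 (e.g. 4095) into bucket p (e.g. 4096).
                    row = str(int(row.split(',')[0]) + int(x[p-1]))
                    del x[p-1]
                x[p] = row.split(',')[0]
            else:
                # The histogram has ended: write out one CSV row per bucket.
                for i in sorted(x.keys()):
                    print(str(stamp)[11:22] + ',' + str(delay) + ',' + names[int(wid)] + ',' + did + ',' + act + ',' + str(i) + "," + str(x[i]))
                x.clear()
        if row[0] == 'F':
            # "Frequency,..." header: the bucket rows follow.
            found = True
        if row[0] == 'H':
            # "Histogram: IO lengths of Read/Write commands,..." header.
            found = False
            if row[25] == 'R' or row[25] == 'W':
                act = row[25]
                wid = row.split(',')[2]
                did = row.split(',')[4]
            else:
                act = ''
    call("/usr/lib/vmware/bin/vscsiStats -r")

if len(sys.argv) < 3:
    sys.exit(sys.argv[0] + " <delay> <samples>\n")
try:
    max = int(sys.argv[2])
    delay = int(sys.argv[1])
except (IndexError, ValueError):
    sys.exit("error in arguments")

# Start collecting, reset the counters, sample in a loop and stop collecting at the end.
call("/usr/lib/vmware/bin/vscsiStats -s")
call("/usr/lib/vmware/bin/vscsiStats -r")
sys.stderr.write('taking ' + str(max) + ' samples with ' + str(delay) + 's intervals:\n')
print("time,sampletime,VM,diskID,operation,blocksize,amount")
count = 1
while count <= max:
    time.sleep(delay)
    callme()
    sys.stderr.write('sample ' + str(count) + ' of ' + str(max) +
        ' done. eta: ' + str(((max-count)*delay)//60) + 'm' + str(((max-count)*delay)%60) + 's...\n')
    count += 1
call("/usr/lib/vmware/bin/vscsiStats -x")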
And that’s it. Save it as ‘export.py’ and run it with ‘./export.py 90 20 > output.csv’ to take 20 samples at 90-second intervals of the disk block sizes of all running VMs on this host, redirecting the output to a CSV file. For statistically correct data you would normally need between 300 and 500 samples, but since vscsiStats doesn’t really sample and simply counts all I/Os since the last reset, 20 samples will suffice.
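To make the reformatting step easier to follow (and to try out away from an ESX host), here is a minimal sketch of the same parsing idea as a standalone function. The function name `parse_io_length` and the sample lines are made up for illustration; the header layout is only what the script's own indexes imply (the operation letter at position 25, the worldGroupID in field 2, the disk handleID in field 4), so treat it as an assumption rather than the documented vscsiStats format. Unlike the script, this sketch also writes out whatever histogram is still pending when the input ends.

```python
def parse_io_length(lines):
    """Turn CSV histogram lines into (world_id, disk_id, op, bucket, count) tuples."""
    rows = []
    buckets = {}              # bucket limit -> count for the current histogram
    op = world_id = disk_id = ''
    in_buckets = False        # True once the "Frequency,..." header has been seen

    def flush():
        # Emit the finished histogram, smallest bucket first.
        for limit in sorted(buckets):
            rows.append((world_id, disk_id, op, limit, buckets[limit]))
        buckets.clear()

    for line in lines:
        if not line:
            continue
        if line.startswith('Histogram'):
            flush()
            fields = line.split(',')
            kind = line[25:26]
            # Only Read and Write histograms are kept; the combined one is skipped.
            op = kind if kind in ('R', 'W') else ''
            world_id, disk_id = fields[2], fields[4]
            in_buckets = False
        elif line.startswith('Frequency'):
            in_buckets = True
        elif in_buckets and op and line[0].isdigit():
            count, limit = (int(v) for v in line.split(','))
            # Fold bucket N-1 (e.g. 4095) into bucket N (e.g. 4096), as the script does.
            count += buckets.pop(limit - 1, 0)
            buckets[limit] = count
    flush()
    return rows


# Hypothetical sample input, shaped the way the script expects it.
sample = [
    "Histogram: IO lengths of commands,virtual machine worldGroupID,1234,virtual disk handleID,8192",
    "Frequency,Histogram Bucket Limit",
    "5,512", "3,4095", "7,4096",
    "Histogram: IO lengths of Read commands,virtual machine worldGroupID,1234,virtual disk handleID,8192",
    "Frequency,Histogram Bucket Limit",
    "2,512", "1,4095", "4,4096",
    "Histogram: IO lengths of Write commands,virtual machine worldGroupID,1234,virtual disk handleID,8192",
    "Frequency,Histogram Bucket Limit",
    "3,512", "2,4095", "3,4096",
]

for row in parse_io_length(sample):
    print(row)
```

Keeping the parsing in a function that takes a list of lines means it can be fed captured output from a file instead of a live vscsiStats call.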
Now the Excel bit.
Use smbclient on ESX or WinSCP to move the CSV file to a Windows machine with Excel on it. Open it, select column ‘A’, go to Data and hit ‘Text to Columns’; the delimiter is a comma. Next, insert a PivotChart using columns A-G and select ‘time’, ‘blocksize’ and ‘amount’. Move blocksize to ‘Legend Fields’ and amount to ‘Values’. Change ‘Count of amount’ to ‘Sum of amount’ using ‘Value Field Settings’. Then select the graph and change the chart type to ‘Surface’. If, like mine, the 4k blocks are very high, you could rotate the X-axis a bit and you’ll end up with something like this:
And with this graph and the PivotChart we can start drawing all sorts of wild conclusions.
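If you would rather check the numbers before opening Excel, the same ‘Sum of amount’ pivot (time down the side, blocksize across the top) can be sketched in a few lines of Python. This is only an illustration: the function name `pivot_blocksize_over_time` and the sample rows are made up, and the only thing taken from the script is the CSV header it prints.

```python
import csv
import io
from collections import defaultdict


def pivot_blocksize_over_time(csv_file):
    """Sum 'amount' per (time, blocksize), like the PivotChart's 'Sum of amount'."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[(row['time'], int(row['blocksize']))] += int(row['amount'])
    times = sorted({t for t, _ in totals})
    sizes = sorted({s for _, s in totals})
    # One table row per timestamp, one column per block size; missing cells are 0.
    table = [[totals.get((t, s), 0) for s in sizes] for t in times]
    return times, sizes, table


# Hypothetical sample data using the header written by export.py.
sample_csv = io.StringIO(
    "time,sampletime,VM,diskID,operation,blocksize,amount\n"
    "10:00:00.00,90,vm1,8192,R,512,2\n"
    "10:00:00.00,90,vm1,8192,R,4096,5\n"
    "10:00:00.00,90,vm1,8192,W,4096,5\n"
    "10:01:30.00,90,vm1,8192,R,4096,7\n"
)

times, sizes, table = pivot_blocksize_over_time(sample_csv)
print("time," + ",".join(str(s) for s in sizes))
for t, counts in zip(times, table):
    print(t + "," + ",".join(str(c) for c in counts))
```

For a real run, pass `open('output.csv')` instead of the in-memory sample. Reads and writes are summed together here, just as they are in the PivotChart unless you add ‘operation’ as a filter.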
But one thing is clear: vscsiStats and Python are both pretty cool.
