Wednesday, December 15, 2010

A tool for analyzing Storage Performance - SWAT from Oracle

Storage is probably the most commonly found bottleneck in  majority of the databases.

Most Database Administrators are not Storage Engineers and vice-versa.  For a Database Administrator, translating a Database requirement/problem into Storage terminology is not easy - even with all the stats available from the Oracle Database.  

Generally, there are 2 common storage issues faced - performance problems and capacity planning related.

As regards performance, rather than saying  "I am seeing a lot of storage related waits , so Storage is slow, can you please fix it" -  it is preferable to talk to your Storage Admin in a language he can understand.

Remember, Storage Engineers look at storage performance from a LUN perspective and not from an application stats perspective. They would be least concerned about log file parallel write wait times being 20ms. On the other hand, if you tell them that LUN 10 and 11 (being used for redo) are write intensive with a lot of small block IOPS and you am seeing high response times, it makes perfect sense to them.

From a capacity planning and performance trending perspective, how are we to actually track changes in storage performance over a period of time? Capacity planning is not just about free space, it is an overall look at the health of your storage subsystem with respect to increasing application demands.

Since it is the Database Administrator making the request, it would probably make sense for the DBA to learn Storage terminology. Knowledge is power and knowing your requirements well would make all the difference and possibly earn the respect of your Storage Engineer.

Most Storage Engineer generally look for the below with regards to addressing performance issues and capacity planning (today and future)

  1. Size of the working Data Set
  2. Number of IOPS being generated
  3. Size of an IOP
  4. % Reads versus % Writes
  5. Expected Response Time versus actuals
  6. Data Rate or Bandwidth consumed today
  7. And so on.

We can get into a lot more details into each of the above line items. To keep it simple and begin a conversation , this would suffice.

How are we to obtain this information from an Oracle Database?

Yes, it is possible to get this information from the various v$views and stats, however what we need is a simple tool which can monitor the system and collect/collate/trend all this information in a meaningful manner.

There are commercial tools available, however we need something which is lightweight, flexible and potentially available at no added cost.

SWAT is a tool from Oracle and for the features and capabilities it offers, it is very light weight and efficient. 
SWAT stands for Sun StorageTek Workload Analysis Tool. There is no equivalent to SWAT that I have seen.  Incredibly enough, it is not very well known.

It takes a few minutes to set it up and can run as a non-privileged user. As long as java is installed, you are good to go. Supported on Solaris, Linux and Windows systems.

It can be setup to run continuously in the background and so is invaluable for capacity planning requirements.

To get a copy of SWAT, I would suggest that you approach your Oracle Support contact. The author of SWAT is Henk Vandenbergh. He is also the author for vdbench which is an excellent tool for testing IO Subsystems.

Some screen shots from SWAT.

1 comment:

MarkW said...

Really! SWAT is very good tool for analyzing storage performance. I like this article and way of the information presented.