Friday, January 2, 2009

Measuring IO performance of striped vs spanned LVM logical volumes with Amazon EBS and EC2

EBS and LVM
I've been dabbling with Amazon's EC2 for a few months now, ever since their EBS (Elastic Block Store) service became available. It lets you attach any number of virtual devices of up to 1 terabyte each to your EC2 instance and persist the data. When you stripe across multiple EBS volumes you get better IO performance than the local EC2 file system, which I found to be quite cool. In random read and write situations (databases) the difference becomes more pronounced.

(Read up on LVM terminology here, because the following is ultra confusing.)

Setting up a striped logical volume
It is best to diagram things before you start and make some good estimates of how much space you need. Here I have 4 EBS devices of 200G each (I attached them through Elasticfox) grouped together into a volume group, with a 420G logical volume mounted at /mnt/data.

"PV" stands for "Physical Volume"

+--[ Volume Group ]---+
|  +------[PV]-----+  |
|  |EBS|EBS|EBS|EBS|  |
|  +---+---+---+---+  |
|    |   |   |   |    |
|    |   |   |   |    |
|  +-+---+---+---+-+  |
|  |    Logical    |  |
|  |    Volume     |  |
|  |               |  |
|  |   /mnt/data   |  |
|  +---------------+  |
+---------------------+
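
The space planning for the layout above can be sketched as a bit of shell arithmetic. This is only a rough estimate: the function names are made up for illustration, sizes are whole gigabytes, and LVM metadata overhead and extent rounding are ignored.

```shell
#!/bin/sh
# Hypothetical helper: GB taken from each device by a striped logical volume.
# A striped LV is spread evenly over the devices it stripes across.
# Integer division only; real LVM allocates in whole extents.
per_device_share() {
    lv_size_gb=$1
    stripes=$2
    echo $((lv_size_gb / stripes))
}

# Hypothetical helper: GB left unallocated in the volume group after the LV exists.
vg_free_after() {
    devices=$1
    device_gb=$2
    lv_size_gb=$3
    echo $((devices * device_gb - lv_size_gb))
}

per_device_share 420 4     # prints 105
vg_free_after 4 200 420    # prints 380
```

So with the numbers from this post, each 200G device carries roughly 105G of the logical volume and about 380G of the volume group stays free for later.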

So how do you actually get to the point where you are happily processing data at /mnt/data?
(Note: If you are just testing, you probably don't want to create as many EBS volumes as I did, or make them as large. You could easily test striping with only 2 devices and a much smaller storage size.)

Create your physical volumes:

# pvcreate /dev/sdh
pvcreate -- physical volume "/dev/sdh" successfully created
# pvcreate /dev/sdi
pvcreate -- physical volume "/dev/sdi" successfully created
# pvcreate /dev/sdj
pvcreate -- physical volume "/dev/sdj" successfully created
# pvcreate /dev/sdk
pvcreate -- physical volume "/dev/sdk" successfully created


Create your volume group:

# vgcreate vg1 /dev/sdh /dev/sdi /dev/sdj /dev/sdk

Create your logical volume (here you can choose striped or not; -i4 tells LVM to stripe across 4 devices and -I4 sets the stripe size to 4 KB):

# lvcreate -i4 -I4 -L420G -n data vg1

Create the filesystem on your logical volume:

# yes | mkfs -t ext3 /dev/vg1/data

Mount it (note: you must first run vgchange -a y /dev/vg1 if the volume group is not active):

# mkdir /mnt/data
# mount /dev/vg1/data /mnt/data


Those are the BASICS of it. To create a non-striped volume, just take out the "-i4 -I4" in the lvcreate step. Here is a good post on amazon of a striped setup.
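
To see the striped and non-striped variants side by side, here is a small dry-run sketch that only prints the commands from the steps above instead of running them. The function name lvm_plan is made up for this example; vg1, data and /mnt/data are the names used in this post, and the stripe count is assumed to equal the number of devices passed in.

```shell
#!/bin/sh
# Print (do not run) the LVM setup commands for a striped or spanned volume.
# Usage: lvm_plan MODE SIZE DEVICE...
#   MODE is "striped" or "spanned"; SIZE is handed to lvcreate -L (e.g. 420G).
lvm_plan() {
    mode=$1
    size=$2
    shift 2
    for dev in "$@"; do
        echo "pvcreate $dev"
    done
    echo "vgcreate vg1 $*"
    if [ "$mode" = "striped" ]; then
        # -i: number of stripes (one per device here); -I: stripe size in KB
        echo "lvcreate -i$# -I4 -L$size -n data vg1"
    else
        # No -i/-I: LVM fills one device after another (spanned)
        echo "lvcreate -L$size -n data vg1"
    fi
    echo "mkfs -t ext3 /dev/vg1/data"
    echo "mkdir -p /mnt/data"
    echo "mount /dev/vg1/data /mnt/data"
}

lvm_plan striped 420G /dev/sdh /dev/sdi /dev/sdj /dev/sdk
```

Review the printed commands first, then run them as root (or pipe the output to sh) once they look right for your devices.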

Good planning goes a long way with striped volumes
After going through all this, it turned out that striped volumes can be a pain to manage. When you run out of space you can just increase the size, right? Well... if your volume is striped this isn't so easy, and it takes a long time to resize or move anything. I had to take a snapshot of the current devices, increase the size of each EBS device, then extend the logical volume and filesystem to the new extents. I could also have used pvmove to move everything to an entirely new set of larger EBS devices, but that takes FOREVER. Increasing the size of your devices isn't an apples-to-apples gain either: you can't just increase each device by 100G and have a bunch of new storage with a striped volume. LVM has to account for the striping, which doesn't let you take advantage of all that new space. If you are sticking with striped volumes, it is probably best to just create a brand new logical volume on larger devices and "dd" the data over to the new setup.
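
One way to picture the constraint is a simplified model in which every stripe has to grow by the same amount, so the device with the least free space sets the limit. The sketch below works under that assumption only; striped_growth is a made-up name, sizes are whole gigabytes, and real LVM allocation works in extents and may behave differently.

```shell
#!/bin/sh
# Hypothetical model: how many GB a striped LV could grow, given the free
# space (in GB) on each device it stripes across. Assumes each stripe must
# grow equally, so growth = number of devices * smallest free amount.
striped_growth() {
    min=$1
    count=$#
    for free in "$@"; do
        if [ "$free" -lt "$min" ]; then
            min=$free
        fi
    done
    echo $((min * count))
}

striped_growth 95 95 95 95     # prints 380
striped_growth 195 95 95 95    # prints 380: growing one device alone adds nothing
```

In this model, space added to only some of the devices is simply stranded as far as the striped volume is concerned.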

Spanned is easier to grow
On a spanned LVM volume you can just add a device with x amount of space and extend the logical volume onto it. It's quite simple and very flexible. So in my case I decided to switch from striped to spanned for future storage needs, which required an overnight "dd" from the striped logical volume to the non-striped one.
# dd if=/dev/vg1/data of=/dev/vg2/data bs=1024
I ended up with 2 setups, one striped and one spanned, which is what led to this post. The physical extents of the 2 logical volumes were the same, so you should be able to compare the tests and see where striped excels over non-striped.
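
Growing the spanned volume by one device might look roughly like the following. This is a dry-run sketch that only prints the commands; grow_spanned_plan is a made-up name, /dev/sdl and the 100G figure are hypothetical, and the logical volume in vg2 is assumed to be called data with an ext3 filesystem on it.

```shell
#!/bin/sh
# Print (do not run) the commands to grow a spanned LV onto one new device.
# Usage: grow_spanned_plan VG LV NEW_DEVICE EXTRA_SIZE
grow_spanned_plan() {
    vg=$1
    lv=$2
    dev=$3
    extra=$4
    echo "pvcreate $dev"                      # make the new device a physical volume
    echo "vgextend $vg $dev"                  # add it to the volume group
    echo "lvextend -L +$extra /dev/$vg/$lv"   # grow the logical volume
    echo "resize2fs /dev/$vg/$lv"             # grow the ext3 filesystem to match
}

grow_spanned_plan vg2 data /dev/sdl 100G
```

Depending on your kernel and filesystem, resize2fs may need the volume unmounted first, so check that before running the last step on live data.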

Finally, the data! Notice how on m1.small instances striped is actually slower.
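
All four runs below use the same iozone invocation. If you want to repeat the test with a different process count or file size, a small wrapper like this can build the command line; iozone_cmd is a made-up name, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Print the iozone throughput-test command for a given number of processes.
# Usage: iozone_cmd PROCS RECORD_SIZE FILE_SIZE DIR
iozone_cmd() {
    procs=$1
    rec=$2
    size=$3
    dir=$4
    files=""
    i=1
    while [ "$i" -le "$procs" ]; do
        files="$files $dir/f$i"    # one test file per process
        i=$((i + 1))
    done
    # -R: Excel-style report; -l/-u: min/max number of processes;
    # -r: record size; -s: file size per process; -F: the test files
    echo "iozone -R -l $procs -u $procs -r $rec -s $size -F$files"
}

iozone_cmd 5 4k 100m /mnt/data
```

With 5 processes, 4k records and 100m files on /mnt/data this prints the same command line that appears in the reports.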

m1.small instance with non-striped logical volume:

Iozone: Performance Test of File I/O
Version $Revision: 3.239 $
Compiled for 32 bit mode.
Build: linux

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million,
Jean-Marc Zucconi, Jeff Blomberg,
Erik Habbinga, Kris Strecker, Walter Wong.

Run began: Fri Jan 2 11:35:41 2009

Excel chart generation enabled
Record Size 4 KB
File size set to 102400 KB
Command line used: iozone -R -l 5 -u 5 -r 4k -s 100m -F /mnt/data/f1 /mnt/data/f2 /mnt/data/f3 /mnt/data/f4 /mnt/data/f5
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Min process = 5
Max process = 5
Throughput test with 5 processes
Each process writes a 102400 Kbyte file in 4 Kbyte records

Children see throughput for 5 initial writers = 187023.40 KB/sec
Parent sees throughput for 5 initial writers = 16002.41 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 157944.02 KB/sec
Avg throughput per process = 37404.68 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 rewriters = 239558.47 KB/sec
Parent sees throughput for 5 rewriters = 51924.69 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 239558.47 KB/sec
Avg throughput per process = 47911.69 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 readers = 398622.72 KB/sec
Parent sees throughput for 5 readers = 284322.27 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 283237.56 KB/sec
Avg throughput per process = 79724.54 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 re-readers = 392394.52 KB/sec
Parent sees throughput for 5 re-readers = 274542.58 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 265021.97 KB/sec
Avg throughput per process = 78478.90 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 reverse readers = 414629.55 KB/sec
Parent sees throughput for 5 reverse readers = 272372.02 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 292011.69 KB/sec
Avg throughput per process = 82925.91 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 stride readers = 334959.46 KB/sec
Parent sees throughput for 5 stride readers = 248618.43 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 243286.45 KB/sec
Avg throughput per process = 66991.89 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random readers = 370372.30 KB/sec
Parent sees throughput for 5 random readers = 259770.55 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 267395.03 KB/sec
Avg throughput per process = 74074.46 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 mixed workload = 331567.03 KB/sec
Parent sees throughput for 5 mixed workload = 19355.73 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 238917.91 KB/sec
Avg throughput per process = 66313.41 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random writers = 292373.63 KB/sec
Parent sees throughput for 5 random writers = 10503.29 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 196960.98 KB/sec
Avg throughput per process = 58474.73 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pwrite writers = 154835.77 KB/sec
Parent sees throughput for 5 pwrite writers = 9407.53 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 152950.20 KB/sec
Avg throughput per process = 30967.15 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pread readers = 363816.23 KB/sec
Parent sees throughput for 5 pread readers = 280161.30 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 251250.73 KB/sec
Avg throughput per process = 72763.25 KB/sec
Min xfer = 0.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 4 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 187023.40

" Rewrite " 239558.47

" Read " 398622.72

" Re-read " 392394.52

" Reverse Read " 414629.55

" Stride read " 334959.46

" Random read " 370372.30

" Mixed workload " 331567.03

" Random write " 292373.63

" Pwrite " 154835.77

" Pread " 363816.23


iozone test complete.

m1.small instance with striped logical volume:



Iozone: Performance Test of File I/O
Version $Revision: 3.239 $
Compiled for 32 bit mode.
Build: linux

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million,
Jean-Marc Zucconi, Jeff Blomberg,
Erik Habbinga, Kris Strecker, Walter Wong.

Run began: Fri Jan 2 12:05:01 2009

Excel chart generation enabled
Record Size 4 KB
File size set to 102400 KB
Command line used: iozone -R -l 5 -u 5 -r 4k -s 100m -F /mnt/data/f1 /mnt/data/f2 /mnt/data/f3 /mnt/data/f4 /mnt/data/f5
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Min process = 5
Max process = 5
Throughput test with 5 processes
Each process writes a 102400 Kbyte file in 4 Kbyte records

Children see throughput for 5 initial writers = 86285.44 KB/sec
Parent sees throughput for 5 initial writers = 28773.78 KB/sec
Min throughput per process = 772.48 KB/sec
Max throughput per process = 42287.14 KB/sec
Avg throughput per process = 17257.09 KB/sec
Min xfer = 1968.00 KB

Children see throughput for 5 rewriters = 209820.41 KB/sec
Parent sees throughput for 5 rewriters = 67799.52 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 183882.92 KB/sec
Avg throughput per process = 41964.08 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 readers = 384281.12 KB/sec
Parent sees throughput for 5 readers = 277150.84 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 278263.12 KB/sec
Avg throughput per process = 76856.22 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 re-readers = 394761.81 KB/sec
Parent sees throughput for 5 re-readers = 291834.15 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 274267.53 KB/sec
Avg throughput per process = 78952.36 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 reverse readers = 372123.29 KB/sec
Parent sees throughput for 5 reverse readers = 256665.24 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 271307.94 KB/sec
Avg throughput per process = 74424.66 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 stride readers = 358034.52 KB/sec
Parent sees throughput for 5 stride readers = 242324.43 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 269100.50 KB/sec
Avg throughput per process = 71606.90 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random readers = 330870.57 KB/sec
Parent sees throughput for 5 random readers = 247094.18 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 241531.08 KB/sec
Avg throughput per process = 66174.11 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 mixed workload = 328983.32 KB/sec
Parent sees throughput for 5 mixed workload = 39118.17 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 242003.17 KB/sec
Avg throughput per process = 65796.66 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random writers = 232506.66 KB/sec
Parent sees throughput for 5 random writers = 20833.49 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 163643.38 KB/sec
Avg throughput per process = 46501.33 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pwrite writers = 81293.22 KB/sec
Parent sees throughput for 5 pwrite writers = 23243.61 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 34714.85 KB/sec
Avg throughput per process = 16258.64 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pread readers = 382418.62 KB/sec
Parent sees throughput for 5 pread readers = 271349.91 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 279262.66 KB/sec
Avg throughput per process = 76483.73 KB/sec
Min xfer = 0.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 4 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 86285.44

" Rewrite " 209820.41

" Read " 384281.12

" Re-read " 394761.81

" Reverse Read " 372123.29

" Stride read " 358034.52

" Random read " 330870.57

" Mixed workload " 328983.32

" Random write " 232506.66

" Pwrite " 81293.22

" Pread " 382418.62


iozone test complete.

m1.large instance with non-striped logical volume:


Iozone: Performance Test of File I/O
Version $Revision: 3.239 $
Compiled for 64 bit mode.
Build: linux

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million,
Jean-Marc Zucconi, Jeff Blomberg,
Erik Habbinga, Kris Strecker, Walter Wong.

Run began: Fri Jan 2 11:49:08 2009

Excel chart generation enabled
Record Size 4 KB
File size set to 102400 KB
Command line used: iozone -R -l 5 -u 5 -r 4k -s 100m -F /mnt/data/f1 /mnt/data/f2 /mnt/data/f3 /mnt/data/f4 /mnt/data/f5
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Min process = 5
Max process = 5
Throughput test with 5 processes
Each process writes a 102400 Kbyte file in 4 Kbyte records

Children see throughput for 5 initial writers = 353553.99 KB/sec
Parent sees throughput for 5 initial writers = 11335.09 KB/sec
Min throughput per process = 534.41 KB/sec
Max throughput per process = 346674.53 KB/sec
Avg throughput per process = 70710.80 KB/sec
Min xfer = 172.00 KB

Children see throughput for 5 rewriters = 599928.81 KB/sec
Parent sees throughput for 5 rewriters = 55151.89 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 599928.81 KB/sec
Avg throughput per process = 119985.76 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 readers = 838589.25 KB/sec
Parent sees throughput for 5 readers = 816307.43 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 838589.25 KB/sec
Avg throughput per process = 167717.85 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 re-readers = 788784.06 KB/sec
Parent sees throughput for 5 re-readers = 772101.46 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 788784.06 KB/sec
Avg throughput per process = 157756.81 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 reverse readers = 624249.62 KB/sec
Parent sees throughput for 5 reverse readers = 611193.75 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 624249.62 KB/sec
Avg throughput per process = 124849.93 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 stride readers = 599468.25 KB/sec
Parent sees throughput for 5 stride readers = 587206.39 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 599468.25 KB/sec
Avg throughput per process = 119893.65 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random readers = 873144.40 KB/sec
Parent sees throughput for 5 random readers = 573393.14 KB/sec
Min throughput per process = 43202.99 KB/sec
Max throughput per process = 353544.78 KB/sec
Avg throughput per process = 174628.88 KB/sec
Min xfer = 2580.00 KB

Children see throughput for 5 mixed workload = 602490.56 KB/sec
Parent sees throughput for 5 mixed workload = 11956.89 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 602490.56 KB/sec
Avg throughput per process = 120498.11 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random writers = 471046.94 KB/sec
Parent sees throughput for 5 random writers = 5770.39 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 471046.94 KB/sec
Avg throughput per process = 94209.39 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pwrite writers = 362352.69 KB/sec
Parent sees throughput for 5 pwrite writers = 15619.58 KB/sec
Min throughput per process = 8620.63 KB/sec
Max throughput per process = 240152.98 KB/sec
Avg throughput per process = 72470.54 KB/sec
Min xfer = 4148.00 KB

Children see throughput for 5 pread readers = 883284.12 KB/sec
Parent sees throughput for 5 pread readers = 855137.12 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 883284.12 KB/sec
Avg throughput per process = 176656.83 KB/sec
Min xfer = 0.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 4 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 353553.99

" Rewrite " 599928.81

" Read " 838589.25

" Re-read " 788784.06

" Reverse Read " 624249.62

" Stride read " 599468.25

" Random read " 873144.40

" Mixed workload " 602490.56

" Random write " 471046.94

" Pwrite " 362352.69

" Pread " 883284.12


iozone test complete.

m1.large instance with striped logical volume:

Iozone: Performance Test of File I/O
Version $Revision: 3.239 $
Compiled for 64 bit mode.
Build: linux

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million,
Jean-Marc Zucconi, Jeff Blomberg,
Erik Habbinga, Kris Strecker, Walter Wong.

Run began: Fri Jan 2 11:56:02 2009

Excel chart generation enabled
Record Size 4 KB
File size set to 102400 KB
Command line used: iozone -R -l 5 -u 5 -r 4k -s 100m -F /mnt/data/f1 /mnt/data/f2 /mnt/data/f3 /mnt/data/f4 /mnt/data/f5
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Min process = 5
Max process = 5
Throughput test with 5 processes
Each process writes a 102400 Kbyte file in 4 Kbyte records

Children see throughput for 5 initial writers = 160335.02 KB/sec
Parent sees throughput for 5 initial writers = 4435.26 KB/sec
Min throughput per process = 0.84 KB/sec
Max throughput per process = 123694.65 KB/sec
Avg throughput per process = 32067.00 KB/sec
Min xfer = 4.00 KB

Children see throughput for 5 rewriters = 596176.44 KB/sec
Parent sees throughput for 5 rewriters = 85744.12 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 596176.44 KB/sec
Avg throughput per process = 119235.29 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 readers = 791896.38 KB/sec
Parent sees throughput for 5 readers = 770880.57 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 791896.38 KB/sec
Avg throughput per process = 158379.27 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 re-readers = 787970.81 KB/sec
Parent sees throughput for 5 re-readers = 767317.71 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 787970.81 KB/sec
Avg throughput per process = 157594.16 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 reverse readers = 696272.12 KB/sec
Parent sees throughput for 5 reverse readers = 679581.49 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 696272.12 KB/sec
Avg throughput per process = 139254.42 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 stride readers = 680316.69 KB/sec
Parent sees throughput for 5 stride readers = 665678.13 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 680316.69 KB/sec
Avg throughput per process = 136063.34 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random readers = 669859.00 KB/sec
Parent sees throughput for 5 random readers = 658161.12 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 669859.00 KB/sec
Avg throughput per process = 133971.80 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 mixed workload = 670494.81 KB/sec
Parent sees throughput for 5 mixed workload = 29356.20 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 670494.81 KB/sec
Avg throughput per process = 134098.96 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 random writers = 492525.84 KB/sec
Parent sees throughput for 5 random writers = 15895.21 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 492525.84 KB/sec
Avg throughput per process = 98505.17 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pwrite writers = 186032.25 KB/sec
Parent sees throughput for 5 pwrite writers = 3831.68 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 184455.38 KB/sec
Avg throughput per process = 37206.45 KB/sec
Min xfer = 0.00 KB

Children see throughput for 5 pread readers = 878841.56 KB/sec
Parent sees throughput for 5 pread readers = 854188.13 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 878841.56 KB/sec
Avg throughput per process = 175768.31 KB/sec
Min xfer = 0.00 KB



"Throughput report Y-axis is type of test X-axis is number of processes"
"Record size = 4 Kbytes "
"Output is in Kbytes/sec"

" Initial write " 160335.02

" Rewrite " 596176.44

" Read " 791896.38

" Re-read " 787970.81

" Reverse Read " 696272.12

" Stride read " 680316.69

" Random read " 669859.00

" Mixed workload " 670494.81

" Random write " 492525.84

" Pwrite " 186032.25

" Pread " 878841.56


iozone test complete.
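
Reading four long reports side by side is tedious, so here is a sketch of how the quoted summary lines at the end of two saved reports could be lined up with awk. It assumes each report was saved to its own text file; compare_iozone and the file names in the usage comment are made up for this example.

```shell
#!/bin/sh
# Compare the quoted summary lines of two saved iozone reports.
# Usage: compare_iozone BASELINE_FILE OTHER_FILE
# Prints: test name, baseline KB/sec, other KB/sec, other/baseline ratio.
compare_iozone() {
    awk -F'"' '
        # Summary rows look like: " Initial write " 187023.40
        NF == 3 && $3 ~ /^[ \t]*[0-9.]+[ \t]*$/ {
            name = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the test name
            value = $3 + 0
            if (FNR == NR) { base[name] = value; order[++n] = name }
            else           { other[name] = value }
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                if (k in other && base[k] > 0)
                    printf "%-16s %12.2f %12.2f %6.2f\n", k, base[k], other[k], other[k] / base[k]
            }
        }
    ' "$1" "$2"
}

# Example (hypothetical file names):
#   compare_iozone small-spanned.txt small-striped.txt
```

A ratio below 1.00 in the last column means the second report was slower than the first for that test.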

