
Greenplum vs Hadoop Disk Space

Posted by scottk on May 24, 2010 in Ramblings

I’ve been spending a whole lot of time calculating Greenplum vs Hadoop disk usage. So here is the general equation:

(MaxAllocFactor * DiskSize * ( #Disk - RaidDisks ) ) / ReplicationFactor

MaxAllocFactor = Max recommended allocation. 70% for Greenplum and 75% for Hadoop

DiskSize = Size of your drive

#Disk = Number of drives

RaidDisks = Number of disks eaten up by RAID; for Hadoop this is 0

ReplicationFactor = Number of copies of the data. In Greenplum everything is mirrored, so the replication factor is 2. Hadoop recommends three copies of the data, so it gets a replication factor of 3.
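The equation and its terms translate directly into a few lines of Python. This is only a sketch: the function name, the argument names, and the two preset dictionaries are made up for illustration, and the 70%/75% and 2x/3x figures are simply the ones quoted above.

```python
def effective_space(max_alloc_factor, disk_size, num_disks, raid_disks, replication_factor):
    """Effective space, in the same unit as disk_size (GB in, GB out)."""
    # Drives left for data once RAID has taken its share.
    data_disks = num_disks - raid_disks
    return (max_alloc_factor * disk_size * data_disks) / replication_factor


# Presets using the figures from this post (hypothetical names, not vendor constants).
GREENPLUM = {"max_alloc_factor": 0.70, "replication_factor": 2}
HADOOP = {"max_alloc_factor": 0.75, "replication_factor": 3}
```

Calling it looks like effective_space(disk_size=500, num_disks=24, raid_disks=4, **GREENPLUM), with the result in whatever unit you passed for disk_size.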

So let’s look at a 24-drive attached storage array. We’ll use 500GB drives.

(MaxAllocFactor * DiskSize * ( #Disk - RaidDisks ) ) / ReplicationFactor
Greenplum: ( .70 * 500GB * ( 24 - 4 ) ) / 2 = 3.5 TB effective space
Hadoop: ( .75 * 500GB * ( 24 - 0 ) ) / 3 = 3.0 TB effective space

Next we’ll look at a single server, let’s say a 1U with four 3.5″ 2TB drives.

Greenplum: ( .70 * 2TB * ( 4 - 1 ) ) / 2 = 2.1 TB effective space
Hadoop: ( .75 * 2TB * ( 4 - 0 ) ) / 3 = 2 TB effective space

How about a single 2U server with twelve 1TB drives?

Greenplum: ( .70 * 1TB * ( 12 - 2 ) ) / 2 = 3.5 TB effective space
Hadoop: ( .75 * 1TB * ( 12 - 0 ) ) / 3 = 3 TB effective space
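If you want to rerun the three configurations above with your own drive counts, here is a small self-contained sketch. The scenario labels and the helper names (effective_tb, compare) are made up for this example; the allocation factors, RAID disk counts, and replication factors are the ones used in the calculations above.

```python
# (label, disk size in TB, number of drives, drives lost to RAID under Greenplum)
SCENARIOS = [
    ("24 x 500GB array", 0.5, 24, 4),
    ("1U, 4 x 2TB", 2.0, 4, 1),
    ("2U, 12 x 1TB", 1.0, 12, 2),
]


def effective_tb(alloc, size_tb, disks, raid_disks, replicas):
    """Same equation as above, with sizes in TB."""
    return alloc * size_tb * (disks - raid_disks) / replicas


def compare(scenarios):
    """Return (label, Greenplum TB, Hadoop TB) for each scenario."""
    rows = []
    for label, size_tb, disks, gp_raid in scenarios:
        gp = effective_tb(0.70, size_tb, disks, gp_raid, 2)  # mirrored, RAID overhead
        hd = effective_tb(0.75, size_tb, disks, 0, 3)        # no RAID, three copies
        rows.append((label, round(gp, 2), round(hd, 2)))
    return rows


for label, gp, hd in compare(SCENARIOS):
    print(f"{label:18} Greenplum {gp:.2f} TB   Hadoop {hd:.2f} TB")
```

Swap in your own tuples to see how the gap moves as the RAID overhead changes.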

So what does this mean? It means that you shouldn’t run laughing to the bank on your backend savings by choosing Hadoop over Greenplum, assuming you plan to use the same storage architecture. Greenplum and Hadoop are two very different technologies, so comparing the two is kind of silly in the first place. They fall into the same category of processing large datasets in the same way that a Ford F350 and a Mazda Miata are both vehicles. They will both get you down the road, but in an entirely different manner.

Don’t talk to me about compression factors. Everyone wants to say how their grandmother in Pensacola got 20x compression on system X. System X never happens to be my system, so I’ve stopped drinking the compression factor Kool-Aid.

