Test Driving EMC CloudArray CIFS + S3
After hearing the CloudArray story a couple of days ago, I felt compelled to see it in action. And what better way than trying it out?
The neat thing about software solutions is that you can download them and try them out for FREE (well, most things…EMC is still working on this in many departments but I think we’ll eventually figure it out). Getting the bits in this case was easy — just go to www.cloudarray.com.
A few things you’re going to want:
- VMware ESXi or Hyper-V (to deploy the CloudArray VM)
- A cloud storage provider account (I used S3)
- Access key and secret for the above account
- The CloudArray VM bits
The initial setup was pretty easy: import the .OVA, power the machine on, then navigate to the specified URL for initialization.
Once you set it up (apologies for skipping the screens here, it is super easy), you’re greeted by the main dashboard. I had configured CIFS to test the workflow for a Windows file environment and named the share “Share 1”. I like to use highly descriptive names for things.
Sure enough, the share existed and I could navigate to it.
I wanted to see if anything had happened in AWS S3 yet, and sure enough CloudArray created its own bucket (it’s the one that is prefixed with twinstrata). It also created a bunch of small objects — my guess would be configuration info and metadata.
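If you would rather check this from a script than from the AWS console, here is a minimal sketch. It assumes the third-party boto3 library is installed and that credentials are already configured; the helper names are my own, and the only thing taken from the test above is the "twinstrata" prefix.

```python
def buckets_with_prefix(bucket_names, prefix="twinstrata"):
    """Return only the bucket names that start with the given prefix."""
    return [name for name in bucket_names if name.startswith(prefix)]


def list_bucket_names():
    """Fetch all bucket names for the configured AWS account.

    Assumption: boto3 is installed and credentials are set up
    (environment variables or ~/.aws/credentials).
    """
    import boto3  # imported lazily so the filter above works without it

    s3 = boto3.client("s3")
    return [bucket["Name"] for bucket in s3.list_buckets()["Buckets"]]


if __name__ == "__main__":
    for name in buckets_with_prefix(list_bucket_names()):
        print(name)
```

Run as a script, this prints whichever buckets in the account carry the prefix, which should include the one CloudArray created.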
Now time to copy some files to it and see what happens. I grabbed a couple of ISOs from a NAS4Free system I have running at home and began a copy to the CloudArray share.
At 6 percent copied (and it was a painfully slow copy, probably because I’m using crummy infra), replication to S3 began, as reflected by the CloudArray dashboard.
Looking at the S3 bucket, it began filling up with objects that were about 1 MB in size. I spent a few minutes wondering how they decided on the object size, but then it hit me: I had set the cache pages to 1 MB. CloudArray was simply replicating each page as an object to S3. It’s interesting to note that these are clearly not object representations of the 2 ISOs that existed on my share, but rather chunks of those two files. This means that CloudArray would be required in order to access them in the future (and it’s really not that big of a deal: just as I’m copying from a NAS to CloudArray, I can just as easily do the reverse if I ever wanted off).
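To make the page-to-object idea concrete, here is a rough sketch of splitting a file's bytes into fixed-size pages. This is my own illustration of the behavior I observed, not CloudArray's actual code; the 1 MB page size matches my cache setting, and the function names are made up.

```python
PAGE_SIZE = 1024 * 1024  # 1 MB, matching the cache page size I configured


def split_into_pages(data: bytes, page_size: int = PAGE_SIZE):
    """Cut a byte string into consecutive pages of at most page_size bytes."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


def page_count(file_size: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed for a file of file_size bytes (rounded up)."""
    return -(-file_size // page_size)


if __name__ == "__main__":
    # A hypothetical 700 MB ISO would need 700 one-megabyte pages.
    print(page_count(700 * 1024 * 1024))
```

None of the resulting pages is a usable file on its own, which is why you need something that knows the page layout to reassemble the original ISOs.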
As the copy and subsequent replication progressed, the CloudArray cache slowly started filling. Hilariously, the data transfer to S3 seemed to be going faster than the copy over my home network.
Once the copy finished, the CloudArray dashboard looked like the below. I also tested reading from the share to see if it would pull any objects from S3. As expected, it did not — everything I had just written was still in cache so it was pulled straight from there.
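The read behavior is what you would expect from a read-through cache. Below is a toy model of that pattern, with a plain dict standing in for S3. The class and method names are hypothetical, and the real product replicates in the background rather than at write time as this sketch does.

```python
class PageCache:
    """Toy read-through cache in front of a dict-like backend."""

    def __init__(self, backend):
        self.backend = backend   # stands in for the cloud bucket
        self.pages = {}          # local cache contents
        self.backend_reads = 0   # how often we had to go to the backend

    def write(self, page_id, data):
        # Keep the page locally and replicate it to the backend.
        self.pages[page_id] = data
        self.backend[page_id] = data

    def read(self, page_id):
        # Serve from cache when possible; otherwise fetch and keep a copy.
        if page_id in self.pages:
            return self.pages[page_id]
        self.backend_reads += 1
        data = self.backend[page_id]
        self.pages[page_id] = data
        return data

    def evict(self, page_id):
        # Drop a page from the local cache only; the backend copy remains.
        self.pages.pop(page_id, None)


if __name__ == "__main__":
    cache = PageCache(backend={})
    cache.write("page-0", b"iso bytes")
    cache.read("page-0")
    print(cache.backend_reads)  # still cached, so no backend fetch yet
    cache.evict("page-0")
    cache.read("page-0")
    print(cache.backend_reads)  # now the page had to come from the backend
```

In my test nothing had been evicted yet, so every read was a cache hit and S3 was never touched.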
So the verdict…it works! Now I wanted to try and prove out an idea I had for cloud and remote site access: if your main production site produces some files or artifacts and replicates them to cloud storage, you should be able to access those same files “locally” in AWS (or another provider). And by placing a CloudArray at one or many remote sites, you should also be able to access those same files and benefit from the cache for frequent reads. It could be very interesting in globally distributed workflows (software development and testing, to name one). So I went to AWS to see if we had an AMI for CloudArray and…
Yay! Looks like there is one. Selected it, went to launch and…
Boo! As it turns out, this image had been removed by TwinStrata (probably after the acquisition) and is no longer usable on AWS. I should be able to work around that using the EC2 API or AWS Console for vCenter tools — but I’ll save that for next time.