NetApp + SolidFire…or SolidFire + NetApp?

So what just happened?

First – we just saw AMAZING execution of an acquisition.  No BS.  No wavering.  NetApp just GOT IT DONE, months ahead of schedule.  This is right in line with George Kurian's reputation for excellent execution.  It mitigates any doubt, any haziness, and gets everyone moving towards their strategic goals.  When viewed against other tech mergers currently in motion, it gives customers and partners comfort to know that they're not in limbo and can make decisions with confidence.  (Of course, it's a relatively small, all-cash deal – not a merger of behemoths.)

Second – NetApp just got YOUNGER.  Not younger in age, but younger in technical thought.  SolidFire's foundational architecture is based on scalable, commodity-hardware cloud storage, with extreme competency in OpenStack.  The technology is completely different from OnTAP, and provides a platform for service providers that is extremely hard to match.  OnTAP's foundational architecture is based on purpose-built appliances that deliver scalable enterprise data services, which now extend to hybrid cloud deployments.  Two different markets.  SolidFire's platform went to market in 2010, 19 years after OnTAP was invented – and both were built to solve the problems of their day in the most efficient, scalable, and manageable way.

Third – NetApp either just made themselves more attractive to buyers, or LESS attractive, depending on how you look at it.

One could claim they’re more attractive now as their stock price is still relatively depressed, and they’re set up to attack the only storage markets that will exist in 5-10 years, those being the Enterprise/Hybrid Cloud market and the Service Provider/SaaS market.  Anyone still focusing on SMB/MSE storage in 5-10 years will find nothing but the remnants of a market that has moved all of its data and applications to the cloud.

Alternatively, one could suggest a wait-and-see approach to the SolidFire acquisition, as well as to the other major changes NetApp has made to its portfolio over the last year (AFF, AltaVault, cloud integration endeavors, as well as all the things it STOPPED doing). [Side note: with 16TB SSDs coming, look for AFF to give competitors like Pure and XtremIO some trouble.]

So let’s discuss what ISN’T going to happen.

There is NO WAY that NetApp is going to shove SolidFire into the OnTAP platform.  Anyone who is putting that out there hasn't done their homework to understand the foundational architectures of the two VERY DIFFERENT technologies.  Also, what could possibly be gained by doing so?  In contrast, Spinnaker had technology that could let OnTAP escape from its two-controller bifurcated storage boundaries.  The plan from the beginning was to use the SpinFS goodness to create a non-disruptive, no-boundaries platform for scalable and holistic enterprise storage, with all the data services that entails.

What could (and should) happen is that NetApp adds some Data Fabric goodness into the SF product – perhaps this concept is what is confusing the self-described technorati in the web rags.  NetApp re-wrote and opened up the LRSE (SnapMirror) technology so that it could move data among multiple platforms, so this wouldn't be a deep integration, but rather an "edge" integration.  The same work is going into the AltaVault and StorageGRID platforms to create a holistic and flexible data ecosystem that can meet any need conceivable.

While SolidFire could absolutely be used for enterprise storage, its natural market is the service provider who needs to simply plug and grow (or pull and shrink).  Perhaps there's a feature or two that the NetApp and SF development teams could share over coffee (I've heard that the FAS and FlashRay teams had such an event that resulted in a major improvement for AFF), and that can only be a good thing.  However, integration of the two platforms isn't in anyone's interest, and everyone I've spoken to at NetApp, both on and off the record, is adamant that NetApp isn't going to "OnTAP" the SolidFire platform.

SolidFire will likely continue to operate as a separate entity for quite a while, as the sales groups serving service providers are already distinct from the enterprise/commercial sales groups at NetApp.  Since OnTAP knowledge can't simply be leveraged when dealing with SolidFire, I would expect that existing NetApp channel partners won't be encouraged to start pushing the SF platform until they've demonstrated both SF and OpenStack chops.  I would also expect the reverse to be true; while many of SolidFire's partners are already NetApp partners, it's unknown how many have Clustered OnTAP knowledge.

I don’t see this acquisition as a monumental event with immediately demonstrable external impact on the industry or on either company.  The benefits will become evident 12-18 months out and position NetApp for long-term success, vis-à-vis "flash in the pan" storage companies that will find their runway much shorter than expected in the 3-4 year timeframe.  As usual, NetApp took the long view.  Those who see this as a "hail mary" to rescue NetApp from a "failed" flash play don't understand the market dynamics at work.  We won't be able to measure the success of the SolidFire acquisition for a good 3-4 years; not because of any integration that's required (like the Spinnaker deal), but because the bet is on how the market is changing and where it will be at that point – with this acquisition, NetApp is betting it will be the best positioned to meet those needs.

 

OS X built-in emergency HTTP server

Sometimes you need to serve up a file via HTTP, for instance when upgrading a NetApp Cluster via the Automated method.

With Windows, I always had to install a lightweight HTTP Server, or install IIS (ugh).

I have a MacBook Pro, and I don’t have to install anything!

I can just cd to the directory that the desired file is in and issue the following command:

python -m SimpleHTTPServer xxxx

where “xxxx” is the TCP port number you want your laptop to listen on for HTTP requests.  If you leave the port off, it will default to port 8000.
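For example, to serve up the current directory on port 8080 – and if your Mac has Python 3 rather than Python 2, the module was renamed to http.server, so the second form applies:

python -m SimpleHTTPServer 8080

python3 -m http.server 8080

Then just point the cluster (or a browser) at http://<your-laptop-IP>:8080/ and the file is there for the taking.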

Just make sure you CTRL-C when you’re done, or you’re going to leave a wide-open entryway into that directory.

Hope you find this helpful, I know I did when I learned it!

NetApp connects Hadoop to NFS

The link to Val Bercovici’s article is here.

Here’s the gist-

Hadoop natively uses HDFS, a file system that's designed to be node-level redundant. By default, data is replicated THREE times across nodes in a Hadoop cluster.  The nodes themselves, at least in the "tradition" of Hadoop, do not perform any RAID at all; if a node's filesystem fails, the data already exists elsewhere and any running MapReduce jobs are simply started over.
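You can see this on any running cluster you have shell access to.  The first command below shows the configured block replication factor (3 unless it's been overridden in hdfs-site.xml); the second lists a file – the path here is just a hypothetical example – and the second column of the output is that file's replication count:

hdfs getconf -confKey dfs.replication

hdfs dfs -ls /user/hadoop/sample.csv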

This is great if you have a few thousand nodes and the people you’re crunching data for are at-large consumers who aren’t paying for your service and as such cannot expect service levels of any kind.

Enterprise, however, is a different story.  Once business units start depending on reduced results from Hadoop, they start depending on the timeframe in which they're delivered as well.  Simply starting jobs over is NOT going to please anyone and could interrupt business processes.  Further, Enterprises don't have the space or budget to put up Hadoop clusters at the scale the Facebooks and Yahoos do (they also don't typically have the justifiable use cases). In fact, the Enterprises I'm working with are taking a "build it and the use cases will come" approach to Hadoop.

NetApp’s NFS connector for Hadoop significantly lowers the barrier to entry for businesses that want to vet Hadoop and justify use cases.  One of the traditional problems with Hadoop is that one needs to create a siloed architecture – servers, storage, and network – at a scale that proves the worth of Hadoop.

Now, businesses can throw compute (physical OR virtual) into a Hadoop cluster and connect to existing NFS datastores – whether they are on NetApp or not!   NetApp has created this connector and thrown it upon the world as open source on GitHub.

This removes a huge barrier to entry for any NetApp (or NFS!) customer who is looking to perform analytics against an existing dataset without moving it or creating duplicate copies.
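To give a flavor of what this looks like in practice – the URI scheme and setup details below are illustrative, so check the connector's GitHub README for the actual configuration before trying this – you register the connector's filesystem implementation in core-site.xml, point it at your NFS export, and then standard Hadoop tooling can address the export directly:

hadoop fs -ls nfs://nfsserver:2049/data/

hadoop jar hadoop-mapreduce-examples.jar wordcount nfs://nfsserver:2049/data/input nfs://nfsserver:2049/data/output

No ingest into HDFS, no second copy of the data – the job reads straight from the existing export.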

Great move.

Rant #1: Data-At-Rest Encryption

Subtitled: Data-at-rest encryption compare and contrast, NetApp FAS & VNX2

So every once in a while I run across a situation at a client where I get to see the real differences in approach between technology manufacturers.

The specific focus is on data-at-rest encryption.  Encrypting data that resides on hard drives is a good practice, provided it can be done cost-effectively with minimal effect on performance, availability, and administrative complexity. Probably the best reason I can come up with for implementing data-at-rest encryption is the drive-failure case: you're expected to send that 'failed' hard drive back to the manufacturer, where they will mount it to see just how 'failed' it really is.  Point is, you've still got data on there. If you're a service provider, you've got your client's data on there. Not good. Unless you've got an industrial-grade degausser, you don't have many options here.  Some manufacturers have a special support upgrade that lets you throw away failed drives instead of returning them, but that's a significant bump up in cost.

OK, so now you’ve decided, sure, I want to do data-at-rest encryption.  Great!  Turn it on!

Not so fast.

The most important object in the world of encryption is the key.  Without the key, all that business-saving data you have on that expensive enterprise-storage solution is useless.  Therefore, you need to implement a key-management solution to make sure that keys are rotated, backed up, remembered, and, most importantly, available for a secure restore.  The key management solution, like every other important piece of IT equipment, needs a companion in DR in case the first one takes a dive.

Wait, secure restore?  Well, what's the point of encrypting data if you make it super-simple to steal the data and then decrypt it?  Most enterprise-grade key management solutions implement quorums for key management operations, complete with smart cards, etc.   This helps prevent the all-too-common occurrence of "inside man" data theft.

NetApp’s answer to data-at-rest encryption is the self-encrypting drive.  The drives themselves perform all the encryption work, and pass data to the controller as any drive would.  The biggest caveat here is that all drives in a NetApp HA pair must be of the NSE drive type, and you can't use Flash Pool or MetroCluster.

NetApp partners with a couple of key management vendors, but OEMs the SafeNet KeySecure solution for key management.  Having the keys stored off-box ensures that if some bad guy wheels away with your entire storage device, they've got nuthin' – it won't boot without being able to reach the key management system.  SafeNet's KeySecure adheres to many (if not all) industry standards, and can simultaneously manage keys for other storage and PKI-enabled infrastructure.  I consider this approach strategic because it thinks holistically: one place to manage keys across a wide array of resources.  Scalable administration, high availability, maximum security.  Peachy.
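If you want to see the principle in miniature, here's a conceptual sketch using stock OpenSSL – it has nothing to do with how NSE or KeySecure are actually implemented, and the filenames are made up – showing why data encrypted with an off-box key is worthless to whoever walks off with it.  Generate a data-encryption key (in real life this lives ONLY on the key manager), encrypt with it, and note that the decrypt step only works if you can fetch that key back:

openssl rand -hex 32 > dek.txt

openssl enc -aes-256-cbc -pass file:dek.txt -in customer_data.db -out customer_data.db.enc

openssl enc -d -aes-256-cbc -pass file:dek.txt -in customer_data.db.enc -out restored.db

Steal the encrypted copy alone – or the whole array, in the NSE/KeySecure model – and all you've got is noise.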

I had the opportunity to contrast this approach against EMC's VNX2 data-at-rest solution.  EMC took its typical tactical approach, as I will outline next.

Instead of using encrypting drives, they chose the path of the encrypting SAS adapter – they use PMC-Sierra's Tachyon adapter for this, with inline ASIC-based encryption.  So it's important to note that this encryption technology actually has nothing to do with the VNX2 itself. More on this architectural choice later.

Where the encryption is done in a solution is just as important as the key management portion of the solution – without the keys, you're toast.  EMC, owner of RSA, took a version of RSA key management software and implemented it in the storage processors of the VNX2.  This is something they tout as a great feature: "embedded key management".   The problem is, they have totally missed the point.  Having a key manager on the box that contains the data you're encrypting is a little like leaving your keys in your car.  If someone takes the entire array, they have your data.  Doesn't this go against the very notion of encrypting data? Sure, you're still protected from someone swiping a drive.  But entire systems get returned off lease all the time.

Now, of course, if you've got a VNX2, you've got a second one in DR.  That box has its OWN embedded key manager.  Great.  Now I've got TWO key managers to back up every time I make a drive config change to the system (and if I change one, I'm probably changing the other).

What?  You say you don’t like this and you want to use a third-party key manager?  Nope.  VNX2 will NOT support any third-party, industry-standard compliant key manager.  You’re stuck with the embedded one.  This embedded key manager sounds like more of an albatross than a feature.  Quite frankly, I’m very surprised that EMC would limit clients in this way, as the PMC-Sierra encrypting technology that’s in VNX2 DOES support third-party key managers!  Gotta keep RSA competitors away though, that’s more important than doing right by clients, right?

OK. On to the choice of the SAS encrypting adapter vs. the encrypting hard drives.

Encrypting at the SAS layer has the great advantage of working with any drive available on the market.  That's a valid architectural advantage from a cost and product-choice perspective. That's where the advantages stop, however.

It should seem obvious that having many devices working in parallel on a split-up data set is much more efficient than having 100% of the data load worked on at one point (I'll call it the bottleneck!) in the data chain.  Based on the performance-hit data supplied by the vendors, I'm probably correct.  EMC states a <5% performance hit using encryption, but with a caveat that "large block operations > 256KB" and high-throughput situations could result in higher performance degradation.  NetApp has no such performance restriction (its back-end ops are always smaller), and the encryption work is being done by many, many spindles at a time, not a single ASIC (even if there are multiples, the same point applies).  That said, I can see how implementing an encrypting SAS adapter would be much easier to get to market quickly, and the allure of encrypt-enabling all existing drives is strong.  But architecturally, it's way too risky to purposely introduce a single bottleneck that affects an entire system.

Coming to the end of this rant.  It just never ceases to amaze me that when you dig into the facts and the architecture, you’ll find that some manufacturers always think strategically, and others can’t stop thinking tactically.