Intel Storage – Storage Field Day 12

On the last day of Storage Field Day 12, we got to visit Intel Storage, who gave us some insight into what they were seeing and how they are driving the future of high performance storage.  Just prior to the Intel Storage session, we heard from SNIA’s Mark Carlson about trends and requirements mainly driven by the needs of the Hyperscalers, and Intel’s session detailed some very interesting contributions that line up very neatly with those needs.

One of the big takeaways from the SNIA session was that the industry is looking to deal with the issue of “Tail Latency events”, or as Jonathan Stern (a most engaging presenter) put it, “P99s”.  Tail Latency occurs when a device returns data 2x-10x slower than normal for a given I/O request.  Surprisingly, SSD drives are 3 times more likely to have a tail latency event for a given I/O than spinning media.  Working the math out, that means that a RAID stripe of SSD drives has a 2.2% chance of experiencing tail latency, and the upper layers of the stack have to deal with that event by either waiting for that data or repairing/calculating that late data via parity.
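To make the stripe math concrete, here’s a quick sketch.  The per-drive tail probability and stripe width below are assumptions chosen to land near the quoted 2.2%, not figures from the session:

```python
# P(at least one slow drive in the stripe) = 1 - P(every drive is fast).
def stripe_tail_probability(p_per_drive: float, stripe_width: int) -> float:
    return 1.0 - (1.0 - p_per_drive) ** stripe_width

# Assumed: 0.25% per-drive tail probability, 9-drive stripe.
print(round(stripe_tail_probability(0.0025, 9), 4))  # → 0.0223
```

The point is how fast the stripe-level risk compounds: every drive you add to the stripe multiplies the chance that some drive stalls the whole I/O.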

Now one would think that when you’re dealing with latencies on NVM of 90-150 microseconds, even going to 5x keeps you within 1ms or so.  But what the industry (read: Hyperscalers, who purchase HALF of all shipped storage bytes) is looking for is CONSISTENCY of latency: they want to provide service levels and be sure that their architectures can deliver rock-solid, stable performance characteristics.
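Here’s a tiny illustration of why averages hide the problem.  The latency samples below are invented, but they show how a device with a better median can still have a much worse P99:

```python
import math
import statistics

# Two made-up latency distributions (microseconds): similar averages,
# very different tails (~3% of the "spiky" device's I/Os are slow).
steady = [100] * 100            # boring, consistent device
spiky = [90] * 97 + [500] * 3   # faster median, ugly tail

def p99(samples):
    """Nearest-rank 99th percentile."""
    s = sorted(samples)
    return s[math.ceil(0.99 * len(s)) - 1]

print("steady: mean", statistics.mean(steady), "p99", p99(steady))
print("spiky:  mean", statistics.mean(spiky), "p99", p99(spiky))
```

The means differ by a couple of percent; the P99s differ by 5x.  That’s the gap a service-level commitment has to absorb.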

Intel gave us a great deep dive of the Storage Performance Development Kit (SPDK), which is an answer to getting much closer to that lower standard deviation of latency.  The main difference in their approach, which is the most interesting development (one that could drive OTHER efficiencies in non-storage areas, IMO), is that they have found that isolating a CPU core for storage I/O in NVMe environments provides MUCH better performance consistency, primarily because polling from a dedicated core eliminates the costs of interrupts and context switching.

The results they showed by using this approach were staggering.  By using 100% of ONE CPU core with their USER-space driver, they were able to get 3.6 Million IOPS with 277ns of overhead per I/O from the transport protocol.  Of course that’s added to the latency of the media, but it’s a small fraction of what’s seen when using the regular Linux kernel-mode drivers that run across multiple CPUs.  We’re talking nearly linear scalability when you add additional NVMe SSDs using that same single core.
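As a rough back-of-the-envelope, the per-I/O savings add up fast.  The media latency and context-switch cost below are assumed figures; only the 277ns overhead echoes Intel’s number:

```python
# Assumed numbers: 90us media latency, 5us interrupt/context-switch cost per
# I/O on the kernel path, 277ns polled overhead (the figure Intel quoted).
MEDIA_US = 90.0
CTX_SWITCH_US = 5.0
POLL_US = 0.277

def total_time_us(io_count: int, per_io_overhead_us: float) -> float:
    return io_count * (MEDIA_US + per_io_overhead_us)

ios = 1_000_000
saved_s = (total_time_us(ios, CTX_SWITCH_US) - total_time_us(ios, POLL_US)) / 1e6
print(f"overhead saved per million I/Os: {saved_s:.2f} s")  # → 4.72 s
```

Even with made-up inputs, the shape of the argument holds: when the media only takes tens of microseconds, the software overhead per I/O becomes the thing worth burning a core to eliminate.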

This is still relatively young, but the single-core, user-space driver approach Intel is taking is already being seen in the marketplace (E8 Storage comes to mind, though it’s unknown whether they’re using Intel’s SPDK or their own code).

Intel’s approach of stealing a dedicated core may sound somewhat backwards; however, as Intel CPUs get packed with more and more cores, the cores start to become the cheap commodity, and the cost of stealing a core will drop below the performance cost of context switching (it may have already).  The media we’re working with has become so responsive and performant that the storage doesn’t want to wait for the CPU anymore!

This is also consistent with a trend seen across more than a few of the new storage vendors out there, which is to bring the client into the equation with either software (like the SPDK) or a combination of hardware (like RNICs) and software to help achieve both the high performance AND the latency consistency desired by the most demanding of storage consumers.

We may see this trend of dedicating cores become more popular across domains, as CPU speeds aren’t improving but core counts are, and the hardware around the CPU becomes more performant and dense.  If you play THAT out long term, virtualized platform architectures such as VMWare that run hypervisors across all cores (usually) may get challenged by architectures that simply isolate and manage workloads on dedicated cores.  It’s an interesting possibility.

By giving the SPDK away, Intel is sparking (another) storage revolution that storage startups are going to take advantage of quickly, and change the concept of what we consider “high performance” storage.


**NOTE: Article edited to correct Mark Carlson’s name. My apologies.

HPE buys Nimble for $1.09B – Trying to make sense of this

So per IDC, HPE was statistically tied for the #2 spot in the External Storage Array market in Q3 ’16, with $549M in sales vs $650M the prior year’s Q3.  That’s quite a downward trend. Included in that number are the multiple storage offerings that HP owns: 3PAR, LeftHand, its own arrays, etc.  

Today we find out that HPE has paid $1.09B for a company that has total revenues of around $500M, was losing money at a rate of around $10M per quarter, and had no appreciable market share in the external array market. Nimble made its initial bet on hybrid flash architecture, which became a problem as the market moved to all-flash. Nimble changed course and provided all-flash, but many other vendors were far ahead here. 

So what gives?  How can Nimble fit into a long-term strategy for HP?

Nimble isn’t really part of a hyperconverged play, so in the context of the recent SimpliVity acquisition, this seems a parallel move.

There’s InfoSight, which provides predictive analytics and ease of management for Nimble Arrays; perhaps HP sees a platform it can expand to its enterprise customers. But a BILLION dollars for that??

Nimble has a lot of small customers (over 10,000 at this point based on a recent press release), but 10,000 customers is a pittance compared to HP’s existing customer base across all customer sizes. 

In the short term, this acquisition will bump HPE into the #2 spot alone in external arrays, but not by much, and given the current trajectory of their external storage array revenue, it’s likely they won’t hold that spot for long.

When you consider this acquisition from all the angles (and I’m sure I’m missing some and welcome the discussion), I struggle to make sense of it from HPE’s standpoint.  I only see cannibalization of existing storage business, disruption of the storage sales organization, and no added value to HPE’s overall storage offering.  Did HPE simply have $1B lying around with nothing better to do with it?  Guys, call ME next time.

Do GUIs really make things simpler?

I was at a technical breakout at a Netapp partner event, and heard a question from an SE (from another partner) that implied that most IT folks don’t have the chops for managing/deploying their environment without using GUIs and wizards.  He postulated that in order to further simplify concepts around complex infrastructure, most admins (especially in small-to-medium enterprises) need the wizard-driven spoon-feeding that GUIs provide so they can get back to the business of doing what they normally do…which is typically putting out all the fires they suffer from on a daily basis.

Ironically, you can trace many of the aforementioned daily fires directly to the use of those GUIs!  Notice I used the term “spoon-feeding” before; that was purposeful.  GUIs used to deploy and configure resources (by engineers, not consumers) are just like SUGAR: sweet going down, but you’re going to pay for it in many ways later.

Why is that?

When you use a GUI, wizard, or even a non-idempotent script to deploy something like a VM, server, OS, app, or even a VLAN, the next time you deploy one you are essentially attempting to rebuild a snowflake with no reference to the original.  This introduces all sorts of unpredictable inconsistencies into the environment, which ultimately result in faults.  Also, testing your GUI-based deployment on a test instance beforehand won’t guarantee you won’t have problems later, since you (the GUI/wizard user) are the most unpredictable component of the deployment, and can’t even guarantee that the test instance matched what you’ll end up with in production.

What’s even worse is that the whole idea of using a deployment wizard implies that you don’t need to know how your [insert tech here] fundamentally works in order to get it deployed.  THIS IS WRONG IN SO MANY WAYS.  How will you know if the [insert tech here] is optimized for your environment?  …that the configuration the wizard chose won’t cause performance or availability issues down the road?  …that you won’t get boxed into a configuration that limits your flexibility later?  I mean, how can you call yourself an architect/engineer/administrator if you don’t actually LEARN the details of the system you’re going to architect/engineer/administer?

If you decide to use a tool like Puppet or Chef, with which you declare what your [insert tech here] should look like, you MUST by definition know how that technology fundamentally works, at least to some degree, right?  You have to make all of your choices up front, in the recipe file for instance, which forces you to understand the available configuration options, and also allows you to apply that configuration to a test instance FIRST, prior to production deployment.  Go try THAT with wizard-based deployment!
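The declare-then-converge idea can be sketched in a few lines of Python.  The resource (a VLAN) and its settings are invented for illustration, but the shape is the same as a Puppet/Chef run: state the desired configuration once, and re-running it changes nothing.

```python
# Declarative, idempotent "converge": apply only the drift between actual and
# desired state. The resource ("vlan") and settings here are hypothetical.
def converge(actual: dict, desired: dict) -> list:
    """Apply desired settings to `actual` in place; return the changes made."""
    changes = []
    for key, value in desired.items():
        if actual.get(key) != value:
            actual[key] = value
            changes.append(f"set {key}={value}")
    return changes

vlan = {"id": 110, "mtu": 1500}                       # current, drifted state
recipe = {"id": 110, "mtu": 9000, "name": "storage"}  # desired state, declared up front

print(converge(vlan, recipe))  # first run fixes the drift
print(converge(vlan, recipe))  # second run: no changes - idempotent
```

Run it twice and the second pass is a no-op; that repeatability is exactly what a GUI wizard can never give you.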

Of course, this is HARDER and it’s MORE WORK.  Up front, at least. It also requires research and knowledge.  You need to learn how the configuration/automation tools work.  It implies testing, which many don’t believe they have the time for (but somehow always find the time to fix stuff when it goes down).  So yes, the first time you do something this way, it WILL take longer.  The first time.  After that, the things you do most often become more and more trivial, and they’re done RIGHT, consistently.  The daily fires start turning into weekly and perhaps monthly fires.  Life starts getting more enjoyable.

And YES…this smacks of “DevOps”.  I’m not talking about this from a development or even an enterprise perspective, though.  I’m talking about all of those small-to-medium-sized businesses who have 1-2 IT folks who run the show, who are always running around with their hair on fire.  I’ve worked with those people for over 20 years and I feel awful at how many personal weekends and nights they lose because stuff is down.

I like to think of it this way: Would you drive on a suspension bridge that was architected and engineered by a GUI-driven wizard, with the architect flying through screen by screen, guessing that the default choice on every screen is “ok for now” and clicking “next”?  Or would you feel better knowing that an architect designed that bridge meticulously, with great forethought and knowledge of bridges and physics in general, and that the engineers thoughtfully built it using the plans as well as best practices applied to the specific environment?

Don’t we as IT infrastructure architects and engineers owe it to our employers and our teammates to apply the same rigor to our work?

Willful Ignorance?

I recently had the honor of speaking to a large group of storage and network engineers on the topic of devOps. My segment was squeezed in between some other content that I’m pretty sure was much more important to them, like product announcements, demos, calls to action, etc.

Why do I think that?

Well, during the segment I asked the crowd a question- “How many of you have read The Phoenix Project by Gene Kim [et al]?”

I counted maybe 15 hands out of…a LOT more. I was flabbergasted.

If you haven’t read this book, it’s highly likely that you do not understand your customers’ problems, and therefore do not understand your customers. One day, you’re going to walk into your biggest customers and they’re going to be very sorry to tell you that you’re not needed anymore, as your offerings (and perhaps sales model) don’t align with their new strategy.

The worst part is, you probably think you’re pretty darn good at this IT stuff. You know your tech (for years now!), you’ve got your speeds and feeds down pat, you have gobs of expertise in this technology or that. Your customers (almost all of them IT folks) come to you with their problems. Perhaps you even socialize with many of these people, and consider these relationships completely safe.

It’s not that your customers won’t need the products and services you currently offer.  It’s just that the way that they CONSUME these will require an understanding (on your part) of their new (or soon-to-be new) models and processes that will drive their “accelerating acceleration”, and yes, I’m talking devOps here.  Your services need to be updated to align the products with these new ways, which means making automation, scripting, and infrastructure-as-code major core competencies.  Show them how your products and services assist or enable their transformative efforts, or somebody else will.  “Somebody else” could be another department (app dev, for instance) that will transform their use of technology outside of IT and REALLY put the screws to your offering.  If the products you specialize in can’t align with this philosophy, it’s time to focus on obtaining new expertise in the technologies that will replace them.

Just yesterday I visited with two customers, both of which had “Shadow IT” instances turn into permanent business transformations, as the “Shadow IT” folks were able to deliver value to the business within days, where the IT ops folks were taking weeks into months to deliver the same services. Times have changed, people.

I wouldn’t even start down the road of learning automation or Infrastructure-as-Code until you’ve READ THIS BOOK. (There are many others, this one will be the most entertaining and therefore most likely to be completed). You need to know WHY all of this is important, remind yourself WHY we do what we do as IT professionals, and understand the nature of your customers’ desires to transform in order to steer your own efforts, both personal and organizational, in the right direction.

Netapp SolidFire: FlashForward notes

I had the privilege of attending the Netapp SolidFire Analyst Day last Thursday, and rather than go through what the company told us (which was all great stuff), what I heard from SolidFire’s customers while there was probably more relevant and important.

I won’t repeat what’s been detailed elsewhere about the new capacity licensing model from Netapp SolidFire, which breaks the storage appliance model by decoupling the software license from the hardware it runs on.  This new model, called “FlashForward”, allows for a flexibility and a return on investment previously unavailable in the enterprise storage market.

The service provider customers that participated in the breakouts unanimously agreed that this new licensing model was going to be a huge win for them.  The most striking point came from one service provider who had different depreciation schedules for software versus hardware – something that couldn’t be taken advantage of with the appliance model.  Since the FlashForward program allows licenses to be moved between hardware instances, the software can now be depreciated over a longer timeframe (in this SP’s case, 7 years).  Hardware refreshes now come at a much lower cost in years 3-4.  This all results in the ability to provision resources to tenants at a lower incremental cost per resource.

This could also have a major impact on the ability of companies to finance/lease SolidFire solutions.  If you’re financing a given set of software over 6-7 years, obviously the monthly bill will be less than if you’re amortizing it over 3 years, while hardware remains at a 36-month lease with its typical residual.  In a given year, that has the possibility of reducing cash out considerably.
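Here’s a quick, simplified illustration of that cash-flow effect.  The dollar figures and the zero-interest financing are assumptions for the sketch, not FlashForward pricing:

```python
# Assumed split of a $1M deal: $600k software, $400k hardware, no interest.
def monthly_payment(principal: float, months: int) -> float:
    return principal / months  # zero-interest simplification

sw, hw = 600_000, 400_000

bundled = monthly_payment(sw + hw, 36)                     # appliance model: everything over 36 months
split = monthly_payment(sw, 84) + monthly_payment(hw, 36)  # SW over 7 years, HW over 3

print(round(bundled), round(split))  # → 27778 18254
```

Same total spend, but the monthly outlay drops by roughly a third just by stretching the software term, which is exactly the lever decoupled licensing hands to the finance folks.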

Enterprise customers weren’t quite as giddy about FlashForward as they were about the SolidFire technology itself; however, the folks there were all IT, not finance.  Service Provider folks tend to be more focused on the economics of resource delivery.  The Enterprise IT folks were all about the “set-it and forget-it” benefits of their SolidFire implementations, with one customer stating that they had only had to call support once in two years, for an event that wasn’t even SolidFire-related.  Certainly, we had happy customers at the Analyst event, but their stories were all about the challenges of choosing a smaller storage vendor (at the time) against major industry headwinds, and having to justify that decision with full proof-of-concepts.  Impressive stuff.

Of course SolidFire has made its name in the Service Provider market; their embrace of automation technology and the devops philosophy is recognized as leading the market.  This is precisely why we should be keeping a very close eye on these folks, as Enterprise IT looks to become its own Service Provider, and automate its on-premises resources in the same manner in which it automates its cloud resources.  Given this automation advantage and the acquisitional flexibility now offered,  SolidFire is going to align very well with enterprises that have historically implemented the straight SAN storage appliance model but are looking to transform and modernize.

Solidfire Analyst Day – 6/2/16 – Live Blog

  • (7:52AM MDT) Exciting day in Boulder, CO, where a bunch of folks from Solidfire (and Netapp) are going to be going through the state of their business, and announce all sorts of stuff.  So, instead of trying to fit everything into tweets, I figured I’d do my best live blog impersonation! Stay here for updates, keep refreshing!


8:07am:  Here we go!  John Rollason, SolidFire Marketing director, takes (and sets) the stage.

8:12am: #BringontheFuture is the hashtag of the day.  Coming up is the keynote from Solidfire founder Dave Wright.  This will be simulcast.  I already see tweets about the new purchasing model, yet nobody’s ANNOUNCED anything.  Oooh, Jeremiah Dooley’s going to do FIVE demos later.  AND, George Kurian hits the stage at 5:15.


8:24am:  John Rollason having a heck of a time trying to pronounce “Chautauqua”, which is where Solidfire will be taking folks for hiking tomorrow. Also- phones on “stun” please.


10:31am: Dave Wright is on stage.

“One storage product cannot cover all the use cases for Flash in the datacenter”.

  • EF- SPEED Demon.  (One person sitting with me called it the “Ricky Bobby” flash storage.)
  • AFF- DATA SERVICES.
  • SOLIDFIRE- SCALE.

This portfolio is now a $700M/year run rate.  If you think Netapp is a laggard, you haven’t been paying attention!!  The Analyst view of the AFA Market WAY over-estimated the impact of hybrid vs all-flash.

Per Dave: “Netapp KILLING IT with all-flash adoption.  Way Way ahead of analyst projections.”  33% of bookings are now all-flash.  The adoption curve is so far ahead of what analysts projected, one wonders what they were thinking.

HA! Graphic: headstone, RIP HYBRID STORAGE.

ANNOUNCEMENT : One platform, Multiple Consumption Models!

ANNOUNCEMENT: Element OS 9: FLUORINE
-VVOLs done right with per-VM QoS, new UI, VRF multi-tenant networking, 3x FC performance, and increased scale with 4 FC nodes now.
Functional AND sexy.

ANNOUNCEMENT: FULL FLEXPOD INTEGRATION!  Converged Infrastructure – VMWare, OpenStack, one of the most compelling converged infrastructure offerings in the market.

ANNOUNCEMENT: SF19210 Appliance- 2x perf, 2x capacity, 30% lower $/GB, >1PB Effective Capacity & 1.5M IOPS in 15U.

“Appliance licensing is too rigid for advanced enterprises, agile datacenters”.
Cost is all up front, licenses can’t be transferred, and data-reduction rates impose uncertainty/risk around the actual cost of storage.

ANNOUNCEMENT: SF FlashForward Storage

License Element OS SW on an Enterprise-wide basis.  Based on Max Prov capacity
Purchase SF-cert’d HW nodes on a pass-thru cost basis

  • Flexible – can purchase SW/HW separately – more efficient spend
  • Efficient – No need to re-purchase when replacing, upgrading, consolidating – no stranded capacity
  • Predictable – no sw cost penalty from low-efficiency workloads.
  • Scalable – usage-based discount scheme, pass-through HW pricing – no unnecessary SW or support cost increases as footprint expands

9:30am: Brendan Howe takes the stage to give a business update.

  • Runs Emerging Products group @ Netapp.
  • Reaching “extended customer communities”
    • Run
    • Grow
    • Transform
  • Scale  (SF) vs. Data Services (AFF) vs. Speed (EF) (same story as before)

Opinion – perhaps this is TOO discrete a model for positioning.  Certainly SF is perfectly appropriate (perhaps MORE appropriate than AFF) for certain classic datacenter storage use cases.  Anyone who sticks to this religiously is missing the point, IMO.

  • Operational Integration –
    • Structured SF Business Unit
    • GTM Specialists Team under James Whitmore
    • Broad scale & pathway enablement
      • global scale , “all-in”
    • Fully integrated shared services
    • SolidFire office of the CTO (headed by CTO Val Bercovici)
    • Must protect this investment

9:47am: James Whitmore takes the stage


  • VP of Sales & Marketing
  • 46% YoY bookings growth
  • deal size >$240k avg
  • 57% Net New Account acquisition YoY
  • 218 Node largest single customer purchase (!!!!)
  • >75% bookings through channel partners
  • >30 countries
  • 77% in the Americas
  • Comcast, Walmart, Target, AT&T, TWC, Century Link
  • 2 segments
    • Transforming Digitals
      • Nike, BNP Paribas
    • Native Digitals
      • Ebay
      • Kinx
    • Cloud/Hosting Providers
      • SaaS
      • Cloud/Hosting
      • Cable/Telco
  • >80% increase in Enterprise Net-new account
    • wins across every major vertical/use case
      • finance, healthcare, SaaS, Energy, VMW/HyperV, Ora/SQL, VDI, Openstack
    • customers truly committed to transforming their datacenters
    • dominance in service provider market
    • 20% YoY increase in # of SP’s, but a 3x+ increase in capacity!
    • Very strong repeat business
  • Channel Integration
    • merge programs
    • enable new-to-SF partners
    • Leverage global distribution network
  • SF Specialist Team
    • Invest to accelerate growth
    • Transfer product capability across org
    • Direct specialists to high-growth segments and products

10:16am – Dave Cahill takes the stage.

  • Talking about FlashForward program.
  • Senior Director, Product & Strategy, Netapp SolidFire
  • “Flexible and Fair”
  • How it works
    • Buy Element OS Cap License
    • Choose a certified SF HW platform
    • Order through SF or preferred channel partner
  • Flexible, Efficient, Predictable, Scalable
  • Tiered pricing model
  • Top left of the graph starts at $1/GB with 20% support; towards the right side of the graph, support is at 25%.
  • OPINION: Even for customers that will purchase capacity licenses and hardware at equal paces, the cumulative tiered discount model ensures that customers who grow are rewarded with a lower incremental cost per TB.  This is similar to Commvault’s recent licensing model change.
  • The analysts are really grilling Dave on the model…question from the field about overlapping licenses during migration…question on ratio of HW/SW costs…
  • If you over-estimate the efficiency ratio, you’re only buying more hardware – NOT more appliances.
  • Snapshots are NOT included in provisioned capacity license (good)- but if you want to hold extended retention, just more hardware.

IMPORTANT: THIN PROVISIONING WILL CONSUME CAPACITY LICENSE at the provisioned amount, whether actually consumed or not.

QUESTION: Is capacity license consumed PRE-efficiency or POST-efficiency?
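For what it’s worth, a cumulative tiered capacity price like the one on the slide can be sketched as follows.  The tier boundaries and rates below are invented; only the $1/GB starting point comes from the graph:

```python
# Cumulative tiered pricing: each additional band of provisioned capacity is
# priced lower than the last. Tier boundaries/rates are hypothetical.
TIERS = [                    # (capacity up to this many GB, price per GB)
    (100_000, 1.00),
    (500_000, 0.80),
    (float("inf"), 0.60),
]

def license_cost(provisioned_gb: float) -> float:
    cost, prev_cap = 0.0, 0.0
    for cap, rate in TIERS:
        band = min(provisioned_gb, cap) - prev_cap
        if band <= 0:
            break
        cost += band * rate
        prev_cap = cap
    return cost

print(license_cost(100_000))  # 100 TB at the top rate
print(license_cost(600_000))  # growth lands in the cheaper bands
```

Growing from 100 TB to 600 TB costs less than 6x the 100 TB price, which is the “customers who grow are rewarded” effect in the opinion above.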

10:50am- Dan Berg & Jeremiah Dooley going to do geek stuff!


  • Solidfire Products
    • ElementOS
      • Fluorine
        • VVols, VASA2 built into the ElementOS SW – on a fully HA aware cluster!
        • New UI
        • Greater FC perf and scale
        • Tagged default networks support
        • VRF VLAN Support
    • Platforms
      • SF19210 – Highest density & performance
        • Sub-millisecond latency – 100k IOPS
        • 40-80TB effective cap
      • Active IQ
        • New- alerting/mgmt framework
        • predictive cluster growth/capacity forecasting
        • enhanced data granularity
        • significant growth in hist data retention and ingest scale
        • >80% leverage Active IQ
    • Ecosystem Products
      • Programmatic- powershell & sdks
      • open source – docker/openstack/cloudstack
      • Data Protection – VSS, Backup Apps, Snapshot offloading
      • VMW- VCP, vRealize Suite

11:00am – Jeremiah Dooley – Principal Flash Architect

VMWare Virtual Volumes and Why Architecture Matters

VMFS – the best and worst thing to happen to storage

Nice slide, RIP VMFS.

VVol adoption is “slow” – takes time for partners/vendors to get their “goodies” into there

Since customers are virtualizing critical apps/DBs now, new feature/release adoption now typically runs a generation behind.

Getting customers into VVOLs

Build policies through VASA2, like QoS (MIN/MAX/BURST)
Move VMs from VMFS into VVols – an 8-line PowerShell script: ID/migrate/apply

Increase the size of VVols – a big difference from VMFS

GOAL: Do all the provisioning/remediation/growth without talking to the storage team.


Why Openstack?

  • Agility – elasticity, instant provisioning, meet customer demand
  • capex to opex model
  • cluster deployment within an hour, given HW is ready
  • Prod Development- self provisioning/ self service- dev can create/destroy environments
  • Why not?  Lack of business cases (per another panelist) – so VMWare is a good fit for those
  • Docker – OpenShift – Docker/Kubernetes/Docker Swarm

Imagine moving an 8-node cluster from one rack to another with NO DOWNTIME


12:00 – LUNCH, back at 1pm.


1:00 – Joel Reich takes the stage.

IMG_2250

Joel is giving the ONTAP 9 feature set- including the easier application-specific deployments and high capacity flash.

ONTAP Cloud- many of Netapp’s larger customers using this for test/dev, some for Prod and they fail back on-prem.

ONTAP Select – Flexible capacity-based license, on white box equipment

Question: When the heck did White Box ever save anybody ANYthing in the long run?

ANNOUNCEMENT: ONTAP Select with SolidFire Backend coming!


5:03pm- George Kurian – Closing Keynote

Enabling the Data Powered Digital Future

– Market Outlook- Enterprise IT spending has SLOWED –

ZING: EMC/Dell is a “Tweener” – not as vertically integrated as Oracle, they don’t do the whole stack

  • Kurian- Netapp doesn’t need to be “first to market”.
  • Q1 Market Share in AFA- Netapp 23%!! (IDC)
  • New Netapp: Broad portfolio that addresses broad customer requirements.
  • Netapp – “We protect and manage the world’s data”
  • FY16 $5.5B, Cash $5.3B, total assets $10.0B
  • 85% of shipments are clustered systems
  • $1B free cash flow PER YEAR.  Talk about stability.
  • Pre-IPO, IPO, and post-IPO stories will be written in red ink
  • 1) Pivot to Accelerate growth 2) Improve Productivity
  • Work effectively to deliver results – Priorities, Integrated execution, Follow through
  • Renew the Organization – Leadership, High Perf Culture, Talent
  • 75% of CEOs will say biggest concern is not strategy but execution
  • Incubate – pre-market (i.e. cloud), complete freedom of thinking, fail fast
    • “prodOps”
  • Grow – focus on biggest markets with biggest opp, inspect at district level, is the channel ready
  • Globalize – mainstream GTM through every pathway possible
  • Harvest – prepare for end of life, use revs to fund new businesses
  • $700M+ AFA Annual Run Rate
  • 80% YoY unit growth cDOT, 30% e-series YoY Unit growth
  • 185% AFA YoY rev growth
  • Transforming Netapp
    • remove the clutter of 20 years of unmanaged growth
    • focus on the best ideas
    • “velocity offerings” – preconfigured offerings
    • fine tune GTM model
    • Shared services
  • Close
    • Fundamental change
    • Good progress and accelerating momentum – more work needed
    • Strong foundation
    • Uniquely positioned to help clients navigate the changing landscape.

That’s it from Boulder, CO!  Off to one of the three dinners that Netapp SolidFire has set up for the crowd tonight!


X.AI – Amy Ingram is now one of my favorite “people”

So. Who the heck is Amy Ingram?   

She’s awesome. She’s organized. She really knows me (mostly because I told her all about me).

Most of all, she’s….not REAL. Don’t hold that against her, it certainly doesn’t inhibit her productivity.

Here’s the background.  I was given the opportunity to beta test x.ai’s offering last week.  x.ai leverages artificial intelligence (thus, the ai) to help you organize your calendar in a much more efficient way.  Like many others, I spend WAY too much time trying to coordinate meetings with people.  We all have so many meetings and calls on our calendars that finding common free time to conduct business (or whatever) usually involves an indeterminate volley of emails, phone calls, and texts – and sometimes even the unfortunate gaze at the phone’s calendar while driving.  Not good.

X.ai presents Amy Ingram (initials…ai) as what READS like a real person via email. She’s actually an intelligent agent that has access to my calendar (google for now, others to come), so she can tell when I’m busy. Since I run on Office365, the challenge was to get Amy the required visibility to my calendar, which I achieved by one-way syncing my O365 calendar with a newly-created Google Calendar (using a separate tool I will NOT endorse, it gets the job done but yikes). More importantly, the agent knows what buffers I need around calls and meetings, where I live, where my offices are, and where I prefer to meet for coffee or lunch.

When I want to get a meeting or call set up, I simply email Amy and the person I want to meet with, and ask for it.  In plain English.  I don’t give a time, I just say “Amy, please find 30 minutes for Dennis and me to talk via phone”.  I can say this in many different ways; Amy has understood every permutation I’ve thrown at her.  And that’s IT.  Amy will email Dennis and give him three different possible times, and he can respond in plain English as well; Amy understands the context.  For instance, if Amy provided three different days to Dennis, he can just reply with “I’ll take Tuesday.”  Amy will know WHICH Tuesday and the times she said I could have the call.  After that, we both get the invite from Amy.  If Amy were human, the emails, responses, and results would be the same.

What’s even better is when I need to set up a meeting with 3 or more people- Amy does ALL of the work for me. No frustrating discussions of when he’s free, when she’s free, and “oh someone booked me so now I’m not free”… Amy handles ALL of the back-forth-back-forth again. I can ask for a status update any time I like, and she’ll respond with what she’s working on, who she’s waiting for responses from, and what she’s gotten done recently.

If I cancel a meeting, she’ll send an email to the invitees telling them that she “wanted to let them know that the meeting needed to be cancelled”. That’s more than I usually get when someone cancels on me.

I’ve demonstrated this to a few of my colleagues, with one universal response: Envy. So, that’s sort of a mission accomplished, right?

I’ve noticed that sometimes it does take some time for Amy to get things done, but I attribute that to the “beta” tag on the solution. So far, Amy has been 100% accurate and all of the people that have received emails from Amy working out schedules have responded in ways Amy understands (which is basically English!).

This technology has LEGS, even beyond the individual executive looking to optimize their time. I can see this being expanded for use by project managers who have to cat-herd 10 people onto a project con call- just IMAGINE the cost savings across the board for a PMO. PMs in my company spend hours coordinating calls on a daily basis. Telemarketers who set up meetings can make more calls and reach more people if they no longer have to coordinate the dates and times.

Further, as the Intelligent Agent evolves, it’s not hard to imagine Amy making the lunch reservations for your meeting as well, buying the tickets for the sports event, booking your flight for that conference, or even adding information into your expense management application since she has all the meeting data already. The sky is the limit with this. AND, it already WORKS.

Kudos to the x.ai team. I’m truly looking forward to further advancements with this technology. In the meantime, I’m going to enjoy this beta as long as I can!

Keep an eye on these guys, they’ve got something very right here.

NPS-as-a-Service: The perfect middle ground for the DataFabric vision?

Everyone pondering the use of hyperscaler cloud for devops deals with one major issue: how do you get copies of the appropriate datasets into a place where they can take advantage of the rapid, automated provisioning available in cloud environments? For datasets of any size, copy-on-demand methodologies are too slow for the expectations set when speaking of devops, which imply a more “point-click-and-provision” capability.

NetApp has previously provided two answers to this problem: Cloud ONTAP and NetApp Private Storage.

Cloud ONTAP is an on-demand Clustered ONTAP instance within Amazon EC2/EBS (and now Azure) that you can spin up and spin down. This is really handy, but it can get expensive if you want the data to be constantly updated from on-prem storage, since the cloud instance must always be up and consuming resources. Further, some datasets may have custody requirements that prevent them from physically residing on public cloud infrastructure.

NetApp Private Storage addresses both of these problems. It takes customer purchased NetApp storage that is located at a datacenter physically close to the hyperscaler, and connects that storage at the lowest possible latency to the hyperscaler’s compute layer. The datasets remain on customer-owned equipment, and the benefits of elastic compute can be enjoyed. Of course, the obvious downside to this is the requirement of capital expenditure- the customer must purchase the storage (or at least lease it). Also, the customer must maintain contracts with the co-location site housing the storage, and do all the work to maintain the various connections from the AWS VPC through the Direct Connect, and from the customer datacenter to the co-location site. It’s a lot of moving parts to manage. Further- there’s no way to get started “small”; it’s all or nothing.

But wait! There’s a NEW OFFERING that solves ALL of these problems, and it’s called “NPSaaS”, or NetApp Private Storage as-a-Service.


This offering, currently only available via Faction partners (but keep your eyes on this space), will provide all of the goodness of NetApp Private Storage, without most of the work and with NONE of the capital expenditure. It’s not elastic per se, but it is easily orderable and consumable in 1TB chunks, on either yearly or month-to-month terms. Each customer gets their own Storage Virtual Machine (SVM), providing completely secure tenancy. It can provide a SnapMirror/SnapVault landing spot for datasets on your on-prem storage, ready to be cloned and connected to your EC2/Azure compute resources at a moment’s notice. You can, of course, simply provide empty storage at will to your cloud resources for production apps as well.

When you consume storage, you’ll also be consuming chunks of 100 Mb/s of throughput from storage to compute. You can purchase more throughput to compute if you want; note that you purchase throughput, not IOPS. You’ll get 50 Mb/s of internet throughput as well. All network options are on the table, of course: you can purchase as much as you’d like, both from storage to compute, and from storage to internet (or an MPLS/point-to-point drop).
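To make the consumption model concrete, here is a minimal sketch of the entitlements as I read them from the offering description. The function name is mine, and the assumption that each 1TB chunk carries its own 100 Mb/s storage-to-compute allotment (with a flat 50 Mb/s internet baseline) is my interpretation, not a published spec:

```python
def npsaas_entitlement(purchased_tb: int, extra_compute_mbps: int = 0) -> dict:
    """Hypothetical baseline entitlements for an NPSaaS purchase.

    Assumes (my reading of the offering, not a published spec):
      - each 1TB chunk includes 100 Mb/s storage->compute throughput
      - a flat 50 Mb/s internet baseline, with more purchasable separately
    """
    return {
        "capacity_tb": purchased_tb,
        "storage_to_compute_mbps": purchased_tb * 100 + extra_compute_mbps,
        "internet_mbps": 50,  # baseline allotment; additional bandwidth is purchasable
    }

# e.g. 3TB purchased plus 200 Mb/s of extra compute-side throughput
print(npsaas_entitlement(3, extra_compute_mbps=200))
```

The point of the sketch is just that throughput, not IOPS, is the purchasable unit, and it scales with the capacity chunks you order.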

How is this achieved?

Faction has teamed up with NetApp to provide physical NetApp storage resources at datacenters close to the hyperscalers, usually ready for use within 3-5 days of completing the paperwork, and ordering will be very simple. This means no additional co-lo contracts and no storage to buy; all you need to do is order your Amazon Direct Connect and VPC, provide that information to Faction, and you’ll be set in a few days to use your first 1TB of storage.

Once set up, you’ll be able to use NFS and iSCSI protocols at will. There are currently some limitations that prevent use of CIFS, but future offerings will provide that functionality as well.

From a storage performance perspective, we’re currently looking at FlashPool-accelerated SATA. Future offerings may provide other options, as well as dedicated hardware should requirements dictate, but not yet. This level of storage performance provides the best $/GB/IOPS bang for the buck for the majority of storage I/O needs in this use case; but if you’re looking for sub-millisecond, multi-100k IOPS performance, “standard” NPS is what you’re looking for.

Faction also provides a SnapMirror/SnapVault-based backup into their own NetApp-based cloud environment, at additional charge. You could also purchase storage in multiple datacenters, SnapMirroring between them for regional redundancy to match the compute redundancy you enjoy from either AWS or Azure.

Note to remember: this offering is NOT a virtualized DR platform. You can’t take your VMFS or NFS datastores and replicate them into this storage with the idea of bringing them up in Amazon or Azure; that won’t work. From a replication perspective, then, this is more for devops capabilities: providing cloned datasets to your cloud VMs.

Also, the management of the storage is almost completely performed by Faction on a service-ticket basis. This means (for now) that you won’t be messing with SnapMirror schedules, SnapShots, etc, which does put a little damper on the devops automation for cloning and attaching datasets, for instance. I’m sure this will be temporary as they iron out the wrinkles.

One other thing from a storage/volume provisioning perspective: since we’re dealing with an on-demand Clustered ONTAP SVM instance here, Faction needs to carefully manage the ratio of FlexVols to capacity, as the number of available FlexVols in a given cluster is not infinite. So you will initially get one volume, and you can’t get a second one until you consume/purchase 5TB for the first, and so on. If you want two volumes of 2TB each, you’ll need to purchase 7TB (5TB for #1, and 2TB for #2). However, there’s no reason you can’t have two 2TB LUNs in one volume, so this isn’t as big a constraint as it may at first seem; you just need to know it up front and design for it.
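The purchase rule above is easy to get wrong when planning, so here is a small sketch of the math. The function name and the generalization to N volumes are mine; the only figure taken from the offering is the 5TB fill requirement per volume before the next unlocks:

```python
def required_purchase_tb(volume_sizes_tb: list) -> int:
    """Minimum TB to purchase for the requested volumes, in order.

    Assumed rule (from the worked example in the post): every volume
    except the last must be filled to 5 TB before the next volume
    becomes available, so it costs max(size, 5) TB; the last volume
    only costs its own size.
    """
    total = 0
    for i, size in enumerate(volume_sizes_tb):
        is_last = i == len(volume_sizes_tb) - 1
        total += size if is_last else max(size, 5)
    return total

print(required_purchase_tb([2, 2]))  # 7, matching the post's example
```

So a single 2TB volume costs 2TB, but asking for it as two 1TB volumes would cost 6TB: consolidating LUNs into fewer volumes is the way to stay efficient.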

Drawbacks aside, this consumption model addresses the costs of a 100%-utilized, always-on Cloud ONTAP instance, as well as the capital and contractual requirements of NetApp Private Storage. It’s certainly worth a look.

Information-235: Using decay models in a data infrastructure context

Information, like enriched uranium, doesn’t exist naturally.  It is either harnessed or created by the fusion of three elements:

  • Data
  • People
  • Processes

This is important to understand – many confuse data with information.  Data by itself is meaningless, even if that data is created by the combination of other data sources, or rolled up into summarized data sets.  The spark of information, or knowledge, occurs when a PERSON applies his/her personal analysis (process) to enable a timely decision or action. 

Once that information has been created and used, its importance and relevance to the informed immediately begins to degrade – at a rate that is variable depending on the type of information it is. 

This is not a new idea- there are more than a few academic and analytical papers that have been published that discuss models for predicting the rates of decay of information, and so I can thankfully leave the math for that to the mathematicians.   However, the context of these academic papers is data analytics, and how to measure the reliability or relevance of datasets in creating new actionable information.

I believe that this decay construct can become a most valuable tool for the infrastructure architect as well, if we extend the decay metaphor a bit further.  

Just as enriching a radioactive isotope transforms it into something ELSE, a decaying isotope transforms into less radioactive ones, and sometimes even those decay further in a multi-step decay chain. We can see that same behavior with information and its related data.

Let’s take the example of a retail transaction. That transaction’s information is most important at the point of sale: there are hundreds of data points that get fused together here, including credit card info/authorization, product SKUs, pricing, and the employee record; just think of everything you see on a receipt and multiply by x. That information is used during that day to figure the day’s sales, that week to figure some salesperson’s commission perhaps, and that month to figure out how to manage inventory.

A subset of that transaction’s information will get used in combination with other transactions to create summarized information for the month, quarter, and year.  The month and quarterly data will be discarded after a few years as well, leaving the yearly summaries.

So in time, the information that was so important and actionable on day one becomes nothing but a faint memory, yet the FULL DATA SET for that transaction is likely going to be stored, in its entirety, SOMEWHERE. That somewhere, of course, we’ll call our Data Yucca Mountain.

What does this mean to the infrastructure architect?

If one can understand the data sets that create information, and understand the sets of information that get created from the datasets as well as “depleted” information (think data warehouses and analytics), then one should be able to construct the math to not only design the proper places for data to sit given a specific half-life, but to SIZE them correctly.
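As a toy illustration of what that sizing math could look like, here is a sketch that models relevance with an exponential half-life and derives steady-state tier capacities from it. All of the numbers, names, and thresholds here are invented for illustration; the post deliberately leaves the real decay models to the mathematicians:

```python
import math

def tier_sizes(daily_ingest_gb: float, half_life_days: float, tier_thresholds: list) -> dict:
    """Estimate steady-state capacity per storage tier.

    tier_thresholds: list of (tier_name, min_relevance) pairs in descending
    relevance order. Data stays on a tier while its relevance (0..1) exceeds
    the tier's threshold; relevance decays as 0.5 ** (age / half_life_days).
    """
    sizes = {}
    prev_age = 0.0
    for name, threshold in tier_thresholds:
        # Age (in days) at which relevance falls to the threshold:
        # threshold = 0.5 ** (age / half_life)  =>  age = half_life * log2(1/threshold)
        age = half_life_days * math.log2(1 / threshold)
        # Capacity is the ingest rate times the number of days resident on this tier.
        sizes[name] = daily_ingest_gb * (age - prev_age)
        prev_age = age
    return sizes

# Hypothetical shop: 100 GB/day of new transaction data, 30-day half-life
print(tier_sizes(100, half_life_days=30,
                 tier_thresholds=[("flash", 0.5), ("disk", 0.1), ("archive", 0.01)]))
```

Even this crude version makes the architect’s point: the half-life, not a generic default, determines how big each tier needs to be, and changing the decay rate of a dataset reshapes every tier downstream of it.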

This model also gives the architect an angle for asking questions of the business users of information (and data), yielding the “big picture” that allows infrastructure to be aligned with the true direction and operations of the business. Too often, infrastructure is run in a ‘generic’ way, and storage tiers are built by default rather than by design.

Building this model will take quite a bit of work, but it will go a long way towards ensuring alignment between the IT Infrastructure (or cloud) group and the business, and provide a much clearer ROI picture in the process.