One of my best friends in life started his professional career at Carnegie Mellon University, where for a while he worked (back in the 80’s) on the challenges surrounding parallel computing. Back then, it was a relatively esoteric field, in which one of the challenges was finding problems that lent themselves to parallel approaches, and another was trying to build programming models that made those problems tractable.
Spooling forward about 20 years (yipes), during the recent Boston Red Sox victory, MLB.com served 100,000,000 page views to 10,000,000 unique visitors, each doing roughly the same thing. Talk about massive parallelism: it’s in front of our eyes. The internet itself has yielded the world’s largest parallel applications – from instant messaging to bidding on Beanie Babies.
Now Sun has long been an investor in parallelism. First, we built systems capable of managing tremendous load (we’re honored to supply the infrastructure under America’s baseball addiction); and as importantly, we built an operating system that knew how to manage multiple threads of execution. Managing parallel “threads,” historically one per CPU, is key to scalability – simply put, the more work you’ve got to do, the more CPUs you throw at the problem. And having an operating system that knows how to run efficiently across hundreds of CPUs is a handy thing, at the core of Solaris’s reputation for “scalability.” Standing on the shoulders of giants, the Java platform was built with parallelism in mind, too.
Now oddly enough, despite the crises introduced when businesses have insufficient capacity, businesses more often have too much capacity – average utilization in a datacenter is something like 15%. Which means most businesses waste enormous sums of money on systems (not to mention the concomitant waste in power to keep dormant systems on, cooled and housed). Mainframes, historically, had very high utilization. Why? They were (and are) incredibly expensive, but they also have a feature called “logical partitioning” (LPARs) which allows big systems to be divided into many smaller mainframes. Until this year, no non-mainframe operating system offered logical partitioning. (Paraphrasing Gilder, “you waste what’s cheap.” Jonathan’s corollary, “Until you build your whole datacenter out of it.”) That is, until a few weeks ago.
One of the key features in Solaris 10 is just this – “containers” are logical partitions that allow a single computer to behave like an unlimited number of smaller systems, with little/no overhead. Reboot a partition in 3 seconds, keep disparate system stacks on the same computer, assign different IP addresses or passwords to each, treat them like different computers, and use them to consolidate all those otherwise 15% utilized machines – sky’s the limit (on any qualified hardware platform), and with it, customers can now drive utilization through the roof. With no new licensing charges. (And personally, I’m a fan of 3 second reboots.)
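To put rough numbers on the consolidation argument, here is a back-of-the-envelope sketch in Python. It assumes identical machines whose loads simply add up, and the 60% target is a made-up figure for illustration; only the 15% average comes from the discussion above. The function name `hosts_after_consolidation` is hypothetical.

```python
def hosts_after_consolidation(servers: int, avg_util_pct: int, target_util_pct: int) -> int:
    """Estimate how many hosts remain after packing workloads into containers.

    Assumes identical machines and loads that add linearly (a simplification).
    Utilization figures are whole-number percentages.
    """
    if servers < 0 or not 0 <= avg_util_pct <= 100 or not 0 < target_util_pct <= 100:
        raise ValueError("servers must be non-negative and utilization a percentage")
    total_load = servers * avg_util_pct  # measured in "percent of one machine"
    # Ceiling division: a partly used host is still a whole host.
    return -(-total_load // target_util_pct)


# 100 machines at the 15% average cited above, consolidated to a
# hypothetical 60% target utilization.
print(hosts_after_consolidation(100, 15, 60))  # 25
```

Under those assumptions, three out of every four machines could be switched off. Real workloads peak at different times and rarely pack this neatly, so treat the result as an upper bound on the savings.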
But back to baseball. One thing to recognize with businesses like MLB.com (and Google and Amazon and eBay) is that system level performance is now all about parallelism – defined as the art of behaving well when 3,000,000 baseball viewers (or searchers or shoppers or bidders) arrive to use your service. Sun, in fact, saw two years ago what Intel saw this year, that the gigahertz race was over. So we biased our entire system roadmap to “thread level parallelism,” and started designing systems with many threads, rather than one, per CPU. Most SPARC systems now ship with two threads of execution per socket (standard in all UltraSPARC IV systems). But that’s just a baby step toward true parallelism.
How parallel can we get? Niagara chips, built into our upcoming Ontario systems, will feature 8 cores, each with 4 parallel threads of execution – 8 times 4 yields a 32-way system – on a single chip. These systems will consume far less electricity and space than traditional system designs – and will be optimized for MLB.com style applications: thread sensitive, big data, throughput oriented apps. Moreover, they’ll drop our customers’ power bills and real estate costs – which may not sound like the class of problem today’s CIO cares about… until you actually talk to a CIO. Massive power and space bills are a big problem, and the physics of cooling a space heater is a more popular topic than you’d think. (btw, a dirty little secret – remember California’s power crisis a few years back? One of the leading suspects? Computers chewing up huge amounts of power, and producing heat, which required air conditioners, which chewed up even more power…)
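The arithmetic, and the kind of workload such a chip targets, can be sketched in a few lines of Python. This is only a software analogy: the worker pool below stands in for hardware threads, `handle_request` is a hypothetical placeholder for serving one page view, and nothing here models how Niagara itself schedules work.

```python
from concurrent.futures import ThreadPoolExecutor

CORES = 8                 # cores per chip, from the text
THREADS_PER_CORE = 4      # hardware threads per core, from the text
HARDWARE_THREADS = CORES * THREADS_PER_CORE  # 32


def handle_request(visitor_id: int) -> str:
    """Hypothetical stand-in for serving one small, independent page view."""
    return f"scoreboard for visitor {visitor_id}"


def serve(visitor_ids: range) -> list:
    """Fan many similar requests out over one worker per hardware thread."""
    with ThreadPoolExecutor(max_workers=HARDWARE_THREADS) as pool:
        # map() preserves input order even though the work runs concurrently.
        return list(pool.map(handle_request, visitor_ids))


pages = serve(range(1000))
print(HARDWARE_THREADS, len(pages))  # 32 1000
```

The point of the sketch is the shape of the work: many small requests that do not depend on one another, which is what lets throughput grow with the number of threads rather than with clock speed.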
As we scale out these systems, it’s perfectly reasonable to expect greater and greater levels of parallelism. And the good news is not only do Solaris and Java (and most of the Java Enterprise System) eat threads for lunch, but with logical partitioning, we can deploy multiple workloads on the same chip, driving massive improvements in productivity (of capital, power, real estate and system operators).
But let’s not stop there. Simultaneously, much the same inefficiencies described above have been plaguing the storage world. A few years back, “SSPs,” or storage service providers, began aggregating storage requirements across very large customer sets, providing storage as a service. Most SSPs found themselves stymied by the diversity of customers they were serving. Each customer, or application opportunity, posed differing performance requirements (speed vs. replication/redundancy vs. density, e.g.). This blew their utilization metrics. Before the advent of virtualization, SSPs had to configure one storage system per customer. And that’s one of the reasons they failed – low utilization drove high fixed costs.
So that was the primary motivation behind the introduction of containers into our storage systems. The single biggest innovation in our 6920’s is their ability to be divvied up into a herd of logical micro-systems, allowing many customers or application requirements to be aggregated onto one box, with each container presenting its own optimized settings/configurations. This drives consolidation and utilization – and when linked to Solaris, allows for each Solaris container to leverage a dedicated storage container. Again, driving not simply scale, but economy.
On top of all this, the same challenge has plagued the network world – diverse security requirements, and a desire to partition networks into functional or application domains, have driven a proliferation of “subnets” for applications, or departments. HR, Finance, Operations and Legal, for example, each require their own VLANs (virtual local area networks), the result of which is a gradual increase in partitioning, paired with a creeping inefficiency in network utilization – as the static allocation of subnets outpaces anyone’s ability to manage them. (If you recall, prior to their downfall, Enron – one of the beneficiaries of California’s power crisis – was setting out to create a market for surplus network capacity – nice idea, turned out to be tough to execute.)
This was the primary motivation behind Sun’s building containers into our application switches – the devices that now sit in front of computing and storage racks, to help optimize performance of basic functions (network partitioning, security acceleration and load balancing, for example). The network itself can be divided into individual network containers, or virtual subnets, and programmatically reprovisioned as loads change.
Meaning that a customer can now divide any Sun system into logical partitions or containers, each of which draws on or links with a logically partitioned slice of computing, storage and networking capacity. Which presents the market with an incredible opportunity to drive utilization up, and to stop being one of the most inefficient (and environmentally wasteful – where are the protests?) industries around.
Which is a long way of saying the internet is the ultimate parallel computing application – millions, and billions, of people doing roughly the same thing, creating a massive opportunity for companies that solve the problems not only with scale, but with economy. The unit of computing has been detached from the CPU, and attached to whatever a baseball fan wants at MLB.com. Or a bidder wants at eBay. Or a buyer at Amazon. Can you imagine how big a datacenter MLB.com would have to build if we were still in a mode of thinking each customer got their own CPU?
Just think about that power bill.
Some other thoughts:
What happens to software licensing in a virtualized world? What’s a CPU in a per-CPU license when the system you’re running has 32 independent threads? An anachronism in my book. Can you imagine if MLB.com charged by the CPU? That’s why all software from Sun, from the OS to the middleware, will be priced by the “socket” or employee. We believe the rest of the industry should move in the same direction.
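A small Python sketch makes the ambiguity concrete. The chip shape (1 socket, 8 cores, 4 threads per core) is the Niagara configuration described earlier; the price of 1,000 per license unit is invented purely for illustration, and the names `Chip` and `units` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    sockets: int
    cores_per_socket: int
    threads_per_core: int

    def units(self, basis: str) -> int:
        """Count license units under one reading of the word "CPU"."""
        if basis == "socket":
            return self.sockets
        if basis == "core":
            return self.sockets * self.cores_per_socket
        if basis == "thread":
            return self.sockets * self.cores_per_socket * self.threads_per_core
        raise ValueError(f"unknown licensing basis: {basis}")


PRICE_PER_UNIT = 1_000  # hypothetical price, not a real list price
chip = Chip(sockets=1, cores_per_socket=8, threads_per_core=4)

for basis in ("socket", "core", "thread"):
    print(basis, chip.units(basis) * PRICE_PER_UNIT)
# socket 1000
# core 8000
# thread 32000
```

Under these assumptions the same box is billed anywhere from 1 to 32 units, depending only on what “CPU” is taken to mean.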
Who’s the ultimate beneficiary of this mass virtualization? In the short run, customers who can now both recover dormant capacity and boost productivity (consolidate to Solaris 10, UltraSPARC IV, our 6920’s or our app switches – have yourself a “look at all this capital I freed up!” experience, and guarantee yourself a spot at your CFO’s summer party).
But the ultimate beneficiary may be the company that deploys all these systems – and can link them together, as well as dynamically provision across the infrastructure in its entirety. The combinatorics are staggering – thousands of containers, against thousands of threads, against the same orders of magnitude in storage and network partitions. That’s some serious scale. Requiring some serious economy in provisioning and operation.
So what business could possibly require or operate infrastructure at that scale? Sun’s Grid, of course. No reason to think we won’t be serving one of the largest markets in the world – driving utilization to be both prudent, and responsible.