
Virtualization: Solution or Problem?

September 27, 2009

Is virtualization a solution to a problem or part of the problem?

Christofer Hoff ignited the creative spark with his post “Virtual Machines Are the Problem, Not the Solution”.

In my view and experience, virtualization is part of the problem as well as part of the solution. While automation is the key to fulfilling end-to-end service delivery, virtualization is a necessary technology. However, the current architectural style of service composition, delivery, and management is mired in problems, workarounds, and band-aids, which makes SLA-driven end-to-end service delivery just a promise, not a fulfillment. We should stop dishing out nodes to development teams. We should stop pushing ACLs into switches. We should stop accessing OS primitives from applications. We should stop writing communication patterns into applications. A well-defined abstraction and framework on top of virtualization is essential to make this happen. We can’t ignore change, configuration, and security management. Simply put, the goal is push-button delivery of services into the cloud: securely, reliably, and rapidly. As Christofer Hoff (@beaker) suggested on his blog Rational Survivability, JEOS is the first step in that direction.

Let me first share my view on why virtualization is part of the problem, and then explain why it is also important for end-to-end service delivery.

Why Is It Part of the Problem?

“Geometric complexity” of systems is a (if not the) major contributor to the costs and stability issues we face in our production environments today. In this context, complexity is introduced by the heterogeneity and variations of “OS” needs per application and underlying components (such as databases, networking, and security). This unmanageable, incomprehensible number of variations in the operating environment makes it hard to understand and optimize our compute infrastructure. We continue to invest our scarce resources in keeping this junk alive and fresh all the time. More importantly, 70% of service outages today are caused by configuration or patching errors.

Christofer Hoff (@beaker) puts it very well:

“there’s a bloated, parasitic resource-gobbling cancer inside every VM”.

I was hopeful and optimistic that virtualization would change the way applications are designed and delivered. Rich application frameworks like J2EE, Spring, Ruby etc. evolved, but the operating environment evolved into one big, monolithic, generalized OS, making it impossible to track what is needed and what is not. Adding to this brew, a mind-boggling number of open source libraries and tools crept into the OS. Virtualization provided an opportunity to correct these sins, but in the guise of virtualization we started to commit more of them. Sadly, instead of wiping out the cancerous bits in the operating environment, all the junk got packaged into VMs.

Christofer Hoff (@beaker) raised a very thought-provoking and stimulating question:

“if we didn’t have resource-inefficient operating systems, handicapped applications that were incestuously hooked to them, and tons of legacy networking stuff to deal with that unholy affinity, imagine the fun we could have. Imagine how and flexible we could become”.

This is very true. We have too much baggage and junk inside our operating environment. That has to change. It is not a question of VMware, Xen, or Parallels, or of Linux, OpenSolaris, or FreeBSD. We need a paradigm shift in the way we architect and deliver “services”.

Sam Johnston (@samj) pointed out:

“I agree completely that the OS is like a cancer that sucks energy (e.g., resources, cycles), needs constant treatment (e.g., patches, updates, upgrades) and poses significant risk of death (e.g., catastrophic failure) to any application it hosts”.

Yes, Sam is correct in his characterization of the “Malignant OS”.

Why Virtualization Is Important

@JSchroedl @AndiMann @sureddy Sounds like we’re all in virtual agreement: Not just virtual servers, or even virtual systems, but “Services” end-to-end.

End-to-End Service Delivery: My sense of virtualization is that it provides an abstraction to absorb all low-level variations, exposing a much simpler, homogeneous environment. While this is not sufficient to deliver the automation needed for end-to-end service delivery, it is a necessary technology. Applications and services won’t be exposed to the variations in our operating environment; instead, they will be exposed to a service runtime platform (call it a “container” for lack of a better word) with uniform behavioral characteristics and interfaces. Please note that this “container” is not a VM; it is a much higher-level abstraction that orchestrates hypervisors and operating environments, isolating all the intricacies of virtualization and operations management. We won’t need to qualify innumerable combinations of hardware, OSes, and software stacks. Instead, the Container layer will be the point of qualification on both sides: each new variation of hardware will be qualified against a single Container layer, and all software will be qualified against that same Container layer (quite literally providing a fast-lane change mechanism across development, test, staging, and production: Continuous Integration and Continuous Deployment). This is a really big deal. It helps us innovate and roll out new services much faster than before. Virtualization plays an important role in fulfilling end-to-end service delivery.
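
To make the qualification argument concrete, here is a minimal sketch of the arithmetic. The counts (10 hardware variants, 5 OS variants, 40 software stacks) are made-up numbers for illustration only, and the function names are hypothetical.

```python
def direct_qualifications(hardware: int, operating_systems: int, stacks: int) -> int:
    """Without a common layer, every hardware/OS/stack combination is qualified on its own."""
    return hardware * operating_systems * stacks


def container_qualifications(hardware: int, stacks: int) -> int:
    """With a container layer, hardware and software are each qualified once against it."""
    return hardware + stacks


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from any real environment.
    hw, os_variants, sw = 10, 5, 40
    print(direct_qualifications(hw, os_variants, sw))  # 2000
    print(container_qualifications(hw, sw))            # 50
```

The point is the shape of the growth: the combinations multiply without a common layer, and only add up with one.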

Christofer Hoff (@beaker) pointed out:

“VMs have allowed us to take the first steps towards defining, compartmentalizing, and isolating some pretty nasty problems anchored on the sins of our fathers, but they don’t do a damned thing to fix them. VMs have certainly allowed us to (literally) think outside the box about how we characterize workloads and have enabled us to begin talking about how we make them somewhat mobile, portable, interoperable, easy to describe, inventory, and in some cases more secure. Cool.”

Configurations vs. Customizations: Virtualization also absorbs variations in the configurations of physical machines. With virtualization, applications can be written around their own long-lasting “sweet spots” of service configurations that are synthesized and maintained at the container.

Homogeneity: The homogeneity afforded by virtualization extends to the entire software development lifecycle. By using a uniform, virtualized serving infrastructure throughout the entire process, from development through QA all the way to deployment, we can significantly accelerate innovation, eliminate complexity, and reduce or eliminate the incidents that inevitably arise when the dev and QA environments differ from production.

Mobility: Software mobility, the ability to easily move software from one machine to another, will greatly relax our SLAs for break-fix (because the software from a broken node can automatically be brought up on a working node), and that in turn reduces the need to physically move machines (because we can move the software instead).

Security Forensics: When an app host is to be decommissioned, virtualization presents the opportunity to archive the state of the host for security forensics, and to securely wipe the data from the decommissioned host using a simple, secure file-wipe rather than a specialized, hard-to-verify bootstrap process. In sum, VMMs provide a uniform, reliable, and performant API from which we can drive automation of the entire host life cycle.

Horizontal Scalability: Virtualization drives another very interesting and compelling architectural paradigm shift. In the world of SOA and global serving with unpredictable workloads, we are better off running a service tier (my view of a tier is a load-balanced cluster of elastic nodes) across a larger number of smaller nodes, rather than a smaller number of larger nodes. A large number of smaller nodes provides cost as well as horizontal scalability advantages. In addition, with a larger number of smaller nodes, when a node goes out, the remaining nodes can more easily absorb the resulting spike in workload, and nodes can be added or removed in response to workload.
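
The failure-absorption point can be checked with a little arithmetic. The sketch below assumes the load is spread evenly across the tier and that total load stays constant when a node fails; the function name and the node counts are illustrative only.

```python
def extra_load_per_survivor(nodes: int, failed: int = 1) -> float:
    """Fractional load increase on each surviving node after `failed` nodes drop out.

    Assumes an evenly balanced tier whose total load does not change.
    """
    if failed < 0 or nodes <= failed:
        raise ValueError("at least one node must survive")
    return nodes / (nodes - failed) - 1.0


if __name__ == "__main__":
    # Same failure, two tier shapes: a few large nodes versus many small ones.
    print(f"{extra_load_per_survivor(4):.1%}")   # 33.3%
    print(f"{extra_load_per_survivor(32):.1%}")  # 3.2%
```

Losing one node out of four pushes a third more work onto each survivor; losing one out of thirty-two is barely noticeable.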

Eliminate Complex Parallelism: My experience with symmetric multiprocessing (SMP) systems has shown that effectively scaling software beyond a few cores requires specialized design and programming skills to avoid contention and other bottlenecks to parallelism. Simply throwing more cores at our software does not improve performance. These specialized skills for developing well-tuned SMP software are hard to build, and their scarcity is becoming a great inhibitor to innovation in building scalable services. By slicing large physical servers into smaller virtual machines, we can deliver more value from our investment.
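
One common way to illustrate why extra cores stop helping is Amdahl's law. It is brought in here as an illustration, not as something the argument above depends on. The sketch assumes a fixed fraction of the work can run in parallel, with the rest serialized (a rough stand-in for contention); the 90% figure is an assumption.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: 90% parallelizable, 10% serialized by locks or shared state.
    for cores in (2, 8, 64):
        print(cores, round(amdahl_speedup(0.9, cores), 2))
```

With 10% of the work serialized, the speedup can never reach 10x no matter how many cores are added, which is one reason several small single-purpose VMs can extract more value from a large server than one poorly scaling process.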

Cloud and Virtualization

@JSchroedl: PRT @AndiMann: HV = no more than hammers PRT @sureddy: Virt servers don’t matter.Cloud is a promise “Service” is what counts

Cloud is a promise and Service is the fulfillment. The goal of the cloud is to introduce an orders-of-magnitude increase in the amount of automation in the IT environment, and to leverage that automation to introduce an orders-of-magnitude reduction in our time-to-respond. If a machine goes down (I should stop referring to machines and start emphasizing SLAs instead), automatically move its workload to a replacement, within seconds. If load on a service spikes or SLAs deviate from the expected mean, auto-magically increase the capacity of that service, again within seconds.
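
The “increase the capacity of that service” step boils down to a sizing decision that automation can make on every monitoring tick. Below is a minimal sketch of that decision, assuming a hypothetical per-node capacity figure and a 20% headroom target; the names and numbers are assumptions, not any particular cloud's API.

```python
import math


def nodes_needed(requests_per_sec: float, capacity_per_node: float, headroom: float = 0.2) -> int:
    """Smallest node count that serves the load while keeping `headroom` spare on each node."""
    if capacity_per_node <= 0 or not 0.0 <= headroom < 1.0:
        raise ValueError("capacity must be positive and headroom in [0, 1)")
    usable = capacity_per_node * (1.0 - headroom)
    return max(1, math.ceil(requests_per_sec / usable))


def scaling_action(current_nodes: int, requests_per_sec: float,
                   capacity_per_node: float, headroom: float = 0.2) -> int:
    """Positive result: nodes to add. Negative: nodes to remove. Zero: leave the tier alone."""
    return nodes_needed(requests_per_sec, capacity_per_node, headroom) - current_nodes


if __name__ == "__main__":
    # Hypothetical tier: 10 nodes, each good for 100 requests/s, load spikes to 900 requests/s.
    print(scaling_action(10, 900, 100))  # 2
```

A real control loop would also need damping so that it does not add and remove nodes on every small fluctuation; that is left out of this sketch.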

Hypervisors (virtualization) are as necessary as hammers, but not sufficient. What is needed is “End-to-End Service Delivery”. There is no doubt in my mind that IT is strategic to the business, and if properly aligned with business goals, IT can indeed create huge value. Automation and end-to-end service delivery are key drivers for transforming current IT into a more agile and responsive IT.

Physical machines do not provide this level of automation. Neither do bloated VMs containing cancerous OS images. What we need is a clean separation of the base operating system (uniform across the cloud), platform-specific components/bundles, and application components/configurations. While it is impossible to rip and replace existing IT infrastructure, this layered approach would help us gradually move toward a more agile service delivery environment.
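
One way to picture the three-layer separation is as a composition step that refuses to let layers bleed into each other. This is only a sketch under assumed names: the `Layer` type, the `compose` function, and the package names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Layer:
    name: str
    packages: FrozenSet[str]


def compose(base: Layer, platform: Layer, application: Layer) -> Dict[str, str]:
    """Stack the three layers and record which layer owns each package.

    Raises ValueError if an upper layer re-declares something a lower layer owns,
    which is exactly the kind of bleed-through the separation is meant to prevent.
    """
    owner: Dict[str, str] = {}
    for layer in (base, platform, application):
        for package in sorted(layer.packages):
            if package in owner:
                raise ValueError(
                    f"{package!r} is declared in both {owner[package]!r} and {layer.name!r}"
                )
            owner[package] = layer.name
    return owner


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    base = Layer("base-os", frozenset({"kernel", "libc"}))
    platform = Layer("platform", frozenset({"jvm", "app-server"}))
    app = Layer("application", frozenset({"orders-service"}))
    print(compose(base, platform, app))
```

Because every package has exactly one owning layer, the base can be patched or swapped uniformly across the cloud without auditing each application image.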

8 Comments
  1. September 28, 2009 8:54 am

    So great post, what I really see in this is a tension between rock solid service definition, where the service is so well defined you could almost write an ASIC for it and just deliver deliver deliver… but there is a fly in the ointment, and it’s framework, tools, dependency and library drift, some of it bloat, but some of it required for application advancement. For instance, Twitter originally wanted to be on Solaris, but they had to move because of tool/library availability elsewhere.

    So long as developer productivity is tied into the changing world of

    “Adding to this brew, a mind-boggling number of open source libraries and tools crept into the OS.”

    I see:

    “Sadly, instead of wiping out the cancerous bits in the operating environment, all the junk got packaged into VMs.”

    as being persistent? Maybe JEOS, and brilliant just-in-time ones such as Randy B suggested, can clean up a lot of the bloat, but I do not believe JEOS can clean up all dependencies/drift/etc?

    So when you say:

    “We need a paradigm shift in the way we architect and deliver ‘services’”

    and

    “The Container layer will be the point of qualification on both sides: each new variation of hardware will be qualified against a single Container layer, and all software will be qualified against that same Container layer (quite literally providing a fast-lane change mechanism across development, test, staging, and production: Continuous Integration and Continuous Deployment).”

    My question is, how do you adapt to change? If a hot new must-have developer tool/library/Swifter comes out, you can either tell your developers, sorry, not in our package plan (Google’s method), or play catch-up once they build it in. Even a skinny container will have change management complexity.

    Also, are there chances to pick off certain parts of the service delivery and hard-code them? @monadic from RabbitMQ tells me you could actually make hardware/ASIC just for delivering his messaging software. Does anything like that start to come into play as you get to hyper-scale on an application?


  2. September 28, 2009 11:42 am

    Yup, the OS(es) are now bloated. If we do the cloud right and abstract away the OS (into some deep dark space below) then popping out the OS and replacing it with something better (that does not morph into another bloated something) would be a cloud (IaaS) provider issue. Obviously, the PaaS could be that “abstraction” layer. But then, whose PaaS could that be?

    Realistically, all the stuff out there that “hooks” into all the OSes does make such a vision challenging from a migration standpoint. After all, the “hooks” are the interfaces that make it challenging, similar to pulling out a gasoline combustion engine and replacing it with something that uses a non-gasoline source of fuel to enable movement. Also, the dark forces of OS providers are like the gasoline providers too. They will fight us and try to kill us. All the same, having given 23.5 years of my career to a large OS provider with three letters, I sure will fight back, and quite well, since I think I deserved another 7.5 years with them. So, the game does continue for me.

    A long time ago, ok a short time ago, there was what was called a Universal Turing Machine (UTM). It took something simple and produced something simple. It enabled one to use it to create another UTM that did something else and produced something else. As we evolved, we started layering these UTMs and splat, now we have VMs. Maybe that magical future OS could indeed be modelled on a UTM. I think I need to flesh that thought out anyhow.

    Interestingly, there are so many billions and billions of lines of code (ahh, one day will we talk about the billions and billions of objects too) that invariably will have to be migrated (ok, one means of migration is to throw all that away). That could be challenging since the enterprise has to keep its lights on and all their business logic is buried in that code. Think COBOL. Either way, all the people that know COBOL are rapidly fading away.

    Interestingly, our schools teach people how to write some interesting code (usually something like Java, Python etc). They rarely focus on how to read code. Today, IT spends maybe 80% on maintenance (think read) and 20% on development (think write), while our future readers (of code) are probably 0% and our writers (of code) are probably 100%. Obviously, after my 30 years in IT I think I can read and write code somewhat quite well. So, I think I will just not fade away, for a while.

    Anyhow, here is my process to migrate abstract … albeit, an abstract since the details are indeed my means of livelihood.

    Anyhow, I vote for Python to be the cloud language. It seems cool. Then again, there are lots of cool languages. So, maybe we need something basic like Backus Naur Form (BNF) or ALGOL to develop our “glue” to squash the bloats (OSes). I think I need to flesh that thought out too.

    Good post.

    Derik (never blogs, just lurks and comments) Pereira

  3. September 28, 2009 1:28 pm


    Thanks for the props in your comment. I’d be inclined to recommend a virtual appliance rather than a hardware appliance in this case. Being able to move a managed RabbitMQ VM (or AMI or EMI) around, and retain your application addressing, makes it much much easier and cheaper to keep communication going in an “elastic” setting, i.e. as you scale and as applications come and go in the cloud.



  1. Incomplete Thought: Virtual Machines Are the Problem, Not the Solution… | Rational Survivability
  2. Really Interesting Crap In My Browser Tabs: Poor Man’s | Rational Survivability
  3. JeOS, An Open Source Opportunity | CloudAve