Eight years ago I was sitting in a focus group sponsored by IBM, which was interested in gauging reactions to a potential product offering it called "Utility Computing". The pitch was a solution to a classic problem faced by all administrators of IT infrastructure: finding the correct balance between anticipating growth and making good use of existing equipment when planning hardware procurement. In the classic data center model, a company would manage expensive stacks of hardware, much of which would sit underutilized because it costs less to have extra equipment than to run the risk of not being able to scale. However, few companies have the unlimited budget required to plan for every possible IT contingency. IBM's basic idea was that your computing resources could be provided to you by an external utility service, in the same way as electricity and water, and if you needed more you could simply "turn the tap". This could save a company a lot of money, since you'd only pay for what you used. Additionally, you'd reduce the risk associated with failing to plan correctly for increases in a company's IT requirements.
The concept of Utility Computing was well received by our group. Most of us could see the benefits. Objections included some fear over allowing important company assets, such as proprietary information stored in our data warehouses, to reside outside company walls on the equipment of a third party, possibly even on hardware shared by competitors. For some of us in regulated industries there were legal concerns about data security. In the end, the consensus around the table was that such a service could find a home in our respective infrastructures, but it wouldn't replace them.
The Utility Computing concept IBM showed me was primarily targeted at large companies trying to reduce the risk and costs associated with maintaining large data centers. In contrast, Cloud Computing services, such as Amazon's EC2 (http://aws.amazon.com/ec2), are accessible even to small companies or individuals operating on a shoestring budget. Anyone can create an account from their web browser. This opens up all sorts of possibilities. It can allow a startup to scale quickly without having to incur the expense of purchasing or leasing expensive equipment. It can also allow an established company to purchase extra computing power when it needs a place to experiment but doesn't have available internal resources. Lastly, the availability of prebuilt 'images' from most Cloud Computing providers can allow you to roll out a new service to your clients very quickly, as long as a server image matching your needs exists.
Recently I had the chance to speak with Frank Speiser, who is leading several Cloud Computing projects that should be of interest to the Perl community, and to Catalyst developers in particular.
Q. What is Cloud Computing and why is it important to Perl?
Cloud computing is on-demand, virtualized computing. You don't need to own the hardware, or even see it, in order to run your software. Most cloud computing providers offer the ability to pay as you go and to buy computing power as you need it. In many cases, you configure a "machine image", which is a base OS install plus all of the stuff you'd like to run on it, and then you freeze it to use whenever you spin up an instance of that machine image.
This is the natural outgrowth of "commodity computing". As a software owner or software builder, you really care about access to computing power that is scalable and available. Cloud computing solves a lot of problems that have been a real issue for growing businesses or businesses with peak load times. You used to have to plan to carry enough hardware to cover your peak, plus some additional room for insurance. Now, you just have to be able to balance and scale your application, provided you have thought about architecting it in such a way as to allow it to scale horizontally.
If you properly design your applications so that you can scale them by balancing and routing requests, you can build machine images or application profiles of what you need, and as demand increases, you can scale horizontally. There are Facebook-app-specific cloud computing instances out there, since those apps are self-contained in scope by nature and lend themselves very well to provisioning only when demand warrants it, but large organizations can take advantage of this type of thing too. For instance, why would you run a machine that costs $3-8 per hour with power and maintenance factored in, when you can run a machine for 1/10th to 1/20th of that, so long as the stability of such a setup has been demonstrated? If there's a way to mitigate that risk, it's almost your duty as an IT manager to make sure those cost savings are realized.
There are companies like GoGrid, Joyent, Amazon and Terremark all operating in this space, in slightly different ways. The opportunity to cut your costs is out there, though.
Q. What Cloud Computing related projects exist for Perl developers?
Currently, there are only two Perl projects relating specifically to cloud computing of which I am aware. There's Jeff Kim's great work on Net::Amazon::EC2 (http://search.cpan.org/dist/Net-Amazon-EC2), which allows for the management of Amazon's virtualized services through a Perl API, and then there's the Net::Cloud (link forthcoming) project. I registered that namespace about two years ago, but back then there was really only one viable cloud computing option, and it was a closed beta, which I was fortunate enough to get into with a recommendation from Ruv Cohen over at Enomaly (a provider of cloud management software). The project is due for its initial release on CPAN within the next few weeks. I'm working to roll in as many initial cloud provider services as is feasible. Some of my work extends Jeff Kim's Net::Amazon::EC2 modules, and I have to give him a lot of credit because he was the first one to do a lot of this stuff, so when I needed to settle on a standard, I went with what Jeff had done.
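For readers who want to try Net::Amazon::EC2 themselves, here is a minimal sketch based on the module's documented interface; the credentials are placeholders for your own AWS keys.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::Amazon::EC2;

    # Placeholder credentials -- substitute your own AWS keys
    my $ec2 = Net::Amazon::EC2->new(
        AWSAccessKeyId  => 'YOUR_ACCESS_KEY',
        SecretAccessKey => 'YOUR_SECRET_KEY',
    );

    # List every instance in the account, along with its current state
    my $reservations = $ec2->describe_instances;
    for my $reservation (@$reservations) {
        for my $instance (@{ $reservation->instances_set }) {
            printf "%s is %s\n",
                $instance->instance_id,
                $instance->instance_state->name;
        }
    }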
Q. What is the best way for a Perl programmer to get started developing for the Cloud?
There's already a big resurgence in Perl around the Catalyst (http://www.catalystframework.org/) MVC framework, headed up by Matt Trout, and Moose (http://search.cpan.org/dist/Moose) by Stevan Little. I think people should give those a look, because they get you developing with a proper design pattern that is well suited for projects that organizations want completed RIGHT NOW. To that end, I went and set up two machine images over at Amazon's EC2 that have everything you need to get up and running with an MVC framework, in the time it takes you to register an account and read through the instructions.
The machine IDs at Amazon are:
ami-bdbe5ad4 developer-tools/Debian-Etch_Catalyst_DBIC_TT.manifest.xml
ami-9fbe5af6 developer-tools/Fedora8-Catalyst_DBIC_TT.manifest.xml
One image runs Debian (Etch) and the other Fedora 8.
Take yer pick. The Debian image includes svk and git, but aside from that, they're pretty similar in terms of what you can do. I set them up in such a way that if you actually do something that gains traction, there's source control there for you to call in the cavalry. Just remember to save your work somewhere, because if a cloud instance is terminated for any reason, you lose your unsaved work.
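If you'd rather launch one of these images from Perl than from the AWS console, something along these lines should work with Net::Amazon::EC2; the keypair name is a placeholder for one you have already created in your account.

    use strict;
    use warnings;
    use Net::Amazon::EC2;

    my $ec2 = Net::Amazon::EC2->new(
        AWSAccessKeyId  => $ENV{AWS_ACCESS_KEY_ID},      # your AWS keys
        SecretAccessKey => $ENV{AWS_SECRET_ACCESS_KEY},
    );

    # Start one instance of the Debian Etch Catalyst image listed above
    my $reservation = $ec2->run_instances(
        ImageId  => 'ami-bdbe5ad4',
        MinCount => 1,
        MaxCount => 1,
        KeyName  => 'my-keypair',   # placeholder: an EC2 keypair, so you can ssh in
    );

    my ($instance) = @{ $reservation->instances_set };
    print "Launched ", $instance->instance_id, "\n";

    # When you are done (remember: unsaved work is lost on termination):
    # $ec2->terminate_instances( InstanceId => $instance->instance_id );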
The effort to get up and running ON the cloud is minimal -- and it should be. Architecture and scaling, which have been like a black art for years, shouldn't be so difficult. There's plenty of room to be clever, so smart folks will always be employed in this field. There are better problems to solve. Porting existing apps, and building Perl (http://www.perl.org/) extensions to make that possible, is a great way to get paid while you use cloud services. There's also a real need and opportunity for Perl development in many organizations to simplify architectural designs that grew out of the dive-from-the-hand-grenade necessities of scaling to meet demand. In this economy, saving people money is not going to go out of style.
I remember the first time I saw Brad Fitzpatrick's slides on how he grew LiveJournal (http://www.livejournal.com/), and I remember thinking, "Wow, that is brilliant, but it is also crazy." He has a slide in there specifically telling people not to faint, because that's how complicated it grew to be, managing all those devices. I mean, all that hardware was just amazing... AND IT WORKED! The fact that LiveJournal and Danga produced all that great stuff is a testament to the folks who worked there, and they were doing their thing well in advance of this whole emergence of cloud computing. But the fact is, Brad Fitzpatrick and his team aren't available everywhere, and the risk associated with trying to duplicate that would prohibit most places (in fact all but a few) from ever replicating that success.
Nowadays, you have cloud computing, the Hadoop filesystem (http://hadoop.apache.org/), Perlbal (http://danga.com/perlbal/) and memcached (http://www.danga.com/memcached/) (thanks, Danga!), and either Squid or commercial reverse proxy systems. A lot of these can both be moved to the cloud and be made more service-like. This is right up Perl's alley.
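As a small, concrete illustration of that Danga tooling from Perl, here is a cache-aside sketch with the Cache::Memcached client; the server address and the profile-loading helper are stand-ins for your own.

    use strict;
    use warnings;
    use Cache::Memcached;

    # Assumes a memcached daemon running on localhost at the default port
    my $memd = Cache::Memcached->new({ servers => ['127.0.0.1:11211'] });

    # Stand-in for a real database query
    sub load_profile {
        my ($id) = @_;
        return { id => $id, name => "user $id" };
    }

    my $key     = 'user:42:profile';
    my $profile = $memd->get($key);
    unless ($profile) {
        $profile = load_profile(42);
        $memd->set($key, $profile, 300);   # cache the result for five minutes
    }
    print "Hello, $profile->{name}\n";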
The more you can decouple the service from the hardware, the better off you'll be in your organization. Unless your organization makes its money by selling hardware solutions coupled with software. In which case, grab a life raft now.
These images are a big time saver for people wanting to get started using Perl and Catalyst. Catalyst can take some time to install, particularly if you are limited to an older version of Perl. Here's a recent article with a step-by-step plan for getting started with EC2: http://www.onlamp.com/pub/a/onlamp/2008/05/13/creating-applications-with-amazon-ec2-and-s3.html
Additionally, watch for documentation on getting started with the Catalyst EC2 images to be integrated into the core Catalyst manual, or see the online documentation at Amazon: http://aws.amazon.com/ec2
Come back soon, when we will have a screencast video showing, step by step, how to get started using the EC2 Catalyst development machine images.
Q. What are the next steps for your Cloud Computing plans related to Perl? Where would you like to see Perl in the Cloud Computing space over the next 18 to 24 months?
The goal of this project is to allow for the basics, and build from there. You should be able to start, stop, load balance and cache across one or multiple clouds: think "DBI for cloud computing", with the various "condensers" for each cloud provider or virtualization API playing the same role that DBD drivers do, an approach that has proven a very successful and economical Perl implementation in many organizations. Perl is great at integration and holding things together. As such, it makes an excellent choice for a transfer layer among computing services. Now that the basics of the project are in place and we're about to roll out the initial release, there will be quite a few opportunities for Perl developers to get involved in contributing to cloud computing. There are a lot of brilliant Perl developers out there, and I think they get amped up about new and improving technologies. The potential of virtualized cloud computing is enough to do that, I'd hope. I am inviting anyone who wants to contribute to the project. There's room enough for everyone, and this type of project is going to have a few different outgrowths. There's a lot of room for involvement.
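Since Net::Cloud had not yet shipped when this interview took place, the sketch below is purely hypothetical: an illustration of what the "DBI for cloud computing" idea might look like, with every method and parameter name invented for the purpose.

    use Net::Cloud;   # hypothetical -- not yet on CPAN at the time of writing

    # In the DBI analogy, the provider name plays the role of a DBD driver;
    # all method and parameter names below are invented for illustration
    my $cloud = Net::Cloud->new(
        provider    => 'EC2',
        credentials => {
            access_key => 'YOUR_ACCESS_KEY',
            secret_key => 'YOUR_SECRET_KEY',
        },
    );

    my $instance = $cloud->start( image => 'ami-bdbe5ad4' );
    print "Running: ", $instance->id, "\n";

    # The same calls would work unchanged against another provider,
    # just as DBI code works against any database with a DBD driver
    $cloud->stop($instance);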
The first step is to make a very stable and portable interface to all cloud and virtualized environments. We want to enable people to get up and running on a cloud and demonstrate that it is real, it works, and it is viable. The way to do this is to develop the gateway that allows people to make this happen, and make it as easy to use as possible. I think with this release of Net::Cloud, I have that covered.
From there, we will split the project into providing architectural support and portability on the one hand, and allowing access to a growing number of specialized instances across all the clouds on the other. There will be caching-specific instances, genomic-computation-specific instances, and possibly neural networks, but it doesn't need to be so academic, either. The idea is to provide a uniform interface layer that lets people offer publicly available software as a service (if I can borrow a buzz phrase), and to make it easy for developers to tie it all together. Although I plan to build a gateway REST-based service for access to one or multiple clouds, I don't intend to keep it specifically Perl-centric. The idea here is to provide the bridge and standardize, hopefully letting the good languages (more specifically, good developers) rise to the top.
Looking longer-term, I would like to see a lot of IPC stuff move to allow for slower but more scalable transport across the cloud, or across multiple clouds for redundancy. I am not talking about simple "virtualization" in the sense that comes built into machines rolling out of the systems manufacturers; I mean actual "virtual process communication". Also, databases should be segmented to make full use of the cloud: clustering, replication, and anything you can replace in a database using Hadoop or GFS will probably happen sooner or later, depending on how easy and reliable we developers make the leap. I think that stuff is maybe two years out, but it's coming. It won't be long before that type of talk is not taboo among large organizations. My advice would be to start planting the seeds of these ideas now, and build up acceptance for them in your organization.
For non-Perl developers: the modules 'DBI' and 'DBD' mentioned above refer to the database access framework used universally in the Perl community. DBI is similar to ODBC or JDBC, while DBDs are database-specific drivers. For more information, see the documentation at http://search.cpan.org/dist/DBI
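To make the analogy concrete, here is a minimal DBI example using the SQLite driver (DBD::SQLite); swapping the connection string for another DBD leaves the rest of the code untouched.

    use strict;
    use warnings;
    use DBI;

    # Connect through the SQLite driver; change the DSN to "dbi:mysql:..."
    # or "dbi:Pg:..." and the remaining code stays the same
    my $dbh = DBI->connect("dbi:SQLite:dbname=example.db", "", "",
                           { RaiseError => 1 });

    $dbh->do("CREATE TABLE IF NOT EXISTS hosts (name TEXT, provider TEXT)");

    my $sth = $dbh->prepare("INSERT INTO hosts (name, provider) VALUES (?, ?)");
    $sth->execute("web01", "EC2");

    my $rows = $dbh->selectall_arrayref("SELECT name, provider FROM hosts");
    print "$_->[0] runs on $_->[1]\n" for @$rows;

    $dbh->disconnect;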
Q. Can Perl's Cloud Computing projects do anything to assist people who are using other programming languages?
Absolutely. Like I alluded to in the last question, I'm looking at this as a way to extend past many existing computational limitations. If you have an existing project, setting up cloud images and using a REST-based call to a server running Net::Cloud should help a lot of people doing things that don't have much to do with Perl at all. The idea is to make it so that someone doesn't need to master the ins and outs of Perl, Catalyst, or Net::Cloud to use it.
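Any language with an HTTP client could drive such a gateway. The sketch below happens to use Perl's LWP::UserAgent, but the gateway URL and parameters are entirely hypothetical, since the Net::Cloud REST interface had not been published at the time of this interview.

    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request::Common qw(POST);

    my $ua = LWP::UserAgent->new;

    # Hypothetical endpoint and parameters -- the real Net::Cloud REST API
    # had not been published when this interview took place
    my $response = $ua->request(
        POST 'http://cloud-gateway.example.com/instances',
        [ action => 'start', image => 'ami-bdbe5ad4' ],
    );

    print $response->is_success
        ? $response->decoded_content
        : "Error: " . $response->status_line . "\n";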
I would love the help of other front-end developers doing Ajax and UI work, to help build some slick tools to go on top of this stuff that we would ship with existing cloud management instances, or that you could deploy on your local machine to get up and running, a la YUI.
Q. What can the community do to help? What are the current communication channels (mailing list, IRC, etc.)?
When this release hits, register for some cloud services and try it out. Start a few machine instances, stop them, and then pick one or two services you'd want to test scaling for. Maybe it's encoding video formats, or managing sessions. If you make a plugin that works, abstract it and I'll gladly include it in Net::Cloud and mention you in the release notes as a contributor.
Aside from that, I want to do something here that the Perl community has traditionally overlooked. I want to put a polished package on this, one that includes an easy-to-install, easy-to-use interface. I don't care if that is "trivial". I think the missing piece for wider cooperation and adoption here is simply a cloud solution that looks good at a price that is right. The fact that this release is free should satisfy at least one of those requirements.
Currently, the best place for discussion is the #net-cloud IRC channel at irc.perl.org.
Q. Tell us a little about yourself, your interest in Perl, and what inspired you to start with Cloud Computing. Also, some details about your role in the projects discussed.
I have been interested in computing (and generally in taking things apart to see what they do) from an early age. Sometimes I was even able to put those things back together. This tendency sort of led me out of the business program in college, and into building web sites and doing commerce-based stuff. My first non-static website was built with hand-rolled Perl CGI; then I had a go with Java, then went back to Perl. I've worked with PHP, Python, Ruby, and now I'm back to Perl. I keep coming back to Perl because it seems to create resourceful bridges among diverse systems. This is a reflection of the people in the community as well as the language itself. I have never considered myself painted into a corner with Perl. It's improvisational enough to do new things, but has enough voluntarily enforced conventions to make it work at scale. Sort of like a reflection of a healthy society, in a way. I do not think this was an accident on Larry Wall's part, but that's a topic for another discussion.
People say that bad Perl code is unreadable, but I'd say that code that doesn't run is worse than code that doesn't read well. There are varying degrees of cleverness, readability and elegance in all languages, and it's a balancing act. I think Perl and the community around it are full of intelligent and motivated people who are necessarily skeptical and inquisitive. Some code is great and some isn't. There's a guy, a legend basically, Paul Lindner, at Hi5 right now, and he has a knack for taking a complex problem and solving it in a simple, solid-state way, while being pretty accommodating to anyone who has to read the code. That's the kind of Perl we should all be shooting for. There's another guy I met while in Boston.pm, named Ronald J. Kimball, and he writes some impressively well-presented code as well. Some of these guys have moved up and moved on, because their skill set is so valuable, but I think the culture we have in place in the Perl community still breeds that type of meritocracy. Stevan Little and the Moose guys are an example of that. They're doing great stuff. The team at Shadowcat is doing great stuff as well, and the Moose, Catalyst and DBIx::Class communication channels are very active and helpful.
My involvement with this cloud initiative is that I am the author and maintainer of the Net::Cloud modules and namespace, and I have been using cloud computing services since Amazon made them available in 2006. I had a semi-successful Beowulf cluster cobbled together with some buddies back in maybe 2001 or so, so using and sharing excess computing power has always been interesting to me. I even ran that SETI app when it first came out, but I stopped when it produced no publicly reported aliens ;)
Recently, I have seen how impractical it has become to sustain a static top end of your computational ability. It's expensive, especially if you ever start trying to solve bigger problems or to scale. To that end, I have started this project, and I'm working with some great folks on getting it off the ground. I'm looking forward to building on this, because I think it is the next frontier in computing and connecting systems. There's already a bunch of bright people involved with it, and there's always something new to learn or to try while paying the bills, which is why most of us got into this profession to begin with.
Some additional resources:
Note: I followed up with IBM regarding their On Demand services and how they were positioned against Cloud Computing services such as Amazon's EC2. I received the following email in response:
"Thank you for your feedback to the On
Demand Community site. We are the volunteer site for IBM employees and
retirees not the On Demand Business site. I did search on www.ibm.com
and the link below provides information about On Demand Business. I
hope this provides the information you need. "
http://www.ibm.com/Search/?q=ondemand&v=16&en=utf&lang=en&cc=us&Search=Search
However, I found the search page that link returned not very useful. I have sent a request for additional information and will relate anything I receive in response.