What is the “Cloud”?
So, I was at a gathering of friends and one of them asked me: “I keep hearing about the ‘Cloud’, but I don’t understand it. What is the Cloud?” It’s an interesting question because, like its meteorological namesake, the Cloud is ephemeral and changing all the time.
Let’s start with the name: Cloud. Like many things in technology, the name is largely a marketing term, though one that had its genesis in the technical realm. For years, my colleagues and I used the picture of a cloud in architecture diagrams to indicate network paths and services. Sorta like: I’m in Boston communicating with you in Palo Alto, and between us I’d show a cloud to indicate that there are messaging services here, but exactly how the messaging happens is not important to the overall architecture of our work. A few years ago the term cloud was co-opted to mean any storage and services delivered over a network, usually the Internet. Again, the “how” is not as important as the “what”.
So, what is the Cloud (with a capital “C”)?
It’s an all-encompassing metaphor for any and all services that are delivered via the Internet (note: there are Intranet cloud services, but let’s focus on what the average consumer sees). As the figure above shows, these services run the gamut of computing services that in the past might have been run on local computers. More specifically, if you use Gmail, Dropbox, on-line photo storage, on-line backup (e.g., Mozy or Quicken on-line backup), or synchronization of your address book across your devices and computers, you are leveraging the Cloud.
OK … I get that, but what is it really?
It’s really a fancy term for on-line computing services and storage provided by a given company. Though the term is relatively new, these services have been around for 20+ years. Think about AOL’s “You’ve got mail!” To make this discussion more concrete, let’s use Google as an example. Google provides a number of services, including search (one of the original Cloud services), an email service called Gmail (another original Cloud service), Google Drive, which is on-line storage, and Google Apps, which provides on-line applications that resemble Microsoft Office.
When you want to use one or more of these services, you browse to one of their URLs, most likely via google.com. This connects you with a server in one of Google’s massive data centers. These data centers contain communication servers that focus on serving Web traffic, storage servers that store your backups, documents and photos, and compute servers that provide on-line apps like Google Apps. Without these Cloud services, you’d need to maintain servers in your home or office to provide these capabilities if you needed them. That would be expensive and a nightmare to set up and maintain.
Tell me a little more about data centers?
Data centers are buildings with large rooms that house racks upon racks of servers and communications gear. A given rack contains 10 to 20 rack-mounted computers. For storage servers, there are several rows of disk drives (typically with 16 drives per row) with a few computers to manage the storage. Racks are tightly packed side-by-side for the length of the room, with several rows of racks filling out its width. Between all the racks is a serious amount of copper and fibre communication cabling. This is a lot of computer power and storage space. Since these servers throw off a significant amount of heat, cold air is pulled through the racks to keep the components cool.
Modern data center operations are highly automated. When coupled with highly resilient machines, they do not require many personnel to manage and maintain the systems. Smaller data centers are typically run “dark”, which means that unless there is a hardware malfunction, there are no personnel on-site.
The real cost (besides capital costs) of running a modern data center is electricity and air conditioning. With this in mind, many data centers are sited in regions that have cheap electricity, which usually has the added benefit of lower personnel costs. These centers use a remarkable amount of power to run the servers and the AC.
One more comment about data centers. Large Cloud enterprises have data centers scattered all over the US and internationally. This provides local presence to the consumer as well as fail-over in the event of a disaster impacting a given data center. One of the lessons of events like 9/11 and Superstorm Sandy was the recognition that data and services need to be replicated across two or more data centers that are geographically separated by long distances (usually several hundred miles).
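To put a number on “geographically separated”, here is a minimal Python sketch that checks whether two candidate sites are far enough apart. The 300-mile threshold, the function names and the rounded city coordinates are all assumptions made for illustration; real providers weigh many more factors (power grids, flood plains, network routes) than straight-line distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def great_circle_miles(a, b):
    """Haversine distance in miles between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def far_enough_apart(a, b, min_miles=300):
    """True if the two sites are at least min_miles apart (threshold is an assumption)."""
    return great_circle_miles(a, b) >= min_miles

# Approximate coordinates, rounded for illustration only.
BOSTON = (42.36, -71.06)
PALO_ALTO = (37.44, -122.14)
PROVIDENCE = (41.82, -71.41)

print(far_enough_apart(BOSTON, PALO_ALTO))   # coast to coast
print(far_enough_apart(BOSTON, PROVIDENCE))  # neighboring cities
```

A Boston and Palo Alto pair passes the check easily, while Boston and Providence are close enough that a single regional storm could plausibly affect both.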
What about security?
That’s a particularly important topic, and complex to boot. Cloud purveyors have a responsibility to safeguard their infrastructure and your data. They have an obligation to ensure that their service remains running, secures your data and doesn’t serve malware of any kind. This requires physically securing the data centers and communication paths, as well as monitoring the traffic accessing their servers. Major Cloud providers have sophisticated automated and manual procedures to ensure the integrity of their systems. Integrity is at the heart of an enterprise’s brand. A break in integrity can significantly harm that brand, which will certainly show up on the bottom line.
That said, the most important issue for the consumer is: how is data flowing to and stored in the Cloud secured? You need to be concerned when your data is in transit (between your computer and the Cloud servers) and when it is at rest (on Cloud storage). You should never engage a Cloud service that doesn’t encrypt your data in transit. Look for the URL to start with “https://” and/or a lock icon to indicate that you have an SSL connection (see the Glossary for more information on encryption). This will ensure that your data is protected between your device and the Cloud.
What about stored data? That’s a little more difficult to determine. Many providers prefer to store your data unencrypted because that lets them reduce storage costs by running de-duplication algorithms that eliminate redundant copies. For example, say you and 20 other folks have a copy of a document attached to emails stored at Gmail. Google could detect that and only store one copy. The rationale is that the data centers are physically secure, so encryption at rest is unnecessary. For emails, that’s probably OK, since you’ve likely sent them unencrypted across the Internet, where they are far more likely to be intercepted than at Google’s data center. However, for backups, photos, contacts and other sensitive data, it’s crucial that the data be encrypted, with the keys properly safeguarded. Dropbox and Apple now encrypt your data at rest, but you’ll need to do some investigation of your favorite Cloud service.
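For the programmatically inclined, the “https://” check can be sketched in a few lines of Python using only the standard library. The function names are my own, and the second function is an optional extra that actually opens a connection (so it needs network access); treat this as an illustration of the idea rather than a complete security check.

```python
import socket
import ssl
from urllib.parse import urlsplit

def uses_https(url):
    """True if the URL explicitly asks for an encrypted (HTTPS) connection."""
    return urlsplit(url).scheme.lower() == "https"

def negotiated_tls_version(host, port=443, timeout=5.0):
    """Connect to host and return the negotiated protocol name, e.g. 'TLSv1.3'.

    The default context verifies the server's certificate and hostname,
    so a failed verification raises ssl.SSLError instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()

print(uses_https("https://www.google.com/"))  # encrypted in transit
print(uses_https("http://example.com/"))      # plain HTTP, not encrypted
```

Note that a bare address like “google.com” has no scheme at all, so `uses_https` reports False for it; your browser decides which protocol to use in that case.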
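To see how de-duplication saves space, here is a toy Python sketch that stores each distinct piece of content once, keyed by its SHA-256 hash. The `DedupStore` class and its methods are hypothetical names for illustration; this is not how Google or any other provider actually implements storage.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self._blobs = {}  # content hash -> the actual bytes
        self._refs = {}   # (owner, name) -> content hash

    def put(self, owner, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # store the bytes only if new
        self._refs[(owner, name)] = digest    # every owner still gets a reference
        return digest

    def get(self, owner, name):
        return self._blobs[self._refs[(owner, name)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())

store = DedupStore()
attachment = b"quarterly report " * 1000

# You and 20 other folks each "store" the same attachment.
for i in range(21):
    store.put(f"user{i}", "report.doc", attachment)

print(len(attachment) * 21, "bytes without de-duplication")
print(store.stored_bytes(), "bytes actually stored")
```

This also shows why encryption at rest gets in the way: if each user’s copy were encrypted with that user’s own key, the 21 stored copies would no longer be byte-for-byte identical, their hashes would differ, and the provider would have to keep all of them.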
Some final thoughts:
- It’s nearly impossible to utilize modern technology without touching the Cloud. It’s important to think about how you use, and would like to use, Cloud services. You then need to do your homework to ensure that the services you will be using are properly protecting your data. Most reputable Cloud purveyors will have information about how they safeguard your data.
- That said, there is likely little to no recourse if your data is compromised or lost. This is especially true if you are using free services.
- Take a realistic view of what you are storing and transiting within the Cloud. Again, data like email should be encrypted from your computer/device to the service (to prevent interception while using public Wi-Fi). Encrypting email at rest is preferable but not crucial (though how companies like Google and Yahoo use your email to target ads is a different, but important, privacy concern). Almost all other data should be encrypted both in transit and at rest. Encryption at rest protects your data both in the data center and when a disk is stored off-site for backup purposes.
- If you have control, only keep data in the Cloud for as long as required. I use Dropbox for sharing documents between myself and others. I leave sensitive documents in Dropbox for only as long as they are needed, then I pull them down.
- Please keep in mind that even if you remove data from the Cloud, there might be backups of the data. Though it is good policy to remove data that is no longer needed, don’t make the mistake of assuming it can’t be recovered if subpoenaed.
Finally, this is yet another example of the balance between useful functionality and privacy. To get the former, you are putting the latter at risk. Making explicit, well-researched and thought-out choices should help you preserve the security of your data.