Why it’s time to rethink your data center


Commentary: Much of what’s wrong with on-premises computing starts with how we think about the server, says industry veteran Bryan Cantrill.


The problem with servers, said Oxide Computer cofounder Bryan Cantrill in a recent podcast, is that they aren't servers; they're PCs. And because they're essentially PCs, they come with many of the same problems that plague PCs. Cantrill said: "You've got security problems, quality problems, reliability problems, efficiency problems. It's just problem after problem after problem after problem after problem."

Which is…a problem.

It's a problem because, while a great deal of IT spending is shifting to the cloud, roughly 95% of IT spending remains on-premises. If you're an enterprise that plans to continue running on-premises workloads, it may be time to treat the server as a machine made for a different purpose, and to design it differently, as Cantrill put it. Like how?

Your own personal server

As good as the reasons may be for moving to public cloud computing, there are plenty of good reasons for staying in private data centers, according to Cantrill: economic reasons and latency reasons, for a start.


The problem is that public cloud providers think differently about how they build their data centers. Cloud providers have rapidly innovated in how they build and integrate hardware and software. In private server land, by contrast, companies still build the equivalent of a PC. From Cantrill:

The problem is that when you buy a rack of machines, it’s exactly that: a rack of machines stacked on top of one another. That’s it. Those machines don’t know about one another. They’re not engineered together. They’re in a cabinet. There has been no rack-scale design in the enterprise. There’s been lots of rack-scale design in the hyper-scale vendors.

In the private data center, you’ll find a proliferation of power supplies, cabling, etc. You likely have a display port on the machine, despite there being no logical reason for it to be there. (Cantrill posed the question: “Why do we have a display port on a server? That makes no —— sense at all. The reason we have a display port is because it’s a personal computer….This is emphatically not the most efficient way to do it at all.”)

It turns out that it really matters that servers are built differently from PCs. It matters for lowering costs and improving efficiency. But it also matters for security, as Cantrill stressed:

[At Oxide] we've ripped out the BMC [baseboard management controller] entirely and replaced it with a service processor. So perhaps similar in spirit but totally different implementation. A service processor whose job is to manage a serial line, thermals, environmental, the fans and so on. That's a service processor's job. You do need a little embedded controller to do that. But it doesn't need to be this kind of outgrowth that is the BMC, and it certainly doesn't need to be advertising an IP address and hanging out on the internet, where vulnerabilities in that BMC are now vulnerabilities in the brain stem of your data center.
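
To make the contrast concrete, here is a minimal, purely illustrative sketch (in Rust, the language Oxide favors for its low-level software) of the kind of narrow, fixed-function loop a service processor runs: read the thermals, adjust the fans, repeat. The sensor and fan functions, thresholds, and curve here are hypothetical stand-ins, not Oxide's actual firmware; the point is simply that nothing in this job needs a network stack or an IP address to advertise.

```rust
// Hypothetical sketch of a service-processor-style control loop.
// Everything below (function names, thresholds, the temperature curve)
// is illustrative, not taken from Oxide's firmware.

use std::{thread, time::Duration};

/// Stand-in for reading a board temperature sensor (e.g. over I2C).
fn read_board_temp_celsius(tick: u32) -> f32 {
    // Simulated reading; real firmware would query a hardware sensor here.
    35.0 + 20.0 * ((tick as f32 / 10.0).sin().abs())
}

/// Stand-in for writing a PWM duty cycle (0-100%) to a fan controller.
fn set_fan_duty(duty: u8) {
    println!("fan duty set to {duty}%");
}

/// Map temperature to fan duty with a simple proportional curve.
fn duty_for_temp(temp_c: f32) -> u8 {
    const MIN_TEMP: f32 = 30.0; // below this, fans idle at the floor
    const MAX_TEMP: f32 = 70.0; // at or above this, fans run flat out
    const MIN_DUTY: f32 = 20.0;
    const MAX_DUTY: f32 = 100.0;

    let t = ((temp_c - MIN_TEMP) / (MAX_TEMP - MIN_TEMP)).clamp(0.0, 1.0);
    (MIN_DUTY + t * (MAX_DUTY - MIN_DUTY)).round() as u8
}

fn main() {
    // The whole job: read thermals, adjust fans, repeat. No remote
    // management surface, nothing "hanging out on the internet."
    for tick in 0..5 {
        let temp = read_board_temp_celsius(tick);
        let duty = duty_for_temp(temp);
        println!("tick {tick}: board temp {temp:.1} C");
        set_fan_duty(duty);
        thread::sleep(Duration::from_millis(200));
    }
}
```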

This all started, he suggested, when Intel won the PC market and began extending into servers, saddling the server market with the same litany of security, quality, reliability, and efficiency problems that Cantrill ticked off above. Oxide approaches things differently, starting from the premise that a server is not the same thing as a PC and so should be designed differently.


So, rather than simply sticking a bigger processor in a PC and calling it a server, Oxide is approaching servers as servers. The established server vendors, Cantrill argued, failed to recognize "that the most economical system to build was not just a galactic single machine, but rather smaller machines that were stitched together coherently." They failed to think through the architecture that now permeates the hyperscale vendors' data centers. Oxide is trying to take the lessons the hyperscale vendors have learned and apply them to the roughly 95% of global IT spending that is still centered on on-premises workloads.

Will it work? It's too soon to tell, but Oxide is, at least, asking the right questions about server design. Importantly, the company is also open sourcing its work, both to build trust with customers and to let a community help build this improved data center.

Disclosure: I work for AWS, but the views expressed herein are mine.
