“We’re soon going to hit a crossover point where there will be more computing power at the Edge than in data centers or the cloud,” Justin Boitano, Nvidia’s senior director of enterprise and Edge computing, told DCD.
His prediction is shared by many in an industry preparing for a world where sensors proliferate, Internet of Things systems populate our cities, and the Edge rules.
But with no single computing architecture dominating this growing field, the shift means a huge market is potentially up for grabs – and everyone wants a piece of it. A fight is under way between leading chip designers, cloud giants, and plucky startups, all of which want to build the processors that will run the Edge.
The race to the Edge
Intel is one such contender: With the company’s CPUs dominating the data center, it hopes to translate that success to the Edge.
“What we’ve done with the second generation Xeon Scalable Processor is put features in there that make it really ideally suited for processing at the Edge, and the Edge could be either close to the devices themselves, or the Edge of the network,” Intel senior principal engineer Ken Shoemaker said.
Senior network engineer Tim Verrall linked this to the telephone network, whose exchanges (often referred to as “central offices”) are important in many Edge network proposals. They serve as convergence points for local telephone systems (the “local loop”) and the telecommunications providers’ long-haul backbone networks, and have pre-existing power and cooling infrastructure.
Next generation processors are “ideal for the next generation central office, which is in the early stages of deployment,” said Verrall. “There’s a large number of central offices, and they are typically within about 10 miles of the end point – that’s where the Edge is being deployed today.”
Verrall continued: “The amount of traffic that 5G is likely to put on the network is really going to force services to be offered at the Edge, and the telecoms providers are going to require this offloading of data. Otherwise, their backbone networks are going to be overwhelmed.”
But a local loop of up to 10 miles may place the central office too far from the Edge for some applications: Self-driving cars, for example, can tolerate almost no latency, and companies like Renesas, NXP, Tesla and Intel are competing to develop hardware that runs in the vehicle itself.
For other connected Edge devices, like security cameras, investing in on-device hardware that does some pre-processing yields savings. “If you think about that camera – say it is focused on a door, 99 percent of the time the door is closed, so that video sensor can make the assessment that the door is closed, nobody’s walking in or out, and you don’t have to send any data back,” Mohamed Awad, Arm’s VP of infrastructure business, told DCD.
“At some point, the door opens, and somebody walks through the door, and the video sensor may distinguish that this isn’t a person that’s supposed to walk through the door, and therefore wants to send it back to a mobile Edge computer, which then does some facial recognition to determine who it is,” Awad explained. “And then it sends that data up to the cloud, and the cloud does some further analysis.”
Awad sees it as an end-to-end system, where “heavy compute is going to run closer towards the core, towards the data center, while the lighter compute is going to run more towards the Edge where there’s more sensitivity around power and cost and all that kind of stuff.”
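The door-camera scenario Awad describes can be sketched as a simple routing decision. This is a minimal illustration, not Arm's implementation: the `Frame` fields and the tier names are hypothetical, chosen only to mirror the steps in the quote.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    # Hypothetical flags standing in for what the on-camera sensor can infer.
    door_open: bool
    person_present: bool
    person_expected: bool


def tiers_for(frame: Frame) -> List[str]:
    """Return the tiers a frame touches, from the sensor toward the core."""
    tiers = ["camera"]  # every frame is assessed on the device first
    if not (frame.door_open and frame.person_present):
        return tiers  # door closed or nobody there: send nothing back
    if frame.person_expected:
        return tiers  # the sensor recognizes the visitor as expected
    tiers.append("mobile edge")  # facial recognition on the Edge computer
    tiers.append("cloud")        # further analysis in the cloud
    return tiers


print(tiers_for(Frame(door_open=False, person_present=False, person_expected=False)))
print(tiers_for(Frame(door_open=True, person_present=True, person_expected=False)))
```

Under these assumptions only the rare unexpected visitor generates traffic beyond the camera, which is where the bandwidth savings come from.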
Arm, while it has struggled in the server CPU space, has a huge footprint at the Edge: its chip designs are found in more than 130 billion products, including essentially all smartphones. Its owner, Japanese telecoms giant SoftBank, sees a much larger market ahead – touting the lofty goal of one trillion devices. “It’s not that far off,” Awad said.
This explosion at the Edge is coming at the same time as another massive transformational change: Artificial intelligence. Take the security cameras – each could use AI processing to filter out unnecessary video data and highlight relevant anomalies, something Nvidia hopes to cater for with its new EGX Platform, a reference architecture coming in various sizes, from the tiny Jetson Nano, up to a full rack of T4 servers.
“Depending on how many cameras you’re trying to analyze, there’s going to be a range of hardware solutions under our EGX platform,” Boitano said. “An AI micro server running Azure IoT can process about four cameras in the small form factors, and then a full rack can process up to 3,000 cameras.”
Intel’s VP and COO of its AI Products group, Remi El-Ouazzane, sees a similar market opportunity for his company: “The biggest problem with vision workloads is bandwidth, especially in the surveillance space. If you’re sending 4K or 8K video, 30 frames per second to your system, and you deploy thousands of cameras – your network won’t bear it, your storage won’t bear it.
“You need to do AI at the Edge, to focus on what to react to or send back.” El-Ouazzane, who was CEO of low-power computer vision chip company Movidius prior to its acquisition by Intel, again sees the Edge as spread across various layers, from the device, to an aggregation point, to perhaps an Edge server, and then the data center.
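To put rough numbers on the bandwidth problem, the sketch below computes the uncompressed data rate of a single 4K or 8K stream at 30 frames per second and scales it to a large deployment. The 24 bits per pixel and the 1,000-camera count are illustrative assumptions, not figures from Intel, and real cameras compress video heavily before sending it.

```python
def raw_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Uncompressed video bitrate in bits per second."""
    return width * height * bits_per_pixel * fps


def aggregate_bps(per_camera_bps: float, cameras: int) -> float:
    """Total bitrate if every camera streams continuously."""
    return per_camera_bps * cameras


raw_4k = raw_bitrate_bps(3840, 2160, fps=30)
raw_8k = raw_bitrate_bps(7680, 4320, fps=30)
print(f"4K raw: {raw_4k / 1e9:.2f} Gbit/s")  # 4K raw: 5.97 Gbit/s
print(f"8K raw: {raw_8k / 1e9:.2f} Gbit/s")  # 8K raw: 23.89 Gbit/s

# Assumed deployment size of 1,000 cameras, for illustration only.
total_4k = aggregate_bps(raw_4k, cameras=1000)
print(f"1,000 4K cameras, raw: {total_4k / 1e12:.2f} Tbit/s")  # 5.97 Tbit/s
```

Even if compression cuts each stream by two or three orders of magnitude, the total still grows linearly with the number of cameras, which is the case for filtering at the Edge before anything is sent upstream.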
“You’re dealing with different power envelopes depending on if it’s an end device or an aggregation point. When it comes to devices, for AI acceleration, the power envelope is sitting anywhere between a matter of milliwatts, up to three watts. When you’re looking at aggregation points, you’re getting between 10 watts of power dissipation up to 50 watts.”
Intel and Nvidia are far from alone in targeting the Edge AI market, with a bevy of startups hoping that this new front in the AI chip market provides an opening.
Enter the newcomers
“I’m focused on the Edge,” Orr Danon, CEO of Israeli chip company Hailo, told DCD. “Most processing will happen at the Edge, where there’s much more data that you want to digest into simpler representations.”
Fresh off a $20m funding round, the small company hopes that its Hailo-8 processor will end up in everything from security cameras to autonomous vehicles, drones and more.
Hailo’s 26 teraops (26 trillion operations per second) chip consumes almost 20 times less power than Nvidia’s Xavier AGX on ResNet-50 benchmarks, the company claims. “We look at things that are from milliwatts to a few watts, we’re not looking at things that are a hundred to 1,000 watts,” Danon said.
With Goliaths like Nvidia, Intel and Arm fielding huge teams and giant war chests, it is tempting to dismiss newcomers like Hailo. Danon, unsurprisingly, disagrees: “If you look from a historical perspective, every time there was a big shift in computers or the purpose for which computers are used, a huge opportunity was created. And the winners were never the established players – never,” he said, highlighting how IBM failed to move on from mainframes, and how Intel failed to capitalize on the rise of mobile.
Danon believes that “when you’re looking at an evolution of architectures, the player with the most resources and the market access always wins, but when you’re looking at a revolution it actually goes the other way around. Your legacy and commitments slow you down,” he said, citing examples like Google versus Yahoo, and Facebook versus Google.
“So the question – and I think that will determine who will win – is how big of a revolution is deep learning at the Edge? How deeply does it represent a change in the compute paradigm than what we have had?”
Arm’s Awad also foresees a revolution, one which will open the market to various forms of computing architecture. “I mean, listen. We have an architecture, so certainly we want a unified architecture, and we want it to be ours. But we’re realists about it, no one architecture is going to be able to solve all of the problems of dealing with a trillion devices. It’s going to require lots of different types of compute that exists in lots of different places with lots of different profiles. And that’s okay.”
If you accept this new computing world of various architectures processing various aspects of data between the Edge point and the data center, “then you can start to free yourself from this notion of, ‘Hey, I’m going to develop for a particular architecture that’s going to exist in a specific place,’ and you can start thinking about things more around, ‘Hey, I want to just develop my workload,’” Awad said.
“This is what developers tell us, they say ‘I just want to develop my application, and I just want somebody to give me a service-level agreement which says that I’m going to get a certain amount of bandwidth, a certain number of compute cycles, and a certain amount of latency.’ Where that workload actually exists, what piece of hardware it’s sitting on, doesn’t actually matter that much.”
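What such a service-level agreement might look like to a scheduler can be sketched in a few lines. Everything here is hypothetical: the `Tier` and `Sla` types, the tier names, and the latency, bandwidth and compute figures are placeholders, not measurements from any vendor. The sketch tries tiers from the core outward, on the assumption (following Awad) that heavy compute should sit toward the data center whenever the latency budget allows.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float      # round trip from the data source (placeholder)
    bandwidth_mbps: float  # bandwidth available to the workload (placeholder)
    compute_gops: float    # compute available to the workload (placeholder)


@dataclass(frozen=True)
class Sla:
    max_latency_ms: float
    min_bandwidth_mbps: float
    min_compute_gops: float


def place(sla: Sla, tiers: Sequence[Tier]) -> Optional[Tier]:
    """Return the first tier, in the order given, that satisfies the SLA."""
    for tier in tiers:
        if (tier.latency_ms <= sla.max_latency_ms
                and tier.bandwidth_mbps >= sla.min_bandwidth_mbps
                and tier.compute_gops >= sla.min_compute_gops):
            return tier
    return None  # no tier can honor this agreement


# Ordered from the core outward; all values are illustrative placeholders.
TIERS = [
    Tier("data center", latency_ms=80.0, bandwidth_mbps=1000.0, compute_gops=100000.0),
    Tier("edge server", latency_ms=10.0, bandwidth_mbps=500.0, compute_gops=5000.0),
    Tier("on-device", latency_ms=1.0, bandwidth_mbps=50.0, compute_gops=50.0),
]

chosen = place(Sla(max_latency_ms=20.0, min_bandwidth_mbps=100.0, min_compute_gops=1000.0), TIERS)
print(chosen.name if chosen else "no tier meets the SLA")
```

The developer states only the three requirements; which tier and which silicon ends up serving them is the scheduler's decision.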