It’s fascinating to see what a major company has percolating in its labs, since that work reflects the company’s mid- and longer-term expectations. At Intel Labs Day last week, the chip giant offered a glimpse into five priorities it is pursuing. The list won’t surprise you – quantum computing, neuromorphic computing, integrated photonics, machine programming (think machines programming machines), and what Intel calls confidential computing (think security).
Lab director Rich Uhlig was the master of ceremonies in what was a carefully scripted and smoothly run event in this new era of online conferences. While much of the material was familiar, there were deeper dives into all of the topics as well as a new product announcement in quantum (the Horse Ridge II controller chip), impressive benchmarks in neuromorphic computing (versus CPUs and GPUs), and a few noteworthy collaborators discussing joint projects.
The unifying concept for the day was unprecedented data growth. There’s an expectation we’ll generate on the order of 175 zettabytes of data in 2025. Since one zettabyte equals 1,000 exabytes, Intel themed its agenda “In pursuit of 1000X: Disruptive Research for the Next Decade in Computing.”
Said Uhlig in his opening remarks, “The first step is to set an ambitious goal with an understanding that we need multiple orders of magnitude improvement along several vectors of technology spanning interconnects, compute and memory, and how we program and secure systems. As a shorthand, let’s call this our pursuit of 1000X.”
Here are a few highlights from various presentations.
The question has long been when, not if, optical components will be needed inside chips and servers to achieve greater bandwidth. James Jaussi, senior principal engineer in Intel’s photonics lab, said, “[Photonics] has come a long way, however, because of the current cost, [the] physical size of the silicon photonics modules, and operating power, optical IO has not pushed into the shorter distance interconnects and this is our next big hurdle.”
Intel’s vision is for integrated photonics to drive the cost and the footprint down, he said: “We strive to have the capability of scaling IO volumes from millions to billions, 1,000x increase. Future optical links will make all IO connections emanate directly from our server packages reaching fully across the datacenter.” Jaussi pointed out the following progress points:
- Micro-ring modulators. Intel has miniaturized the modulator by a factor of more than 1,000, thereby eliminating a key barrier to integrating silicon photonics onto a compute package.
- All-silicon photodetector. The industry has long believed silicon has virtually no light detection capability in the 1.3–1.6 µm wavelength range. Intel showcased research that proves otherwise, with lower cost as a main benefit.
- Integrated semiconductor optical amplifier. Targeting power reduction, it’s now possible to make integrated semiconductor optical amplifiers with the same material used for the integrated laser.
- Integrated multi-wavelength lasers. Using wavelength division multiplexing (WDM), separate wavelengths can be used from the same laser to convey more data in the same beam of light.
- Integration: Intel is the only company that has demonstrated integrated multi-wavelength lasers and semiconductor optical amplifiers, all-silicon photodetectors, and micro-ring modulators on a single technology platform tightly integrated with CMOS silicon.
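The appeal of WDM is simple arithmetic: every additional wavelength carried in the same fiber multiplies the link’s aggregate data rate. The sketch below illustrates that scaling with made-up numbers (Intel did not disclose per-wavelength rates).

```python
# Illustrative only (hypothetical figures): aggregate bandwidth of a WDM link,
# where several wavelengths share one fiber and each carries its own stream.
def wdm_aggregate_gbps(num_wavelengths: int, gbps_per_wavelength: float) -> float:
    """Total link bandwidth = number of wavelengths x per-wavelength data rate."""
    return num_wavelengths * gbps_per_wavelength

# e.g., 8 wavelengths at 32 Gbps each -> 256 Gbps over a single fiber
print(wdm_aggregate_gbps(8, 32))
```

Doubling either factor doubles throughput without laying more fiber, which is why multi-wavelength lasers are on Intel’s progress list.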
“We feel these building blocks will help fundamentally change computer IO and revolutionize future datacenter communication,” said Jaussi, who also noted Intel’s disclosure last February of 3D stacked CMOS circuits interfacing directly with photonics by stacking two ICs, one on top of the other. “There is a clear inflection point approaching between optical and electrical,” he said.
Of course, many companies, new (Ayar Labs) and old (Nvidia) are feverishly tackling optical performance and packaging issues. The race is on.
Among the noisy quantum community, Intel had been largely quiet until the last year or so. It is focused on silicon-based spin qubit technology that can be fabbed using Intel’s existing CMOS manufacturing expertise. Anne Matsuura, Intel’s director of quantum architecture, and Jim Clarke, director of quantum hardware in Intel’s components research group, shared presentation duties.
In many ways, Intel has stepped more cautiously into the quantum computing waters.
“We believe that commercial scale quantum computers will enable simulation of these materials so that in the future we can also design materials, chemicals and drugs with properties that we desire,” said Matsuura during the opening session, but quickly added, “Today’s 100 qubits or even thousands of qubits will not get us there. [We] will need a full stack, commercial-scale quantum computing system of millions of qubits to attain quantum practicality for this type of ambitious problem solving.”
Spin qubits promise significant advantages (coherence time and scalable manufacturing among them) but present the same control drawbacks as all semiconductor-based qubits: they are highly susceptible to noise interference. That means they must operate near absolute zero inside dilution refrigerators. Getting the microwave control signals to the qubits requires cables to be routed into those refrigerators. Stuffing a million coax cables into one of these refrigerators is a daunting, perhaps undoable, task.
Intel is tackling that problem from a different direction with an integrated cryo-controller chip, Horse Ridge (named for the coldest spot in Oregon), which can be placed inside the fridge close to the qubit chip. It’s a significant change and a potential game-changer. In one of the few news items at Labs Day, Intel announced Horse Ridge II.
New features enable:
- Qubit readout. The function grants the ability to read the current qubit state. The readout is significant, as it allows for on-chip, low-latency qubit state detection without storing large amounts of data, thus saving memory and power.
- Multigate pulsing. The ability to simultaneously control the potential of many qubit gates is fundamental for effective qubit readouts and the entanglement and operation of multiple qubits, paving the path toward a more scalable system.
Here’s Intel’s description:
“The addition of a programmable microcontroller operating within the integrated circuit enables Horse Ridge II to deliver higher levels of flexibility and sophisticated controls in how the three control functions are executed. The microcontroller uses digital signal processing techniques to perform additional filtering on pulses, helping to reduce crosstalk between qubits.
“Horse Ridge II is implemented using Intel 22nm low-power FinFET technology (22FFL) and its functionality has been verified at 4 kelvins. Today, a quantum computer operates in the millikelvin range – just a fraction of a degree above absolute zero. But silicon spin qubits – the underpinning of Intel’s quantum efforts – have properties that could allow them to operate at temperatures of 1 kelvin or higher, which would significantly reduce the challenges of refrigerating the quantum system.”
It will be interesting to see if Horse Ridge could be used by other quantum computing companies. Intel hasn’t said it wouldn’t sell the chip to others.
Matsuura said, “Scaling is in Intel’s DNA. It is inherent to how we approach technology innovation, and quantum is no different. There are key areas that Intel’s quantum research program is focused on: spin qubit technologies, cryogenic control technology, and full stack innovation. Each of these areas addresses critical challenges that lie on the path to scaling quantum, and Intel is tackling each systematically to achieve scaling.
“We are introducing high-volume, high-throughput capabilities for our spin qubits with a cryo-probe. This is a one-of-a-kind piece of equipment that helps us test our chips on CMOS wafers in our fabs very rapidly. I mean, we’re talking hours instead of days with respect to time to information; we’re essentially mimicking the information turn cycle that we have in standard transistor research and development. With the cryo-probe, we can get test data and learnings from our research devices 1000x faster, significantly accelerating qubit development.”
If practical quantum computing still seems far off (and it does), neuromorphic computing seems much closer, even if only in a limited number of applications. Intel is an active player and its Loihi chip, Pohoiki Springs system, and Intel Neuromorphic Research Community (100-plus members) – all taken together – represent one of the biggest vendor footprints in neuromorphic computing.
Mike Davies, director of Intel’s Neuromorphic Lab, covered a great deal of ground. While no new neuromorphic products were announced, he reviewed the technology in some detail, and INRC (Intel Neuromorphic Research Community) member Accenture talked about three of its neuromorphic computing projects. Davies also spent a fair amount of time reviewing benchmark data versus both CPUs and Nvidia GPUs.
“Our focus has been on benchmarking Loihi’s performance against conventional architectures, so we can build confidence that neuromorphic chips in general can deliver on the promise. That said, over the past year, several other neuromorphic chips have been announced that also sound mature and optimized enough to give good results. That’s exciting because it means we can start comparing the strengths and weaknesses of different neuromorphic architectural and design choices. This kind of competitive benchmarking will accelerate progress in the field; we truly welcome healthy competition from other platforms,” said Davies.
By way of review, neuromorphic computing attempts to mimic how the brain’s neurons work. Roughly, this means using spiking neural networks (SNNs) to encode and accomplish computation instead of classic von Neumann processor-and-memory computing. The brain, of course, is famous for running on about 20 watts.
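The basic unit of an SNN is a neuron that accumulates input over time and fires only when a threshold is crossed. The following is a generic textbook leaky integrate-and-fire (LIF) sketch, not Loihi’s actual neuron model:

```python
# Minimal leaky integrate-and-fire (LIF) neuron, the building block of a
# spiking neural network. Generic illustration; parameters are arbitrary.
def lif_run(input_current, threshold=1.0, leak=0.9):
    """Simulate one LIF neuron over discrete time steps.

    The membrane potential decays ("leaks") each step, integrates the input,
    and emits a spike (1) when it crosses the threshold, then resets to zero.
    """
    v = 0.0
    spikes = []
    for i in input_current:
        v = leak * v + i          # leak, then integrate the input
        if v >= threshold:        # threshold crossing -> spike
            spikes.append(1)
            v = 0.0               # reset after spiking
        else:
            spikes.append(0)
    return spikes

# Constant drive of 0.4 per step: potential builds for two steps, spikes on
# the third, then the cycle repeats.
print(lif_run([0.4] * 10))        # -> [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
```

Information is carried in the timing and rate of those spikes rather than in dense numeric activations, which is what makes the hardware so sparing with energy.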
Davies provided a succinct summary:
“To date the INRC has generated over 40 peer-reviewed publications, many with quantified results confirming the promise of the technology to deliver meaningful gains. Several robotics workloads show 40 to 100 times lower power consumption on Loihi compared to conventional solutions. That includes an adaptive robotic arm application, a tactile sensing network that processes input from a new artificial skin technology, and a simultaneous localization and mapping workload, or SLAM as it’s called.
“On our large scale Pohoiki Springs system we demonstrated ‘similarity search’ running with 45 times lower power and over 100 times faster than a CPU implementation. Loihi [can] also solve hard optimization problems such as constraint satisfaction, and graph search over 100 times faster than a CPU with over 1,000 times lower energy. This means that future neuromorphic devices like drones could solve planning and navigation problems continuously in real time.
“All of this progress and these results give us a lot of confidence that neuromorphic computing, in time, will enable groundbreaking capabilities over a wide range of applications. In the near term, the cost profile of the technology will limit applications to either the small scale, such as edge devices and sensors, or to less cost-sensitive applications like satellites and specialized robots. Over time, we expect innovations in memory technologies to drive down the cost, allowing neuromorphic solutions to reach an expanding set of intelligent devices that need to process real-time data where size, weight and power are all constraints.”
Alex Kass of Accenture, an INRC member, presented three projects involving voice command recognition, full body gesture classification, and adaptive control for mobile robots. “We focused on problems where edge AI is needed to complement cloud based capabilities. We look for problems that are difficult to solve with the CPUs or GPUs that are common today, and we most prefer to focus on capabilities that can be applied across many business contexts,” he said. One use case is in automotive.
Currently, AI hardware is too power hungry, which can impact vehicle performance and limit the possible applications, said Tim Shea, researcher with Accenture Labs. Smart vehicles need more efficient edge AI devices to meet the demand. Using edge AI devices to complement cloud-based AI could also increase responsiveness and improve reliability when connectivity is poor.
Shea said, “We’ve built a proof of concept system with one of our major automotive partners to demonstrate that neuromorphic computing can make cars smarter without draining the batteries. We’re using Intel’s Kapoho Bay (a version of the Loihi chip) to recognize voice commands that an owner would give to their vehicle. The Kapoho Bay is a portable and extremely efficient neuromorphic research device for AI at the edge. We’re comparing that proof of concept system against a standard approach using a GPU.”
In developing the POC system, Accenture trained spiking neural networks to differentiate between command phrases and then ran the trained networks on the Kapoho Bay. “We connected the Kapoho Bay to a microphone, and a controller similar to the electronic control units that operate various functions of a smart vehicle. We’re targeting commands that reflect features that can be accessed from outside of the smart vehicle, such as ‘park here’ or ‘unlock passenger door,’” said Shea. “These functions also need to be energy efficient, so the vehicle can remain responsive even when parked for long stretches of time.”
The first step, according to Shea, was getting the system to recognize simple commands such as “lights on,” “start engine,” etc. “Using a combination of open source voice recordings and a smaller sample of specific commands, we can approximate the kinds of voice processing needed for smart vehicles. We tested this approach by comparing our trained spiking neural networks running on Intel’s neuromorphic research cloud against a convolutional neural network running on a GPU.”
Both systems achieved acceptable accuracy in recognizing the voice commands. “But we found that the neuromorphic system was up to one thousand times more efficient than the standard AI system with a GPU. This is extremely impressive, and it’s consistent with the results from other labs,” said Shea.
The dramatic improvement in energy efficiency, said Shea, derives from the fact that computation on the Loihi is extremely sparse. “While the GPU performs billions of computations per second, every second, the neuromorphic chip only processes changes in the audio signal, and neuron cores inside Loihi communicate efficiently with spikes,” he said.
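The sparsity argument Shea makes can be sketched in a few lines: a conventional pipeline touches every sample of a signal, while an event-driven pipeline reacts only to changes above a small threshold. The thresholds and signal below are illustrative, not drawn from Accenture’s system.

```python
# Sketch of event-driven processing (illustrative numbers): a dense pipeline
# processes every sample; an event-driven one processes only changes.
def count_dense_ops(signal):
    """A dense pipeline performs work on every sample."""
    return len(signal)

def count_event_ops(signal, delta=0.05):
    """An event-driven pipeline only does work when the input changes enough."""
    ops, last = 0, signal[0]
    for s in signal[1:]:
        if abs(s - last) > delta:  # significant change -> an "event" to process
            ops += 1
            last = s
    return ops

# A mostly silent audio-like signal with a brief burst of activity.
signal = [0.0] * 50 + [0.1, 0.5, 0.9, 0.5, 0.1] + [0.0] * 50
print(count_dense_ops(signal), count_event_ops(signal))  # -> 105 6
```

When the input is mostly silence, as audio waiting for a keyword usually is, the event-driven side does orders of magnitude less work, which is the intuition behind the efficiency gap.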
Davies presented a fair amount of detail in a break-out discussion that is best watched directly.
Efforts to maintain data security and confidentiality are hardly new. Intel presented its ongoing efforts in that arena, which involve big bets on federated learning and homomorphic encryption, and, recently, the launch of the Private AI Collaborative Research Institute “to advance and develop technologies in privacy and trust for decentralized artificial intelligence.”
“Today, encryption is used as a solution to protect data while it’s being sent across the network and while it’s stored, but data can still be vulnerable when it’s being used. Confidential computing allows data to be protected while in use,” said Jason Martin, principal engineer in the Security Solutions Lab and manager of the Secure Intelligence Team.
“Trusted execution environments provide a mechanism to perform confidential computing. They’re designed to minimize the set of hardware and software you need to trust to keep your data secure. To reduce the software that you must rely on, you need to ensure that other applications or even the operating system can’t compromise your data. Even if malware is present. Think of it as a safe that protects your valuables even from an intruder in the building,” he said.
Federated learning is one approach to maintaining security.
“In many industries such as retail, manufacturing, healthcare and financial services, the largest data sets are locked up in what are called data silos. These data silos may exist to address privacy concerns or regulatory challenges, or in some cases that data is just too large to move. However, these data silos create obstacles when using machine learning tools to gain valuable insights from the data. Medical imaging is an example where machine learning has made advances in identifying key patterns in MRIs such as the location of brain tumors, but is inhibited by these concerns. Intel labs has been collaborating with the Center for Biomedical Image Computing and Analytics at the University of Pennsylvania Perelman School of Medicine on federated learning,” said Martin.
With federated learning, the computation is split such that each hospital trains a local version of the algorithm on its own data at the hospital, and then sends what it learned to a central aggregator. The aggregator combines the models from each hospital into a single model without sharing the data. A study by UPenn and Intel showed federated learning “could train a deep learning model to within 99% of the accuracy of the same model trained with the traditional non-private method. We also showed that institutions did on average 17% better when trained in the federation, compared to training with only their own data,” said Martin.
Homomorphic encryption is a cryptographic approach that allows applications to perform computation directly on encrypted data without exposing the data itself. The technology is emerging as a leading method to protect the privacy of data when delegating computation. For example, these cryptographic techniques allow cloud computation directly on encrypted data without the need for trusting the cloud infrastructure, cloud service or other tenants.
“It turns out in fully homomorphic encryption, you can perform those basic operations on encrypted data using any algorithm of arbitrary complexity. And then when you decrypt the data, those operations are applied to the plaintext,” said Martin.
The challenge with homomorphic encryption is ciphertext size. “However, there are challenges that hinder the adoption of fully homomorphic encryption. In traditional encryption mechanisms to transfer and store data, the overhead is relatively negligible. But with fully homomorphic encryption, the size of homomorphic ciphertext is significantly larger than plain data, in some cases 1,000 to 10,000 times larger,” he said.
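The property Martin describes is easy to demonstrate with the classic Paillier scheme, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The sketch below uses deliberately tiny, insecure primes for readability; it is a textbook illustration, not Intel’s implementation (which targets fully homomorphic schemes).

```python
import math
import random

# Toy Paillier cryptosystem (tiny insecure primes, illustration only).
p, q = 1009, 1013                      # real deployments use ~1024-bit primes
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)    # modular inverse used in decryption

def encrypt(m):
    while True:
        r = random.randrange(1, n)     # fresh randomness per ciphertext
        if math.gcd(r, n) == 1:
            return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1, c2 = encrypt(5), encrypt(7)
c_sum = (c1 * c2) % n2                 # homomorphic addition on ciphertexts
print(decrypt(c_sum))                  # -> 12, computed without decrypting c1 or c2
```

Fully homomorphic schemes extend this idea to arbitrary circuits of additions and multiplications, which is exactly where the ciphertext-size blowup Martin cites comes from.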
Programs creating programs is a much-discussed topic in HPC and IT generally. Software development is hard, detailed work, and seldom done perfectly on the first pass. According to one study, programmers in the U.S. spend 50 percent of their time debugging.
“Think about machine programming helping us in two simultaneous directions,” said Justin Gottshlich, principal engineer and lead for Intel’s machine programming research group. “First, we want the machine programming systems to help coders and non-coders become more productive. Second, we want to ensure that the machine programming systems that do this are producing high quality code that’s fast, secure.”
At Labs Day, Intel unveiled ControlFlag – a machine programming research system that can autonomously detect errors in code. In preliminary tests, ControlFlag trained on over 1 billion unlabeled lines of production-quality code and learned to identify novel defects.
“Let me describe two concrete systems that our machine programming team has developed and is working to integrate into production-quality systems, just as a reference. We’ve built over a dozen of these systems now, but in the interest of time, we’ll just talk about these two. The first is a machine programming system that can automatically detect performance bugs. This system actually invents the tests to detect the performance issues. [H]istorically, these tests have been created by humans. With our system, the human doesn’t write a single line of code. On top of that, the same system can then automatically adapt those invented tests to different hardware architectures,” said Gottshlich.
“The second system that we’ve built also attempts to find bugs. But this system isn’t restricted to just performance bugs; it can find a variety of bugs. What’s so exciting is that unlike the prior solutions of finding bugs, the machine programming system that we’ve built, and we literally just built this a few months ago, learns to identify bugs without any human supervision. That means it learns without any human generated labels of data. Instead, what we do is we send this system out into the world to learn about code. When it comes back, it has learned a number of amazing things, we then point it at a code repository, even code that is production quality and has been around for decades.”
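Intel hasn’t detailed the algorithm here, but the self-supervised idea Gottshlich describes – learn what typical code looks like from a large corpus, then flag statistically unusual patterns without any human labels – can be sketched crudely:

```python
from collections import Counter

# Crude sketch of anomaly-based bug finding (not Intel's actual algorithm):
# mine a corpus for common expression shapes, then flag rare ones.
corpus = [
    "if (x == 0)", "if (x == 1)", "if (ptr == NULL)",
    "if (n == 0)", "if (count == limit)",
    "if (x = 0)",   # a likely typo: assignment where a comparison was meant
]

def pattern(expr):
    """Abstract an expression to its operator shape (deliberately simplistic)."""
    return "==" if "==" in expr else ("=" if "=" in expr else "other")

counts = Counter(pattern(e) for e in corpus)

def looks_anomalous(expr, min_support=2):
    """Flag expressions whose abstracted pattern is rare in the corpus."""
    return counts[pattern(expr)] < min_support

print(looks_anomalous("if (y = 5)"))    # True: assignment-in-if is rare here
print(looks_anomalous("if (y == 5)"))   # False: equality tests are common
```

A real system abstracts code far more richly (syntax trees rather than substrings) and mines a billion-line corpus, but the labeled-data-free principle is the same.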
For a fuller peek into Intel Labs Day: https://newsroom.intel.com/press-kits/intel-labs-day-2020/#gs.mzg9zq