Silicon Startup Raises ‘Prodigy’ for Hyperscale/AI Workloads

By Tiffany Trader

May 23, 2018

There’s another silicon startup coming onto the HPC/hyperscale scene with some intriguing and bold claims. Silicon Valley-based Tachyum Inc., which has been emerging from stealth over the last year and a half, is unveiling a processor codenamed “Prodigy,” said to combine features of both CPUs and GPUs in a way that offers a purported 10x performance-per-watt advantage over current technologies. The company is primarily focused on the hyperscale datacenter market, but has aspirations to support brainier applications, noting that “Prodigy will enable a super-computational system for real-time full capacity human brain neural network simulation by 2020.”

Tachyum says that its Prodigy universal processing architecture marries the programmability of CPUs with the power efficiency and performance features of the GPGPU.

“Rather than build separate infrastructures for AI, HPC and conventional compute, the Prodigy chip will deliver all within one unified simplified environment, so for example AI or HPC algorithms can run while a machine is otherwise idle or underutilized,” said Tachyum CEO and Cofounder Radoslav ‘Rado’ Danilak. “Instead of supercomputers with a price tag in the hundreds of millions, Tachyum will make it possible to empower hyperscale datacenters to produce more work in a radically more efficient and powerful format, at a lower cost.”

AI was a focus during the press activities that accompanied Tachyum’s participation at the GlobeSec conference in Bratislava, Slovakia, last week. Danilak indicated the technology is in the running for a prominent brain modeling project, but otherwise downplayed the AI use case when we interviewed him for this story, affirming the hyperscale datacenter as the company’s primary target. He said, “AI is just 3-5 percent of silicon today, and 95 percent is server, so our chip is shooting for that 95 percent of market.”

The CEO further clarified: “We don’t sell into enterprise market – that would not be fruitful. Our market is hyperscalers. Most of the [target] customers have their own application source code and we provide the full compiler toolchain from open source, like GCC and so on, porting Linux and baseline applications. So in our primary market, we provide tools so they can recompile and go, they don’t need to rewrite applications.”


The thrust of Tachyum’s proposition is that hyperscale servers run at only 30-40 percent utilization and go largely unused at night, when demand is off-peak. Prodigy chips can be reconfigured in software to run AI workloads at night, enabling “10x more AI for free,” said Danilak.

In a presentation at Flash Memory Summit last year, the CEO discussed the coming datacenter power wall, noting “a new computational mechanism is needed to overcome this plateau.” Further, “ARM A72 not an answer; Intel Atom has similar performance & power; FPGA, GPU, TPU apply only to limited applications versus CPU.”

The Prodigy platform has 64 cores with fully coherent memory, barrier, lock and standard synchronization, including transactional memory. Single-threaded performance will be higher than a conventional core, the CEO said. Each chip will have two 400 Gigabit Ethernet ports.

Power efficiencies are gained by moving out-of-order execution capability to software. “All the register rename, checkpointing, seeking, retiring, which is consuming majority of the power, is basically gone, replaced with simple hardware. All the smartness of out-of-order execution was put to compiler,” the CEO told us.
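To make the idea of compiler-side scheduling concrete, here is a toy sketch of static list scheduling, a generic textbook technique. It is not Tachyum's compiler, about which the article gives no detail. The instruction names, latencies and issue width below are all invented for illustration. The scheduler decides, ahead of time, which cycle each instruction issues in, so the hardware would not need to discover that ordering at run time.

```python
def list_schedule(deps: dict[str, list[str]],
                  latency: dict[str, int],
                  width: int) -> dict[str, int]:
    """Assign each instruction an issue cycle at "compile time".

    deps[i]    -- instructions whose results instruction i consumes
    latency[i] -- cycles until i's result is available (assumed >= 1)
    width      -- instructions a hypothetical in-order machine issues per cycle

    Assumes deps describes an acyclic graph and names only known instructions.
    """
    issue_cycle: dict[str, int] = {}
    remaining = list(deps)  # program order doubles as the priority order
    cycle = 0
    while remaining:
        issued = 0
        for instr in list(remaining):  # iterate over a copy; we mutate `remaining`
            if issued == width:
                break
            # Ready when every producer has issued and its result has arrived.
            ready = all(
                p in issue_cycle and issue_cycle[p] + latency[p] <= cycle
                for p in deps[instr]
            )
            if ready:
                issue_cycle[instr] = cycle
                remaining.remove(instr)
                issued += 1
        cycle += 1
    return issue_cycle


# Hypothetical five-instruction program: add = (load_a * load_b) + load_c
program = {
    "load_a": [],
    "load_b": [],
    "mul": ["load_a", "load_b"],
    "load_c": [],
    "add": ["mul", "load_c"],
}
latency = {"load_a": 3, "load_b": 3, "mul": 2, "load_c": 3, "add": 1}

schedule = list_schedule(program, latency, width=2)
for name, cycle in sorted(schedule.items(), key=lambda kv: kv[1]):
    print(f"cycle {cycle}: {name}")
```

In this example the scheduler hoists `load_c` ahead of `mul`, filling a cycle that would otherwise stall on the first two loads. An out-of-order core finds that kind of reordering in hardware; a statically scheduled design bakes it into the instruction stream.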

“We are kind of a hybrid,” he continued. “[The industry has] in-order-execution machines like low-power Arm, but they have not demonstrated good performance on single thread, then you have big machines like Intel Xeon which have very good performance per thread but they are very power hungry. We are able to get the performance of Xeon per thread but power comparable to low power Arm, so we attack and reduce that cost of scheduling by moving hardware to a very complicated piece of the software.”

Citing a paper by Google’s Urs Hölzle enumerating the failings of wimpy cores, Danilak asserted that Google and other hyperscalers passed on low-power Arm because of its poor single-thread performance. “So from day one we designed our platform to go into the server,” Danilak said. “We built a machine which is fastest on single-threaded but also on parallel applications because if you don’t do that, Amdahl’s law will get you. You need to have the non-vectorized parts of the application be really fast too to get the good scaling.”
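The Amdahl's law argument in that quote can be illustrated with a few lines of Python. This is a minimal sketch: the 90 percent parallel fraction and the speedup factors are illustrative assumptions, not figures from Tachyum.

```python
def amdahl_speedup(parallel_fraction: float,
                   parallel_speedup: float,
                   serial_speedup: float = 1.0) -> float:
    """Overall speedup under Amdahl's law.

    parallel_fraction -- share of runtime that can be parallelized (0..1)
    parallel_speedup  -- how much faster that share runs (e.g. core count)
    serial_speedup    -- how much faster the remaining serial share runs
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction / serial_speedup
                  + parallel_fraction / parallel_speedup)


# Assumed workload: 90% parallelizable, spread perfectly over 64 cores.
baseline = amdahl_speedup(0.9, parallel_speedup=64)
# Same workload, but the serial 10% also runs twice as fast.
faster_serial = amdahl_speedup(0.9, parallel_speedup=64, serial_speedup=2.0)

print(f"64 cores, unchanged serial part: {baseline:.1f}x")
print(f"64 cores, 2x faster serial part: {faster_serial:.1f}x")
```

With these assumed numbers, the serial tenth of the program caps the overall gain below 10x no matter how many cores are added, which is why speeding up the single-threaded portion moves the result so much.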

Danilak claims that by enabling a 4x reduction in datacenter TCO through improved power efficiency and reduced footprint, hyperscalers like Google and Facebook could save billions of dollars by moving to Prodigy. In terms of performance, the CEO said that a 256,000-server configuration based on Prodigy chips would deliver 32 exaflops of TensorFlow performance. That works out to 125 teraflops per Prodigy chip, assuming one chip per server. As a point of reference, Google’s new TPU (v3) chip promises 90 teraflops of unspecified floating-point performance; Volta with NVLink offers 125 mixed-precision Tensor teraflops. The pitch for Prodigy is that it is applicable to a wider range of datacenter applications.
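The per-chip figure follows from simple division, as the sketch below shows. One assumption is needed that the article does not state outright: each server holds a single Prodigy chip.

```python
EXA = 10**18
TERA = 10**12


def per_chip_teraflops(total_exaflops: int,
                       servers: int,
                       chips_per_server: int = 1) -> float:
    """Per-chip throughput (teraflops) implied by an aggregate exaflops claim."""
    total_flops = total_exaflops * EXA
    return total_flops / (servers * chips_per_server) / TERA


# Figures quoted by the CEO; chips_per_server=1 is an assumption.
prodigy_tf = per_chip_teraflops(total_exaflops=32, servers=256_000)
print(f"Implied per-chip throughput: {prodigy_tf:.0f} teraflops")
```

If the servers were dual-socket (`chips_per_server=2`), the same aggregate claim would imply half as much per chip, so the one-chip assumption matters when comparing against the TPU and Volta numbers.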

The Prodigy architecture is fully compliant with IEEE-standard double-, single- and half-precision floating point, and also supports 8-bit floating point. Supported programming languages include C, C++, Java, Fortran, and Ada. “We support full staging, memory system, precise exception, and full coherency system so that allows you to run existing applications and simplifies use and deployment of applications,” the CEO said.

Tachyum says it has found a way around the “slow wire” limitations that impede today’s semiconductor devices. It is working with a fab on a semi-custom COT-flow (customer-owned tooling) design, using 7-nm technology, and expects to have prototypes out next year with sampling to follow. Ahead of tape-out, Tachyum will provide early adopters and other partners with FPGA-based emulation systems.

The CEO acknowledged the non-recurring engineering costs are significant, but indicated that the chips will be priced below Xeons and will offer a performance-per-dollar advantage over today’s high-end CPUs and GPUs.

Danilak has an accomplished track record as a technologist and entrepreneur. He founded Skyera, an ultra-dense flash storage company, and SandForce, a supplier of SSD controllers. Skyera was acquired by Western Digital in 2014, and SandForce was sold to LSI in 2011 for $377 million (LSI’s SSD business was acquired by Seagate in 2014). He was also part of the Wave Computing team that built the 10GHz processing element of its deep learning DPU.

Tachyum’s technology has garnered an endorsement from Christos Kozyrakis, professor of electrical engineering and computer science at Stanford. “Despite efficiency gains from virtualization, cloud computing, and parallelism, there are still critical problems with datacenter resource utilization particularly at a size and scale of hundreds of thousands of servers. Tachyum’s breakthrough processor architecture will deliver unprecedented performance and productivity,” said Kozyrakis, who joined Tachyum as a corporate advisor in January.

Tachyum received venture funding earlier this year from European investment company IPM Growth and says it will do one more round at the end of this year to get the chip to production. In March, Tachyum moved its headquarters to a larger facility in San Jose, Calif., and announced it was looking to expand its team.
