HPCwire

Since 1986 - Covering the Fastest Computers
in the World and the People Who Run Them

Analyst Weighs In on 64-Bit ARM


In a recent report in Real World Technologies, chip guru David Kanter dissects the new 64-bit ARM design (ARMv8) and what it might mean for the IT landscape. His take on the architecture is almost uniformly positive. He notes that the designers not only developed an elegant instruction set that is backward compatible with the existing ISA, but also took the extra step of jettisoning a few of the poorly designed features of the 32-bit architecture.

Announced in October 2011, 64-bit ARM is the biggest makeover the processor architecture has received in its 26-year history. The first implementation in 1985, ARM1, was a 32-bit chip developed for Acorn Computers (ARM = Acorn RISC Machine). Although the architecture never caught on in the PC biz, its simple, low-power RISC design made it a natural for embedded/mobile SoC applications and microcontrollers.

While the server and personal computer world moved on to 64 bits, ARM was safely ensconced in the embedded/mobile space where 32 bits of addressing (basically 4 GB) was plenty.  But now that devices like tablets and other mobile gadgets are pushing up against this limit, a larger address reach will soon become necessary. Also, the expanded address reach will allow ARM chips for the first time to enter the server market and compete against the x86, the processor architecture that has dominated the datacenter for decades.

In a sense, ARM is trying to duplicate the success of the x86 when it made its own jump from 32 to 64 bits in the early 2000s. In that case, the 64-bit Intel Xeons and AMD Opterons ended up displacing a lot of their high-end RISC-based competition -- especially SPARC and Power. If 64-bit ARM ends up cutting into the x86 share of the server market, it would be fitting revenge for the RISC faithful.

As mentioned before, the most critical enabling feature for 64-bit ARM is extending the address space. Although 64 bits could reach 16 exabytes, there's little application demand to access data at that scale. For the time being, only 48 bits will be used to form an address, which gives software a 256 TB address reach. Presumably, additional address bits can be tacked on in the future as applications scale up.
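
The address-reach figures for 32, 48 and 64 address bits follow directly from powers of two. A minimal Python sketch of that arithmetic, using binary units (the helper name `address_reach` is made up for illustration and does not come from Kanter's report):

```python
# Illustrative only: size of the byte address space for a given address width.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def address_reach(bits):
    """Return the size of a 2**bits byte address space in binary units."""
    size = 2 ** bits
    unit = 0
    # Divide by 1024 until the value fits the next unit up.
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"

for bits in (32, 48, 64):
    print(f"{bits} address bits reach {address_reach(bits)}")
```

Running it shows why 32 bits tops out at 4 GB, while 48 bits already covers far more memory than a server of this era could hold.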

With the ARMv8 design, integer and floating point structures are also being enhanced, with all general purpose registers being extended to 64 bits. The floating point design has been tweaked to support IEEE 754-2008, including additional instructions to make the architecture compliant with the standard.

For vector operations, the changes are more extensive. In the 32-bit spec, the SIMD design (known as NEON) already contained 32 64-bit registers, which could be aliased to 16 128-bit pseudo-registers. For the 64-bit design, that's been extended to 32 128-bit registers, with the lower half being used if only 64-bit values are needed. Not only does that double the capacity of the vector unit's register file, it makes for a somewhat cleaner arrangement. The SIMD design also adds full IEEE support and double precision floating point operations.
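
The two register layouts can be modeled in a few lines. The Python sketch below is a toy model, not ARM's specification. The function names are hypothetical, and the pairing of each 128-bit register Qn with the 64-bit registers D(2n) and D(2n+1) is a detail of 32-bit NEON that the article itself does not spell out:

```python
# Illustrative model of the SIMD register layouts; not taken from ARM's spec.

def aarch32_q_to_d(q):
    """32-bit NEON: 128-bit pseudo-register Qn overlays D(2n) and D(2n+1)."""
    if not 0 <= q < 16:
        raise ValueError("32-bit NEON exposes only 16 128-bit pseudo-registers")
    return (2 * q, 2 * q + 1)

def low_half(value_128):
    """64-bit design: a 64-bit value lives in the lower half of a 128-bit register."""
    return value_128 & (2 ** 64 - 1)

BITS_32BIT_SPEC = 32 * 64    # 32 registers of 64 bits each
BITS_64BIT_SPEC = 32 * 128   # 32 registers of 128 bits each

print(BITS_32BIT_SPEC // 8, "bytes of vector registers in the 32-bit spec")
print(BITS_64BIT_SPEC // 8, "bytes of vector registers in the 64-bit spec")
print("Q3 overlays D registers", aarch32_q_to_d(3))
```

The byte totals make the doubling explicit, and the aliasing function shows the bookkeeping the 32-bit scheme required and the 64-bit scheme avoids.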

Curiously missing from ARMv8 is multi-threading support, a feature common to all other major server CPUs -- x86, SPARC, Power, and even HPC processors like the Blue Gene/Q ASIC (PowerPC A2). Kanter speculates that the ARM designers decided to forgo multi-threading for now since it is notoriously difficult to validate, and the new design already encapsulated a lot of changes. Although the jury is still out on the aggregate benefit of this feature, for certain classes of software, the lack of multi-threading support could turn out to be a decided disadvantage.

Overall though, Kanter likes what ARM developers have come up with, which he says is "clearly a sound design that was well thought out and should enable reasonable implementations." As he notes though, there are currently no chip implementations around to judge the architecture's performance in the field.

But within a couple of years, we should see multiple 64-bit ARM SoCs at various segments of the market -- everything from high performance computers to workstations. Applied Micro already has an FPGA implementation of ARMv8, which the company unveiled in October 2011 and subsequently demonstrated running an Apache web server. Samsung, Qualcomm, Calxeda, Microsoft, Marvell and NVIDIA have either stated plans to implement a chip or have already bought licenses. At this point, NVIDIA is the only one that has specifically talked about a 64-bit ARM implementation (Project Denver) aimed at HPC, but Calxeda also has high performance computing on its radar.

Samsung is a particularly interesting entrant to the market. The Korean firm is mostly in the consumer electronics business and its involvement in the server space is currently confined to supplying DRAM and flash components. But Samsung would make a formidable competitor against Intel in the server chip arena if the company funneled its resources there. While Intel has more than twice Samsung's revenue today, the latter company is growing at a much faster rate.

That led industry analyst firm IC Insights to project that Samsung would eclipse Intel as the world's largest supplier of semiconductor parts by 2014. Coincidentally, that's the same year the company plans to roll out its first 64-bit ARM server chips. As Kanter concluded: "Certainly, the next few years should be very interesting."
