Brain Surgery for the Masses

By Marc Snir

February 23, 2007

The evolution of 4th generation surgery tools will help spread brain surgery to the masses, altogether dispensing with neurosurgeons in small hospitals that cannot afford their high pay.

Do you feel that I am pulling your leg? I am. But so is the HPCwire editor when he claims that 4th generation programming languages will make HPC programming available to the masses. Programming — at least, programming of a large, complex code — is a specialized task that requires a specialist — a software engineer — to the same extent that brain surgery requires a specialist. You might claim that many non-specialists do write code. It is also true that most of us take care of our routine health problems. But only a fool would try brain surgery because he was successful in removing a corn from his foot. To believe that better languages will soon make software engineers redundant is to believe that Artificial General Intelligence will progress much faster than most of us expect, or to belittle the specialized skills of software engineers.

The editor expects that the masses will program clusters on their own, while software engineers will continue to be needed for programming leading-edge supercomputers. However, the difference is not between bleeding-edge supercomputers and clusters. It is between complex programming tasks and simple programming tasks. Writing a simple program with little concern for performance and not too much worry about correctness is tantamount to removing a corn. Even writing a simple program (for example, an FFT routine) that achieves close to optimal performance is tantamount to brain surgery, even if the target system is the processor that operates my laptop. Writing a moderately complex program that is bug free with high confidence and can be used to control a critical system is also tantamount to brain surgery. Finally, writing a large, complex program that more or less satisfies specifications seems to be harder than brain surgery (large software projects seem to have a higher mortality rate than brain surgery patients).

Programming is harder when the program is more complex and when constraints of high efficiency or high confidence are stricter. Performance constraints can appear on large systems and on small systems alike: it can be extremely hard to shoehorn a compute-intensive application into the power and memory constraints of a cell phone. Performance can matter a lot for cluster programs that are frequently used. The programmers of the MPI or ScaLAPACK libraries have good reasons to carefully tune the performance of their libraries on clusters: these libraries consume many cycles on many clusters, and improving their performance will improve the performance of many applications. While the difficulty of performance tuning relates to the complexity of the target architecture, one can well argue that a cluster is a more complex architecture than a leading-edge supercomputer, because of the more complex software environment and the less controllable behavior of commercial LAN switches.

There is no obvious reason for cluster programs to be smaller or for confidence requirements on clusters to be less stringent than for supercomputers. However, it is true that supercomputer computations are more likely to be resource constrained than cluster computations. Indeed, a program will be run on a capability platform only if it cannot execute in a reasonable time on a smaller cluster. Such programs may tax even the resources of a leading edge supercomputer. On the other hand, performance may be less critical for cluster programs that do not consume significant hardware resources. I am not sure this represents a large fraction of cluster cycles.

The editor draws a dichotomy between MPI with C or Fortran for the high priests of supercomputing and MATLAB or SQL for the masses. This dichotomy is false. The highest-performing commercial transaction systems use SQL, but SQL by itself does not make a commercial application. Such an application will use a variety of services and frameworks, and will be written in a variety of programming languages. SQL engines themselves are written, by experts, in C or a similar language.

The same holds true for scientific and engineering computations, be it on clusters or on supercomputers: Whenever possible, users will use available libraries or frameworks. The libraries will be implemented in Fortran, C or such similar languages, and users will use these languages for their glue code. Libraries have been used for many decades to extend the expressiveness of low level programming languages such as Fortran or C.

Computational frameworks are increasingly replacing low level programming languages as the main mechanism for expressing computations in many domains. Such frameworks can be specialized by plugging in specific methods, often written in lower level languages, and can be extended in a variety of ways. I am not sure what the difference between a well-designed computational framework (such as Cactus) and a “fourth generation language” or “domain specific language” is. Such computational frameworks are domain specific. They emphasize higher levels of abstraction, and the execution model is often interactive. Furthermore, computational frameworks are increasingly used for codes that run on the largest supercomputers.

Programming on supercomputers, like programming on any other platform, is likely to evolve toward higher level, more powerful programming languages or frameworks. Languages or frameworks that are more extensible, have more powerful type systems with better type inference, and provide support for generic programming are safer and increase productivity. Such high-level languages are likely to have specialized idioms for specific application domains. To a large extent this is already true for languages such as Java or C#, since much programming is done using powerful domain specific classes. For example, programming a GUI in Java using Swing is very different from programming a business application using Enterprise JavaBeans, and programmers specialize in one or the other. To the same extent, languages such as C# or Java, or next generation languages, can be extended with idioms for scientific computing. This has been done for Java (

The evolution of programming language and compiler technology provides more powerful mechanisms for language extension. The extension mechanisms encompass not just predefined and pre-coded methods. Code generation can occur at run-time or, indeed, whenever new relevant information on characteristics of the computation becomes available. The user can control at various levels the implementation mechanisms for the high-level objects and their methods and even the implementation mechanisms for control structures. The Telescopic Languages project of the late Ken Kennedy or the Fortress language project at Sun are showing the strength of such techniques.

A common thread in these projects is that the high level language should match the application domain well, that is, the way application specialists think. The mapping from the logic of the application to the logic of the machine may involve multiple layers of translation, and these translations cannot be fully automated. A specialist programmer is needed to guide these mappings, by implementing run-time code and libraries, by developing preprocessors and application generators or by adding implementation annotations to the core code. The distinction between the application programmer and the language implementer becomes blurred, since application programmers can modify the language and can modify its implementation. However, such a hierarchical design supports high levels of specialization, where some programmers are more focused on application logic and others are more focused on application performance.

The parallel MATLAB solutions of The MathWorks or ISC are examples of this trend. MATLAB was not developed for HPC, and would not be a viable product if it targeted HPC alone. The goal was to provide a notation that is closer to the way scientific programmers think. In both cases, the mapping of a MATLAB code to a parallel machine is not fully automated, and the programmer has to parallelize the code manually. Parallelism is expressed using well-known (low level) paradigms: message passing (MPI), and distributed arrays and forall loops (HPF). The parallel notation becomes part of the source code, but it should be possible (and desirable) to keep it separate, as an implementation annotation, and to make sure that it does not change the program semantics.

This general approach to high-level language design, while important for HPC, is not unique to HPC. Indeed, one can well argue that designing high-level languages specifically for high performance computing is a contradiction in terms: High-level languages should match the application domain, not the architecture of the compute platform. Developing high-level languages that satisfy the needs of HPC but are less convenient to use on more modest platforms is a waste of money.

Unique to HPC is the need for low level implementation languages that can be used to write libraries and implement the high-level objects and methods so as to run efficiently on clusters and supercomputers. This implementation language would be, today, MPI with Fortran or C. What should it be tomorrow (i.e., five years from now)? Could the Partitioned Global Address Space (PGAS) languages, such as UPC, CAF and Titanium, fulfill this role? (In a nutshell, these languages provide the same SPMD model as MPI, with multiple processes each executing on its own data. However, they also provide partitioned global arrays that can be accessed by all processes. Communication occurs through access to the non-local part of a global array; simple barrier constructs are available for synchronization.)
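The PGAS model summarized in the parenthesis above can be sketched in a few lines. What follows is a minimal, sequential Python simulation, not UPC, CAF or Titanium code: the class name PartitionedGlobalArray, the block distribution, the remote-access counter and the helper run_spmd are all assumptions made for illustration.

```python
class PartitionedGlobalArray:
    """Toy model of a block-partitioned global array (hypothetical, for illustration)."""

    def __init__(self, size, num_procs):
        self.size = size
        self.num_procs = num_procs
        # Block size, rounded up so that every element has an owner.
        self.block = -(-size // num_procs)
        # One private chunk of storage per simulated process.
        self.chunks = [
            [0.0] * max(0, min(self.block, size - p * self.block))
            for p in range(num_procs)
        ]
        # Counts accesses to data owned by another process ("communication").
        self.remote_accesses = 0

    def owner(self, i):
        return i // self.block

    def local_indices(self, me):
        start = me * self.block
        return range(start, min(start + self.block, self.size))

    def get(self, me, i):
        p = self.owner(i)
        if p != me:
            self.remote_accesses += 1
        return self.chunks[p][i - p * self.block]

    def put(self, me, i, value):
        p = self.owner(i)
        if p != me:
            self.remote_accesses += 1
        self.chunks[p][i - p * self.block] = value


def smooth_step(me, src, dst):
    """SPMD body: every process runs this same code on the part it owns."""
    n = src.size
    for i in src.local_indices(me):
        left = src.get(me, max(i - 1, 0))        # may be non-local
        right = src.get(me, min(i + 1, n - 1))   # may be non-local
        dst.put(me, i, (left + src.get(me, i) + right) / 3.0)


def run_spmd(step, num_procs, *args):
    """Run one phase for every simulated process; returning acts as the barrier."""
    for me in range(num_procs):
        step(me, *args)


src = PartitionedGlobalArray(8, 4)
dst = PartitionedGlobalArray(8, 4)
for i in range(src.size):
    src.put(src.owner(i), i, float(i))   # each owner initializes its own elements
run_spmd(smooth_step, 4, src, dst)
print("remote accesses:", src.remote_accesses)
```

Every process executes the same code on its own block, and the only communication is the read of a neighbor's boundary element through the global index. A real PGAS compiler would turn those reads into one-sided gets; here they are merely counted.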

An “implementation language” (IL) for HPC should satisfy the following requirements:

1. Performance. It should be possible to achieve close to optimal performance for programs written in IL. Recent research has shown that programs written in CAF or UPC can sometimes beat the performance of MPI programs. This is very encouraging given that the compiler technology for these languages is still immature, while implementations of MPI are very mature. There are two reasons to believe that PGAS languages could lead to better performance as compared to MPI: (1) The support by supercomputers and by the interconnect technology used on clusters (Myrinet, Quadrics, InfiniBand) of direct remote memory access entails that better communication performance can be achieved using one-sided puts and gets, rather than two-sided message-passing. The design of MPI is well suited to two-sided communication, but perhaps less suited to one-sided communication. (2) A compiler can optimize communication and avoid the overhead of message-passing libraries, further reducing communication overhead. These languages do not yet offer good support for collective communications, and for parallel I/O, but these problems should be fixed within a few years.

2. Transparency. It should be possible for a programmer to predict, with reasonable accuracy, the performance of a code. The transformation done by the compiler or the run-time should not only preserve the semantics of the code, ensuring that the computation is correct, but should also “preserve” performance, i.e., should support a simple formula for translating program execution metrics into an approximate execution time. ILs are used by programmers to deal with performance issues, but if the programmer has no way of reasoning about performance trade-offs, then performance can be achieved only through an exhaustive search through all possible program versions. PGAS languages are reasonably transparent.

3. User control. The IL should provide the programmer with means of controlling how critical resources are used. In particular, for HPC it is important to exercise some control over scheduling (to achieve load balancing and prevent idle time) and over communication. Load balancing and locality (communication reduction) are often algorithmic problems. Without some control over those, one cannot achieve close to optimal performance. Scheduling and communication are under user control with PGAS languages.

4. Modularity and composability. A large application will be composed of independently developed modules. The internal details of one module should not impact other modules, and one should be able to compose modules knowing only their interfaces. Sequential programs support only “sequential composition”: a program invokes a module, and control is transferred to that module; upon completion control is transferred back. Programmers have been warned to avoid side effects, leading to a simple interface specification. Parallel programming also requires support for “parallel composition”, or “fork-merge”: several modules execute concurrently, and then combine back into a unified parallel computation. This is essential, for example, in multiphysics simulations, where multiple physics modules work in parallel and periodically exchange information. MPI supports fork-merge via its communicators: a group of processes can be split into independent subgroups, and then merged back. The code executed by each subgroup is totally independent of the code executed by other subgroups. UPC and CAF have not yet implemented similar concepts and hence lack good support for modularity. (The CAF community seems to be working on this problem as part of the Fortran 2008 standard effort.)

5. Backward compatibility. Code written in IL should be able to invoke libraries written using MPI or other common message passing interfaces. While this has not been a focus of CAF or UPC, there are no inherent obstacles to compatibility.
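The fork-merge pattern of requirement 4 can be illustrated without an MPI installation. The sketch below is pure Python that mimics the grouping rule of MPI_Comm_split: processes that supply the same color land in the same subgroup, ordered by key and then by their original rank. The function names split_group and merge_groups and the module names "fluid" and "structure" are hypothetical and are not part of MPI.

```python
def split_group(ranks, color, key=lambda rank: 0):
    """Partition ranks into subgroups by color, ordered by (key, original rank).

    This imitates the grouping rule of MPI_Comm_split; it is not an MPI call.
    """
    groups = {}
    for rank in ranks:
        groups.setdefault(color(rank), []).append(rank)
    return {
        c: sorted(members, key=lambda rank: (key(rank), rank))
        for c, members in groups.items()
    }


def merge_groups(groups):
    """Recombine the subgroups into the original, rank-ordered group."""
    return sorted(rank for members in groups.values() for rank in members)


world = list(range(8))

# Fork: two physics modules, each running on its own subgroup of processes.
modules = split_group(world, color=lambda r: "fluid" if r < 5 else "structure")
for name, members in modules.items():
    print(name, "runs independently on ranks", members)

# Merge: the modules rejoin into one parallel computation to exchange data.
print("merged group:", merge_groups(modules))
```

In MPI itself the parent communicator stays valid after a split, so "merging back" amounts to resuming collective work on the parent. The explicit merge_groups helper exists only to make that step visible in the sketch.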

There is another set of properties that I believe are important and can be supported efficiently. Their efficient support, however, is still a matter for research. The properties are:

1. Determinism. Deterministic, repeatable execution should be the default. Nondeterminism should occur only if the programmer explicitly uses nondeterministic constructs. Races and synchronization bugs are hard to detect, and are one of the major difficulties of parallel programming. The use of global address space worsens the problem as it becomes easier to write buggy code and harder to detect the bugs.

Transactions and transactional memory are not a solution to this problem. Transactional memory provides efficient mechanisms to ensure the atomicity of transactions, but does not enforce an order between two transactions that access the same data. Transactions are a natural idiom to express the behavior of systems where concurrency is inherent in the problem specification. An online transaction system has to handle concurrent purchasing requests and has to ensure that only one passenger gets the last seat in a plane and that the seat is assigned to the same customer whose credit card was charged — hence atomicity. Transactions are not a natural idiom for most of scientific computing. It is seldom the case that we specify a computation with two conflicting noncommutative updates, where we do not care about their execution order, as long as each executes atomically. The natural idiom for scientific computing is (partial) order, not mutual exclusion. Therefore, races and nondeterminism result most often from programming bugs. The current PGAS languages do not prevent and do not detect races. I believe that race prevention is as essential to parallel programming as memory safety is to sequential programming. Furthermore, it seems plausible that races can be prevented using suitable programming languages and suitable compiler technology, without encumbering the programmer or significantly slowing down execution. We should work hard to ensure this happens, before “race exploits” become daily occurrences.
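The claim that atomicity does not fix an execution order can be made concrete with two noncommutative updates. In the following sketch a lock stands in for a transaction mechanism (an assumption made for illustration; this is not transactional memory), and the names AtomicCell, run_in_order and possible_outcomes are hypothetical.

```python
import threading
from itertools import permutations


class AtomicCell:
    """A value whose updates are atomic; the lock stands in for a transaction."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def update(self, fn):
        # The read-modify-write below cannot be interleaved with another update.
        with self._lock:
            self._value = fn(self._value)

    def read(self):
        with self._lock:
            return self._value


def double(x):
    return 2 * x


def add_one(x):
    return x + 1


def add_two(x):
    return x + 2


def run_in_order(initial, updates):
    """Apply the updates atomically, one after another, in the given order."""
    cell = AtomicCell(initial)
    for fn in updates:
        cell.update(fn)
    return cell.read()


def possible_outcomes(initial, updates):
    """All results an atomic but unordered execution of the updates could give."""
    return {run_in_order(initial, order) for order in permutations(updates)}


print(possible_outcomes(3, [double, add_one]))   # noncommutative updates
print(possible_outcomes(3, [add_one, add_two]))  # commutative updates
```

Each update is atomic in both cases, yet the first pair still admits two different final values, because nothing orders one update before the other. Only the commutative pair is deterministic, which is why an ordering construct, not mutual exclusion, is what the scientific programmer usually needs.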

2. Global name space. A very common idiom in scientific computing is that of a global data structure (e.g., a mesh) that is used to represent the discretization of a continuous field. A simulation step may consist of applying an updating function to this field, or computing the interactions between the field and a set of particles. On a parallel machine one needs to break the structure into patches that are allocated to individual processes, but the patches are not natural objects in the problem definition. They appear only because of the mapping to a parallel system.

Similarly, in a particle computation, it may be necessary to partition the particles into chunks in order to reduce communication and synchronization. While each particle is a natural object in the problem specification, the chunks are not. In both cases, it is more convenient to specify the logic of the computation using global data structures and a global name space. It is desirable to be able to refine such a program and partition data structures without having to change the names of the variables. The name of a variable should relate to its logical role, not to its physical location. (I, therefore, speak of a global name space, not a global address space.)

In order to control communication and parallelism, the user should be able to control where data is located. But this should not require changing the names of variables. PGAS languages do provide a global name space, but support only simple, static partitions of arrays. In cases where more complex or more dynamic partitions of global data structures are needed, one needs to explicitly copy and permute data, and change the names of variables.
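The separation between the logical name of a variable and its physical location can be sketched as follows. This is a toy Python model under stated assumptions: the class GlobalField, its per-process dictionaries and the repartition method are hypothetical and do not correspond to a feature of any existing PGAS language.

```python
class GlobalField:
    """Elements are named by global index; where they live is a separate matter."""

    def __init__(self, values, num_procs):
        self.num_procs = num_procs
        self.location = {}                                # global index -> owner
        self.store = [dict() for _ in range(num_procs)]   # per-process storage
        block = -(-len(values) // num_procs)              # initial block partition
        for i, value in enumerate(values):
            owner = i // block
            self.location[i] = owner
            self.store[owner][i] = value

    def __len__(self):
        return len(self.location)

    def __getitem__(self, i):
        return self.store[self.location[i]][i]

    def __setitem__(self, i, value):
        self.store[self.location[i]][i] = value

    def owner(self, i):
        return self.location[i]

    def repartition(self, new_owner):
        """Move elements to new owners; returns how many elements migrated."""
        moved = 0
        for i, old in list(self.location.items()):
            new = new_owner(i)
            if new != old:
                self.store[new][i] = self.store[old].pop(i)
                self.location[i] = new
                moved += 1
        return moved


def total(field):
    """Application logic: written once, in terms of global names only."""
    return sum(field[i] for i in range(len(field)))


field = GlobalField([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], num_procs=2)
before = total(field)
moved = field.repartition(lambda i: i % 2)   # switch from block to cyclic layout
after = total(field)
print(before, after, "elements moved:", moved)
```

The function total is untouched by the change of layout: the user controls where data lives through repartition, while the names used in the application logic stay the same.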

3. Dynamic data partitioning and dynamic control partitioning. Parallelism is expressed using two main idioms: data parallelism and control parallelism. In data parallelism, data is partitioned. Execution gets partitioned by executing statements on the site where their main operands reside. This is done, implicitly, with languages such as HPF and the “owner computes” rule, and explicitly, with forall statements and “on” clauses. In control parallelism, control is partitioned and data is moved implicitly to where it is accessed. Both forms of parallelism are useful. (As an aside: the two are identical in single-assignment languages, such as NESL.)

The use of adaptive algorithms, such as Adaptive Mesh Refinement, or multiscale algorithms, requires that partitions be dynamic, as data structures change and the amounts of storage and work associated with a patch change. Current PGAS languages do not support dynamic repartitioning of control and data any better than MPI. Such repartitioning will require explicit copying of data and the application then has to maintain the correspondence between the logical name of a variable and its physical location. Dynamic control partitioning is easy for languages such as OpenMP that use a global name space and parallel loops for parallel control. But such languages do not provide good control over locality.

Efficient support for dynamic data and control partitioning is still a research issue. Languages with limited, static partitions (such as current PGAS languages) can be implemented efficiently, but force the user to do the work. Languages that support powerful, dynamic data and control repartitioning can too easily lead to inefficient codes. One limited but well-tested and fairly powerful step toward supporting dynamic data and control partitioning is to use process virtualization. The model provided by MPI or by the PGAS languages is that of a fixed number of processes, each with its own address space, and (usually) one thread of control. Implementations associate one process with each processor (or core) and applications are written assuming a dedicated fixed set of identical processors. A suitable run-time can be used to virtualize the processes of MPI, UPC or CAF (the AMPI system is already doing this for MPI). The run-time scheduler can map multiple virtual processes (that are actually implemented as user-level threads) onto each physical processor, and can dynamically migrate the processes and change the mapping so as to balance load or reduce communication.

Process virtualization greatly enhances the modularity of complex parallel codes. Consider, for example, a multiphysics code that couples two physics modules. Normally, each module runs on a dedicated set of processors. The modules execute independently a time step of their simulation, and then exchange data. Suppose that the first module executes a dynamic mesh refinement. The internal logic of this module presumably includes code for repartitioning the mesh and rebalancing the computation when the mesh is refined. But, after the refinement, this module will take longer to execute a time step, so that the global computation becomes unbalanced. It becomes necessary to steal resources from the second module in order to rebalance the computation. This other module may not have, on its own, any need for dynamic load balancing, and very few parallel programs are written so as to accommodate a run-time change in the number of processors they use. With virtual processes, each module may be written for a fixed number of (virtual) processes, while still allowing resources to be moved from one module to another in a multiphysics computation.

Similarly, consider a multiscale computation, where it may be necessary to spawn a new parallel module that refines the computation in one region, using a finer scale, more compute intensive method. With virtual processes, resources can be reallocated within a fixed partition to the spawned module.
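The effect of process virtualization on the multiphysics example can be sketched with a toy run-time scheduler. The greedy heuristic below (place the heaviest virtual process on the least loaded processor) is an assumption chosen for brevity and is not the algorithm used by AMPI. The function names and the virtual process names A0 to A3 and B0 to B3 are hypothetical.

```python
import heapq


def map_virtual_processes(loads, num_processors):
    """Greedy mapping: heaviest virtual process first, onto the least loaded processor.

    loads maps a virtual process name to its measured work per time step.
    """
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    mapping = {}
    for vp, load in sorted(loads.items(), key=lambda item: (-item[1], item[0])):
        total, processor = heapq.heappop(heap)
        mapping[vp] = processor
        heapq.heappush(heap, (total + load, processor))
    return mapping


def processor_loads(loads, mapping, num_processors):
    """Work per time step on each physical processor under a given mapping."""
    totals = [0.0] * num_processors
    for vp, processor in mapping.items():
        totals[processor] += loads[vp]
    return totals


def migrations(old_mapping, new_mapping):
    """Virtual processes that the scheduler has to move."""
    return sorted(vp for vp in new_mapping if old_mapping.get(vp) != new_mapping[vp])


# Modules A and B are each written for four virtual processes.
before = {name: 1.0 for name in ("A0", "A1", "A2", "A3", "B0", "B1", "B2", "B3")}
old = map_virtual_processes(before, 4)

# Module A refines its mesh: two of its virtual processes now do triple the work.
after = dict(before, A0=3.0, A1=3.0)
new = map_virtual_processes(after, 4)

print("unbalanced:", processor_loads(after, old, 4))
print("rebalanced:", processor_loads(after, new, 4))
print("migrated:", migrations(old, new))
```

Neither module changes its own process count. The scheduler frees capacity for module A by moving some of module B's virtual processes off the overloaded processors, which is the kind of cross-module rebalancing the text describes.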

In summary, PGAS languages may, with some needed enhancements, be quite useful as HPC implementation languages. Additional work is needed for such languages to support modern scientific codes; unfortunately, that work does not seem to be part of the DARPA HPCS agenda.

My discussion, so far, has focused on programming languages. However, it is important to remember that programming languages are only one of many contributors to programmer productivity: not the most important one, and not very significant in isolation. Research on the productivity of object oriented languages has shown that the use of OO languages does not contribute much to productivity, per se. Rather, OO languages contribute indirectly in that they encourage and facilitate code reuse and other useful programming techniques. It would be useful to submit newly proposed programming languages for HPC to that same test: In what way do they support more efficient software development processes?

By far, the most important contributor to software productivity is the quality and experience of the software developer. This, by itself, already suggests that “parallel programming for the masses” is misguided. One should not attempt to develop languages and tools so that Joe Schmo is able to program clusters or supercomputers. Rather, one should educate high-quality software engineers who understand programming for HPC, and provide enough rewards and stability to ensure that they stay in their profession and amass experience.

Software productivity is also heavily influenced by the quality of the process used to develop the software and by the quality of the tools and environments used by the software developers. It is important to understand what best practices in the development of HPC software are, and to ensure that these practices are broadly applied. While much of the knowledge from general software development will apply, scientific computing may need different testing and validation processes, and HPC computing may need a different emphasis and a different approach to performance tuning. One can hope that the DARPA HPCS program will result in advances in this area.

HPC software developers have traditionally used programming environments and tools that lagged behind those used in commercial software development. The HPC market has been too small to justify commercial investments in high quality HPC Integrated Development Environments (IDEs), and the government has not had the vision to support such development. Eclipse, the open source IDE framework that is now broadly used for Java development, offers a promise for change. Eclipse based IDEs for Java are as good as or better than any, and the open architecture of Eclipse supports the construction of IDEs for other languages and programming models. It has become possible to have a community effort that will create a modern, high-quality IDE for HPC. This work is already happening in national labs and universities.

One major contributor to the productivity of software developers is the availability of significant compute resources, so as to shorten the edit-compile-test cycle. The limited availability of interactive HPC platforms may be one of the most significant impediments to HPC software development. One should carefully weigh the right balance between the allocation of resources to production and the allocation to development. And one should ensure that HPC software development does not remain stuck in the era of batch processing.

In summary, there is no magic wand that will make software development for clusters or supercomputers significantly easier than it is now, just as no magic wand will make brain surgery significantly easier. The technology used in brain surgery continues to improve, enabling brain surgeons to perform more complicated surgeries, and improving the prognoses of brain surgeries. To the same extent, when we think of programming languages or tools that will enhance the productivity of HPC programmers, it is not very useful to focus on “HPC programming for dummies.” Rather, one should focus on better languages and tools for the HPC experts that will enable these experts to develop more complex or better performing software for HPC platforms.


Professor Marc Snir is the head of the Computer Science Department at the University of Illinois at Urbana-Champaign. He is currently pursuing research on parallel programming languages and environments, parallel programming patterns, and performance tuning patterns. He is also involved in the DOE funded Center for Programming Models for Scalable Parallel Computing. For more biographical information visit


By Tiffany Trader

Honeywell’s Big Bet on Trapped Ion Quantum Computing

April 7, 2020

Honeywell doesn’t spring to mind when thinking of quantum computing pioneers, but a decade ago the high-tech conglomerate better known for its control systems waded deliberately into the then calmer quantum computing (QC) waters. Fast forward to March when Honeywell announced plans to introduce an ion trap-based quantum computer whose ‘performance’ would... Read more…

By John Russell
