People to Watch 2023 – Mateo Valero

Mateo Valero

Director, Barcelona Supercomputing Center; Professor, UPC (Technical University of Catalonia)

Congratulations on your selection as a 2023 HPCwire Person to Watch. With MareNostrum 5 slated for imminent arrival, how are you looking forward to the new system transforming work at BSC?

The major advantages of the new system are the significant boost in performance for a capability system, from 13 PF peak to 314 PF, and its storage capacity, with 240 PB of disk space and 400 PB on tape.
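As a rough illustration of the scale of these figures, the short sketch below computes the ratio between the two peak numbers and the combined storage volume. The constant and function names are illustrative assumptions, not anything BSC uses; only the four numeric values come from the answer above.

```python
# Figures quoted in the interview; the names are hypothetical labels.
PREVIOUS_PEAK_PF = 13.0   # peak petaflops before the upgrade
NEW_PEAK_PF = 314.0       # peak petaflops of the new system
DISK_PB = 240.0           # disk capacity in petabytes
TAPE_PB = 400.0           # tape capacity in petabytes


def peak_ratio(old_pf: float, new_pf: float) -> float:
    """Return how many times larger the new peak figure is than the old one."""
    return new_pf / old_pf


ratio = peak_ratio(PREVIOUS_PEAK_PF, NEW_PEAK_PF)
total_storage_pb = DISK_PB + TAPE_PB

print(f"Peak performance ratio: {ratio:.1f}x")
print(f"Combined disk and tape: {total_storage_pb:.0f} PB")
```

This compares peak figures only. Sustained application performance depends on the workload and is not something these numbers predict.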

BSC is both a research center and an HPC service center, and MareNostrum 5 will transform both roles. On one hand, our research staff, made up of more than 600 people in the Computer Science, Life Science, Earth Science and Engineering Science domains, will be able to better address today’s societal challenges. Some examples are precision medicine through the human body digital twin, climate change through our strong participation in the Destination Earth digital twin, and energy through simulations for combustion, fusion and wind power. We expect that the research done at BSC using MareNostrum 5, in cooperation with scientists worldwide, will make the digital twin era a reality.

In addition, MareNostrum 5 will be part of the EuroHPC network and will provide peer-reviewed access to the best European researchers tackling scientific, industrial, and public administration challenges. BSC has been providing access to researchers from all over Europe since 2010, and with the increased performance and capacity, we expect users and their applications to take full advantage of the system. We will devote significant effort to tuning both the applications and the system, using our experience and tools to understand their behavior.

The MareNostrum systems have historically angled more towards experimental technology, which we understand is also the case for MareNostrum 5. What led to this emphasis, and what technologies will be tested using MareNostrum 5?

For many years, we have focused on providing the best services to our users and on enabling the advancement of technology and science. We always try to make the most appropriate technology available to our users, and to build future knowledge based on new technologies. We did this for Cell, ARM, KNL, NVIDIA, and AMD, working on the system software, the software stack, and applications. The advantage of this approach is that when a new technology becomes a product, our researchers are ready to go with their applications as soon as the system comes into production.

For this reason, we have a set of benchmarks and corresponding data sets for system selection. Of course, it includes HPL, as this significantly stresses many of the different system components, such as memory, processor, and networking. However, our goal in selecting a machine is not the position we achieve in the Top500. For example, in 2017 we decided to build MareNostrum 4 with four partitions: an Intel-based one with 11-plus PF peak for general purpose computing; a Power9+V100 one, similar to Sierra at LLNL, with 1.5 PF peak for AI workloads; a little Fugaku with 0.5 PF (half a rack); and an AMD one with MI50, a predecessor of Frontier, with another 0.5 PF. With the same investment and only a GPU partition, the system would have been in the top 5, but we considered at that time that most users were not ready for such a configuration.

On MareNostrum 5, the production clusters are based on Intel Sapphire Rapids for the general purpose partition and on an accelerated partition combining Intel Sapphire Rapids with 4 NVIDIA Hopper GPUs, while the new technologies include NVIDIA Grace and Intel Rialto Bridge. In the near future, we are looking to repeat the same process as in the past with these new technologies, as well as others such as RISC-V accelerators and quantum-computing-based systems.

BSC has close ties with Intel and RISC-V, which complements other EuroHPC efforts that are more closely aligned with EPI. Could you talk about the partnership with Intel in the context of the European ecosystem and BSC’s work?

BSC has been advocating the use of RISC-V in Europe for almost a decade. With the closed nature of other ISAs and the lack of a European alternative, it has become clearer over the last 5 years that RISC-V is the only option that Europe can leverage for autonomy. Likewise, as an open standard ISA, RISC-V is an excellent vehicle to enable hardware/software co-design for everyone, not just a few very large companies. From a co-design research perspective, RISC-V enables us to build real systems, resulting in rapid technology transfer due to the demonstrated results. By providing a common language between research and industry, RISC-V and the resulting implementations expand the ecosystem with guaranteed software compatibility, assuming you support the standard. The potential impact of RISC-V was so large that in 2019 we built the Laboratory for Open Computer Architecture (LOCA) as a mechanism to enable RISC-V ecosystem development at BSC and worldwide. Now, in Europe, we are seeing the rapid proliferation of RISC-V in the embedded industry and in research, with projects like EPI, EUPILOT, MEEP, eProcessor, ISOLDE, TRISTAN, and others. Europe, through EuroHPC JU and KDT JU, has aligned on supporting the RISC-V ecosystem and developing several roadmaps that will lead to a vibrant RISC-V ecosystem. Even though RISC-V has gained significant traction in the IoT and embedded space, we believe RISC-V can conquer the HPC domain as well. We founded the Special Interest Group for HPC in RISC-V International to enable the RISC-V ISA and ecosystem to also support HPC, working from the top down.

Both Intel and BSC have made a strong commitment to the RISC-V ecosystem. BSC has a long history of doing research with Intel. As we look to the future and define the next steps, it is only natural to continue research with Intel using the common language of RISC-V. We look forward to working with Intel to make RISC-V solutions widely available, from HPC all the way down to IoT. As Intel invests more in Europe, it will provide an opportunity for chips to be designed and made in Europe, from HPC to IoT. RISC-V and Intel provide two pillars that can support a new European semiconductor industry for HPC.

The traditional HPC market is undergoing substantial change, most notably blending in AI technologies with quantum, possibly, on the horizon. Where do you see HPC headed? What trends – and in particular emerging trends – do you find most notable? Any areas you are concerned about, or identify as in need of more attention/investment?

We have reached a technology convergence point where energy efficiency and silicon fabrication costs are moving us away from general purpose computing solutions to more energy efficient, specialized computing solutions for HPC. Chiplet technology and advanced System-in-Package integration are also lowering the cost of entry for building these HPC systems. Thus, it becomes economically viable to support a wider range of applications with Application Specific Integrated Circuits (ASICs). Given the heterogeneity of these future systems, system interfaces and data movement and management will require significant attention.

Energy efficiency continues to be a significant problem for HPC and AI. This is one of my main concerns for future systems. Furthermore, we need to refocus our attention on real workloads rather than benchmarks to produce systems with significant application improvements from generation to generation, especially for workloads that have sparse datasets. System efficiency in the single digits for sparse workloads must be addressed. Overall, our ability to consume compute far exceeds our ability to build systems that meet these compute requirements. For example, AI compute requirements continue to double every 4-6 months. Energy and system efficiency are the most significant HPC challenges, and this becomes even more complex when we combine the current trends towards heterogeneous systems, their interfaces, and optimizing/minimizing data movement.
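To put the doubling figure in perspective, the sketch below converts a doubling period in months into an equivalent growth factor per year. The function name is a hypothetical helper, and the calculation assumes steady exponential growth, which is a simplification of any real trend.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Growth factor over 12 months, assuming demand doubles every `doubling_months`."""
    return 2.0 ** (12.0 / doubling_months)


# The interview quotes a doubling period of 4 to 6 months.
for months in (4, 6):
    factor = annual_growth_factor(months)
    print(f"Doubling every {months} months -> {factor:.0f}x per year")
```

Under that assumption, a doubling period of 4 to 6 months corresponds to demand growing roughly 4 to 8 times per year.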

The future is bright! There are a variety of emerging technologies that can help address the compute requirements in different domains. This creates an HPC future composed of heterogeneous systems that can incorporate a variety of tools, everything from general purpose computing to a variety of accelerators, including quantum, neuromorphic, FPGAs, and custom ASICs. Data movement is the new optimization target, with the focus in the next generation of HPC moving away from compute alone to data at rest and data in motion combined with compute. We have moved past the AI inflection point: AI has clearly demonstrated its utility in almost every domain to improve, impact, or optimize outcomes. AI can help to build application understanding, create new algorithms, enable data-centric computer architectures, and much more. Furthermore, AI has the ability to impact many different domains and ideally help optimize everything from the algorithms down to the architecture. Like AI, quantum computing has changed how we think about certain problem domains. It has made possible what was impossible before. It is an excellent tool to add to the HPC toolbox. That is the most exciting part about being involved in HPC. We are investing in technologies that unlock new discoveries and capabilities for the world.

What inspired you to pursue a career in STEM and what advice would you give to young people wishing to follow in your footsteps?

When I was young, I loved mathematics and I was good at it. However, in Spain at that time they said that the only career opportunity for mathematicians was to be high school teachers. I had good grades and so I was encouraged to do engineering. I chose telecommunications (partly because I don’t know how to draw!). At that time there was only one school of communication engineering in Spain, and there was no computer science school. But in the fifth year of my degree, I was able to learn about circuits, and for my final degree project I built a microprocessor controller for an analog tape, that is, all the hardware and the software needed to read and write on an analog tape. Then, when I came here to Barcelona, I was lucky to meet Professor Thomas Lang, who inspired me to get into computer architecture. From then on, I have dedicated my life to doing research into the design of high-performance computers.

If I had to give advice to young people, I would say that the most important thing is to have a solid base – that could be in mathematics or in physics or other basic sciences – be prepared to work hard and be curious.

To use a biological metaphor, if you have these three things, like a tree with strong roots full of life and wanting to grow, you can graft new branches on, like I did, grafting on computer science and computer architecture to my base of mathematics and telecommunications.

Outside of the professional sphere, what can you tell us about yourself – unique hobbies, favorite places, etc.? Is there anything about you your colleagues might be surprised to learn?

My hobby is to work a lot, and not to stop! I love Barcelona (the most beautiful city in the world) and Barça, the football team, and I enjoy watching their games with my friends from the neighborhood. I am also a great fan of Mexico, which I consider my second home. I’ve been there over 100 times giving talks on supercomputing and trying to promote HPC-related activities there. I love the people, the history, the food, the music and the tequila! I am also a keen reader.
