With 2020 underway, we’re looking to the future of high performance computing and the milestones that are growing ever closer. Every year, HPCwire names its People to Watch list to foster a dialogue about our industry and give our readers a personal look at the hard work, dedication, and contributions of some of the best and brightest minds in HPC. These research efforts, accomplishments and technologies are shaping our future, and these are the people who are making it happen.
We present the HPCwire People to Watch 2020:
HPCwire: Please update us on market reception to the micro-Fugaku boards and the new systems: PRIMEHPC FX1000 and PRIMEHPC FX700. We saw there was a lot of excitement and anticipation at SC19.
As we announced recently, we received an order for the PRIMEHPC FX1000 from Nagoya University. Our new Arm supercomputers, the FX1000 and FX700, have attracted much attention for their performance as well as their memory bandwidth and low power consumption.
HPCwire: What is the implication of the A64FX prototype being #1 on the Green500? What factors account for a homogeneous Arm system rising to the top of the Green500 where heterogeneous (mostly GPU-based) systems were dominating?
We optimized the logic and the usage of SRAM for low power consumption. 7nm FinFET silicon technology and direct water cooling are also very effective for energy efficiency. The A64FX surpasses GPU systems even though the HPL benchmark used for the Green500/TOP500 is a floating-point-intensive program on which GPUs also perform effectively, with more than 80% computational efficiency, for example. It follows that an A64FX homogeneous system can run “real” applications with better energy efficiency, because a GPU’s energy efficiency will not be fully utilized for those applications.
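To make the efficiency arithmetic concrete, here is a minimal sketch of how the Green500/TOP500 figures are derived; the numbers below are purely illustrative, not Fujitsu’s measured results.

```python
# Illustrative Green500/TOP500 arithmetic (hypothetical numbers).
# Computational efficiency = Rmax / Rpeak; energy efficiency = Rmax / power.

rpeak_tflops = 100.0   # theoretical peak of a hypothetical system, in TFLOPS
rmax_tflops = 84.0     # measured HPL result, in TFLOPS
power_kw = 50.0        # average power during the HPL run, in kW

comp_efficiency = rmax_tflops / rpeak_tflops    # fraction of peak achieved
energy_efficiency = rmax_tflops / power_kw      # TFLOPS/kW, i.e. GFLOPS/W

print(f"HPL computational efficiency: {comp_efficiency:.0%}")   # 84%
print(f"Energy efficiency: {energy_efficiency:.2f} GFLOPS/W")   # 1.68
```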
HPCwire: What’s your perspective on the relationship between HPC and AI? What can you tell us about the role of AI/ML/DL, of GPUs and other advanced, accelerated technologies, in the Fujitsu strategy?
AI is a big part of HPC, and the combination of AI and traditional HPC, simulation for example, is also becoming important. There is a variety of possible combinations. The A64FX is very good at both AI and traditional HPC workloads because of its high memory bandwidth and SVE’s multi-precision SIMD functions. Fujitsu also provides x86 GPGPU clusters and solutions built on them. We are intensively studying system architectures employing domain-specific processors, such as annealers, for customer solutions. For a while yet, however, strong general-purpose CPU technology remains essential for a wide range of applications.
HPCwire: Can you reflect on the promise and potential for accelerated Arm, combining an Arm processor with GPUs or with FPGAs? Is that something you are exploring?
Because we care about standards in our design, you can use the A64FX’s PCIe connectivity for any purpose, including accelerators; the design space is not limited. Many ideas and combinations should be considered and evaluated for future processors, because Moore’s law is obviously slowing down. We should be careful about the tradeoff between the efficiency and flexibility of configurations, since applications for such accelerators are still quite limited. Another point of discussion is the difficulty of programming and tuning heterogeneous architectures. A64FX users can enjoy GPU-class performance and efficiency with superior productivity thanks to the homogeneous programming model.
HPCwire: Generally speaking, what trends and/or technologies in high-performance computing do you see as particularly relevant for the next five years?
Performance increases will continue because the demand is concrete. While providing leading-edge ‘traditional’ supercomputer technology, we are preparing to use emerging technologies. However, I’m not sure our customers can utilize them in their business within that time frame.
HPCwire: Outside of the professional sphere, what can you tell us about yourself – personal life, family, background, hobbies, etc.? Is there anything about you your colleagues might be surprised to learn?
I have gone cycling on weekends for more than ten years. Climbing a small hill leaves me exhausted, but this kind of intensive, concentrated workout is precious as one grows older. My children are becoming independent, so my wife and I are enjoying our life, traveling abroad, for example.
HPCwire: More on Toshiyuki
Mr. Toshiyuki Shimizu is Senior Director of the Platform Development Unit at Fujitsu Limited. Mr. Shimizu started at Fujitsu Laboratories with the research and development of the AP1000 massively parallel supercomputer system. His primary research interest is interconnect architecture, most recently culminating in the development of the Tofu interconnect for the K computer. He led the development of Fujitsu’s high-end PRIMEHPC supercomputer series and Supercomputer Fugaku. Mr. Shimizu received his Master of Computer Science degree from Tokyo Institute of Technology in 1988.
Dr. Werner Vogels is Chief Technology Officer at Amazon.com where he is responsible for driving the company’s customer-centric technology vision.
As one of the forces behind Amazon’s approach to cloud computing, he is passionate about helping young businesses reach global scale, and transforming enterprises into fast-moving digital organizations.
Vogels joined Amazon in 2004 from Cornell University, where he was a distributed systems researcher. He has held technology leadership positions in companies that handle the transition of academic technology into industry. Vogels holds a PhD from the Vrije Universiteit in Amsterdam and has authored many articles on distributed systems technologies for enterprise computing.
Prith Banerjee is Chief Technology Officer at ANSYS, a leader in engineering simulation. In this role, he leads the evolution of ANSYS’ technology and champions the company’s next phase of innovation and growth.
Previously, he was Senior Client Partner at Korn Ferry. Before that, he held a series of senior roles: Executive Vice President and Chief Technology Officer of Schneider Electric; Managing Director of Global Technology R&D at Accenture; Chief Technology Officer and Executive Vice President of ABB; Senior Vice President of Research at HP and Director of HP Labs; Dean of the College of Engineering at the University of Illinois at Chicago; the Walter P. Murphy Professor and Chairman of ECE at Northwestern University; and Professor of ECE at the University of Illinois.
In 2000, he founded AccelChip, which was sold to Xilinx Inc. in 2006. From 2005 to 2011, he was founder, Chairman and Chief Scientist of BINACHIP.
His research interests are in electronic design automation and parallel computing, and he is the author of about 350 research papers. He has also supervised 37 Ph.D. students.
Banerjee currently serves on the Board of Directors of Cubic Corporation and Software Motor Company. In the past, he has served on the boards of Cray, Inc., the Anita Borg Institute and the Computer Science Board of the National Academy of Engineering, and on the Technical Advisory Boards of Ambit, Atrenta, Calypto, and Cypress.
He was listed in Fast Company’s 2009 list of the 100 top business leaders.
He is a fellow of the AAAS, ACM and IEEE and a recipient of the 1996 ASEE Terman Award and the 1987 NSF Presidential Young Investigator Award.
He received a B.Tech. (President’s Gold Medalist) in electronics engineering from the Indian Institute of Technology, Kharagpur, and an M.S. and Ph.D. in electrical engineering from the University of Illinois, Urbana.
HPCwire: Hi Evan – can you tell us about your role at Azure, your areas of responsibility, what is most challenging and most fun?
Hello! I’m a Principal Program Manager in the Azure Specialized Compute team, where I lead our HPC-optimized Virtual Machine programs. My role is to drive the technical architecture, roadmap, and strategy of our H-series VMs optimized for massively scalable workloads. More practically speaking, my main responsibility is to ensure everything we’re thinking about and doing for the H-series programs aligns to a North Star of the best HPC performance, price/performance, and scalability we can possibly deliver to our customers. The most challenging aspect of my role here is maximizing impact across a global organization with a global customer base. That’s different from how I operated while working in the Private Sector Program at the National Center for Supercomputing Applications (NCSA), where my colleagues and even the customers were localized. Effecting the right change and outcomes in a global sense is quite challenging but also deeply rewarding. I love seeing other people and organizations succeed in their life’s work in part because of something we did right on Azure. The most fun aspect of my job is working creatively on advanced technology with a large group of diverse, passionate, and incredibly smart people. I’m constantly impressed by the unrivaled talent within the Azure team. Our culture is focused on innovation and supporting each other in that pursuit. It makes coming to work each day a pleasure.
HPCwire: In the last couple years, tier one cloud providers have intensified their adoption of HPC technologies, narrowing the gap between on-prem and cloud for HPC end users. What’s behind this and how does Azure differentiate itself as an HPC provider?
With time, cloud providers are better understanding the needs and impact of more workloads, including those leveraging HPC. Azure’s main differentiation in this space is that our HPC products and services are laser focused on the needs of this type of customer. For example, we’ve been very intentional about using InfiniBand to enable Azure to support massive-scale MPI and RDMA workloads with performance and efficiency that rivals or surpasses leadership-class supercomputers. Last year Azure became the first cloud to demonstrate the ability to support a 10,000-core MPI job, and later we disclosed that a Fortune 500 firm had run a 28,000-core seismic processing workload on Azure as well. This kind of capability does two things. First, it enables a small but high-value set of customers to migrate existing large workloads to the cloud. Second, it enables a much larger set of customers who haven’t had access to supercomputers to start running their small or medium-sized workloads at much larger scale. This drives faster time-to-solution, or higher resolution in their jobs. Both approaches can fundamentally change how an organization works, operates, and thinks about its innovation agenda.
HPCwire: How do you see the relationship between HPC and AI? Did AI propel HPC technologies in the cloud? Give us your sense of how the HPC cloud market is evolving and what’s finally driving it to the cloud.
I see “HPC” as the overall practice and “AI” as a catch-all term describing a newer set of data science workloads that are the latest in a long line of applications that have benefitted from the most powerful computers humankind has built. For a long time HPC was almost exclusively physics-based modeling and simulation. Eventually other domains came along that could be expressed and run efficiently on HPC environments. Today, AI/ML is the hottest new workload that also allows society to tackle some existing problems in creative ways, and entirely new problems for the first time. AI/ML won’t be the last “new” HPC workload, however. The world has proven pretty adept at finding novel ways of solving tough problems with supercomputers. That diversity is what keeps the HPC space vibrant and exciting. Yes, AI is responsible for a large portion of Tier 1 cloud provider adoption of HPC technologies. Cloud providers can operationalize AI at extraordinary scale to drive innovations into their core products and business. This augments the rationale to be in the supercomputing business, whether it’s providing this capability to internal teams (as our world-class Microsoft Research organization does) or external customers and platform partners.
HPCwire: What new/emerging technologies are you most closely tracking?
I pay attention to advances in packaging techniques and manufacturing capabilities. Silicon can’t be designed the same way anymore, and it’s shown in some vendors slipping their roadmaps, sometimes substantially. Stacking of silicon elements and how they’re all interconnected will be where much of the product innovation occurs. Organizations relying on HPC will need to base their strategies not on a vendor’s roadmap slides, but on a risk analysis of actual delivery and the cost to do so. Those who can execute against their plans, and those who partner with them, will have a decided advantage. From my perspective, I feel confident in Azure’s ability to execute against its roadmap with the technology collaborators that will contribute to our offerings. I’m also really interested to see how in-network intelligence improves. On Azure we’ve deployed InfiniBand technologies that do things like hardware-based offload of MPI collectives, which sounds complex but in practice allows customers to run their workloads faster and get better price/performance. The C-level always likes that one! I think this is just the tip of the iceberg and that network technologies will become more application-specific to certain high-value areas.
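For readers unfamiliar with MPI collectives, the sketch below shows the kind of operation being offloaded. This is a generic mpi4py example, not Azure-specific code, and the buffer sizes are illustrative.

```python
# A collective reduction of the kind in-network hardware offload can accelerate.
# Generic illustration; launch with, e.g.: mpirun -n 4 python allreduce_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank contributes a local buffer (a gradient, a partial sum, etc.)...
local = np.full(1_000_000, rank, dtype=np.float32)
result = np.empty_like(local)

# ...and Allreduce sums it across every rank. With in-network offload,
# the reduction happens in the switch fabric rather than on the host CPUs.
comm.Allreduce(local, result, op=MPI.SUM)

if rank == 0:
    print("first element of reduced buffer:", result[0])
```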
HPCwire: The exascale era is nigh as a system capable of 10^18 double-precision floating-point operations per second is expected to be fielded within the next couple of years. What does the exascale era signify to you and what role will the cloud play in exascale computing?
We actually entered the exascale era in 2018, when the team from Berkeley Lab and ORNL used Summit to accurately identify weather patterns using deep learning techniques. Different workloads simply need different levels of precision. I don’t think we should implicitly devalue meaningful computational science by not deeming it “exascale” just because the important problem it helped solve didn’t need FP64 math. Doing so is counter to a culture of inclusivity, and implies that workloads needing higher precision are a “universal standard” of sorts. Instead, we should think of the exascale era as a continuum of achievements from low to high precision, and celebrate each milestone along the way. Naturally, the lower-precision scenarios have gotten there first. The higher-precision scenarios will get there later, but of course that computational achievement is in some ways more difficult, and that makes it unique, too. I think that would be a better way to honor what is clearly going to be a multitude of “exascale” achievements. There’s no doubt, though, that large cloud providers will be significant players in the exascale era and the eras that follow. Extreme-scale AI training makes 10^18 calculation capabilities not just nice to have, but a necessity when you consider the transformative potential and the competitive advantage it can impart. A good example of this is our work in the field of artificial general intelligence, which has no visible ceiling to the amount of compute required.
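The precision tradeoff Burness describes is easy to demonstrate. The toy sketch below, which is not tied to any particular workload, shows why some computations demand FP64 while others tolerate far less.

```python
# Naively summing 10,000 copies of 0.1 in half vs. double precision.
import numpy as np

values = np.full(10_000, 0.1)

fp64_sum = values.astype(np.float64).sum()     # ~1000.0, negligible error

fp16_sum = np.float16(0.0)
for v in values.astype(np.float16):            # naive fp16 accumulation
    fp16_sum = np.float16(fp16_sum + v)        # rounding swallows small addends

print(f"float64 sum: {fp64_sum:.4f}")   # ≈ 1000.0000
print(f"float16 sum: {fp16_sum:.1f}")   # stalls near 256, far from the truth
```

Many deep learning workloads tolerate, and even exploit, the cheaper arithmetic; traditional simulation often cannot, which is the distinction behind the “continuum” framing above.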
HPCwire: Outside of the professional sphere, what can you tell us about yourself – personal life, family, background, hobbies, etc.? Is there anything about you your colleagues might be surprised to learn?
My very patient wife, Katie, and I are parents to a 2-year-old son, Calvin. Spending time with them keeps me grounded and reminds me what’s most important. I grew up in North Carolina and went to Duke University as an undergrad, so I’m a die-hard Blue Devil basketball junkie. Something colleagues would be surprised to learn about me? I can do handstands for long distances and periods of time.
HPCwire: More on Evan
Evan Burness is a Principal Program Manager in Azure Specialized Compute where he leads Azure’s efforts around the HB-series and HC-series Virtual Machines optimized for massive scale high-performance computing. Prior to joining Microsoft he worked at Cycle Computing as its Director of HPC. Before that he spent eight years helping to lead the Private Sector Program at the National Center for Supercomputing Applications. He holds an undergraduate degree in public policy from Duke University, and a graduate degree in business from the University of Illinois.
Dr. Dario Gil is the Director of IBM Research, one of the world’s largest and most influential corporate research labs.
IBM Research is a global organization with over 3,000 researchers at 12 laboratories on six continents advancing the future of computing. Dr. Gil leads innovation efforts at IBM, directing research strategies in Quantum, AI, Hybrid Cloud, Security, Industry Solutions, and Semiconductors and Systems.
Dr. Gil is the 12th Director in IBM Research’s 74-year history.
Prior to his current appointment, Dr. Gil served as Chief Operating Officer of IBM Research and the Vice President of AI and Quantum Computing, areas in which he continues to have broad responsibilities across IBM. Under his leadership, IBM was the first company in the world to build programmable quantum computers and make them universally available through the cloud. An advocate of collaborative research models, he co-chairs the MIT-IBM Watson AI Lab, a pioneering industrial-academic laboratory with a portfolio of more than 50 projects focused on advancing fundamental AI research to the broad benefit of industry and society.
A passionate supporter of scientific discovery and education, Dr. Gil is a Trustee of the New York Hall of Science, which provides schools, families and underserved communities in the New York City area with exposure to science, technology, engineering and math (STEM).
Dr. Gil received his Ph.D. in Electrical Engineering and Computer Science from MIT.
Koduri joined S3 Graphics in 1996. He became the director of advanced technology development at ATI Technologies in 2001. Following Advanced Micro Devices’ 2006 acquisition of ATI, he served as chief technology officer for graphics at AMD until 2009. He then went to Apple Inc., where he worked on graphics hardware, enabling Apple’s transition to high-resolution Retina displays for its Mac computers. He returned to AMD in 2013 as a vice president in Visual Computing, which includes both GPU hardware and software, unlike his pre-2009 role at AMD, which only concerned GPU hardware. AMD reorganized its graphics division in 2015, promoting Koduri to the executive level by naming him senior vice president and chief architect of the newly formed Radeon Technologies Group. In this role, Koduri reported directly to AMD CEO Lisa Su.
Koduri took a three-month break from his job at AMD in September 2017, with the intention of spending time with his family. He resigned from AMD on November 7. Two days later, he joined Intel, a competitor to AMD, as senior vice president of the company’s newly formed Core and Visual Computing Group. Matthew S. Smith of Digital Trends argued that this would feel like a stab in the heart for fans of AMD, noting that Koduri was “loved for his confident yet easy-going demeanor” and had become the unofficial face of AMD’s underdog image. In June 2018, Koduri announced Intel’s plans to compete with AMD and Nvidia in discrete graphics processing units, with a planned launch of its first GPU in 2020. In a March 2019 interview with Barron’s, Koduri mentioned Intel’s people and resources as his main reason for leaving AMD, described how he successfully headhunted former AMD and Apple engineer Jim Keller for Intel, and said that his visual computing group at Intel had 4,500 people.
Koduri is an investor and advisor at Makuta VFX, an Indian visual effects company which he compared to Pixar.
Source: Wikipedia
HPCwire: Hi Andrew — Cerebras generated a lot of excitement in 2019 with its Wafer-Scale Engine chip, but some skepticism too, which probably didn’t surprise you given the grandness and goals of the project. What is the feeling going into 2020 after your CS-1 system announcement at SC19?
No, it didn’t surprise us at all. We set out to do fearless engineering and to solve problems that have been unsolved in the industry to date. When you address 50-60 year old problems in this industry, you have to be ready for people to be both skeptical and surprised. I find there are always people in the industry who have a mouth full of “it can’t be done” and “it will never work,” and frequently, they don’t start companies. They’re often at big companies with long tenures under their belts, where they have become accustomed to doing things a certain way, and that’s okay, but that’s not who we are. We take great pride in solving problems that others can’t solve, that others were afraid to solve, and that others thought couldn’t be solved. Those are the things we’re proud of. We’re proud of our approach to solving previously unsolved problems. The last six months have been very exciting for Cerebras. In August, we announced our Wafer Scale Engine (WSE) chip, and in November at SC19 we announced our working systems and customer deployments. With a hard project like this, you wait four years and spend a huge amount of money to build and invent the technology, to get it into the hands of your customers, and to see what they can do with your vision. It’s interesting to see how our customers are able to layer their vision on top of our products, how they can solve deep learning problems in their various domains, and how they can invent new technology by using our CS-1 system.
HPCwire: What are Cerebras’ milestones and goals for 2020?
Our goal is to enable our customers to solve problems that are unsolved in their respective domains. We’ve conducted some initial work on cancer research with Argonne National Lab and with LLNL to bring AI to physics, simulation, etc. These are the things that bring us great pride and joy. We measure ourselves, our milestones and our goals, around making sure that our customers are successful.
HPCwire: What markets are important to Cerebras and how is the company’s product positioned in comparison to other deep learning technologies, like the GPU?
We target four markets. The first is large, complex enterprise customers. This includes pharma and manufacturing, and really the collection of “big dog” enterprise customers who use deep learning. The second is the hyperscalers. The third is the supercomputing market. The fourth is military and intelligence. We have opportunities and customers in all of those sectors, and our product is positioned directly against graphics processing units, with the WSE chip being between 100 and 1,000 times faster. We allow users to take their TensorFlow and PyTorch models and run them without modification on our system. Our customers are seeing reductions in training times—from weeks to minutes—and that is very powerful. It allows them to test new ideas at a greater rate, to train more models, and it reduces the cost of their curiosity.
HPCwire: Given that Cerebras is a systems company, hardware gets a lot of attention and focus. Can you highlight some of your software developments?
We are a systems company, and hardware gets a lot of focus. We did build the largest chip ever made, and not just by a little bit, but by a lot. It is fifty-six times larger than the largest GPU and has three thousand times more on-chip memory. Then we built a system around it. We don’t believe you can be successful as an accelerator chip company. You can’t build a Ferrari engine, put it in a Volkswagen, and expect Ferrari performance, so we built the entire system. We’re also crystal clear that software is fundamental to making our system easy to use. Software gives our customers insight into what’s happening so that they can improve their models and optimize their algorithms. This allows them to conduct their work more quickly. One of the things we see frequently in this category is that many deploy 100 or 1,000 GPUs, and that takes months. Then, they modify their models and test. Then, they modify again, and that takes seven months. Then they conduct a hyper-parameter search for a month, and when they’re finally done, they publish that it took thirty-two minutes to train a model. That’s not true. It actually took eight months and thirty-two minutes. It took all that time to do all of that software work: dividing the model, spreading it, and parallelizing it. This is very common in the supercomputing space. A huge amount of the time and effort is put in after the supercomputer is built. Our software has been designed to avoid this. Customers take their TensorFlow model and run it on the equivalent of between 100 and 1,000 GPUs by pressing “run.” Our software is designed to provide access to this enormous amount of compute without requiring you to modify your work in order to get the massive performance.
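To illustrate what “pressing run” refers to, here is an ordinary PyTorch training loop of the kind Feldman says should run unmodified. This is a generic sketch with a made-up model and random stand-in data, not Cerebras code.

```python
# A stock PyTorch training loop: no model splitting, no device placement,
# no parallelization logic. Generic illustration with synthetic data.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(64, 512)             # stand-in for a real data loader
    y = torch.randint(0, 10, (64,))      # stand-in labels
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```

The contrast Feldman draws is with large multi-GPU deployments, where the same logical loop must first be wrapped in distributed-training machinery and the model divided across devices.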
HPCwire: What is your perspective on the relationship and synergy between HPC and AI? What is the role of the CS-1 in this regard?
When it comes to the relationship between HPC and AI, I think the HPC world has been the leader in building very large clusters and has some of the most interesting problems. AI is a tool in their tool kit to attack a certain class of problems. AI is not a silver bullet and it doesn’t solve all problems, but there are classes of problems for which it is the perfect fit. We envision a world in which AI is an element in the HPC workflow. You already see exactly that: some work in a supercomputer is allocated to GPUs, some work is allocated to CPUs, and now, as they build these large supercomputers, other work is going to be allocated to AI accelerators. We’re going to have supercomputers with internal differentiation in their hardware so the right work is allocated to the right computational engine. We think the CS-1 is the right product for this. If you care about performance and you’re not playing with the biggest of big dogs in the performance world, there’s something out of alignment with your strategy. We are the fastest AI accelerator and, as a result, we chose to begin with the customers who had the biggest need for speed. Those were the customers with the biggest data and the biggest associated compute, and that was in the HPC world.
HPCwire: Having started several high-profile HPC companies, what are your guiding principles as an HPC entrepreneur and innovator?
I have started several companies, and we don’t aim at HPC specifically. We’ve always aimed at data centers. We’ve always aimed at building products for other professional users of products. The business of making consumer products isn’t my passion. I love having the best and the brightest as my customers and watching their ideas flourish on top of our ideas. Being Director of IT at my father’s house is arguably one of the most painful roles in my life. My dad calls and says my mother broke the internet by vacuuming, when what she really did was unplug the cable modem. Those aren’t good customers for me. Guys like Rick Stevens are good customers for me. The smartest guys in the pharma market are good customers for me, and these are people we enjoy building things for and with.

We really like being infrastructure builders. I began my career in the networking industry building switches and routers in the mid-90s. One of the great joys of being an infrastructure builder is having the conviction that if you accelerated communication and drove down its cost, good things would happen. Then, 15 years later, you watch 25-year-olds who have no idea of a world in which communication was expensive build WhatsApp, and you see it change the landscape and the social fabric. Many of us worked hard to drive down the cost of IP networking to $0 so that consumers could text anywhere for free and have free phone calls. I’m actually old enough to remember when it was $4 a minute to call Australia, which is where my family is from. My mom spoke to her mother three minutes a week. It’s laughable now, but that’s what you get with infrastructure.

We believe we are creating a compute platform that will enable a whole new type of AI for problems that can’t be worked on today. The goal is to make the problems that can’t be worked on today eventually run so quickly that they are transformed. When you do that, you’re opening the door for those with upper-layer ideas and ML applications to solve societal problems. We set out to allow other people to explore issues like cancer, clean water or even car accidents, which could save 40,000 lives every year.
HPCwire: Outside of the professional sphere, what can you tell us about yourself – personal life, family, background, hobbies, etc.? Is there anything about you your colleagues might be surprised to learn?
I met my wife at a tango class, and we continue to dance Argentine tango. My dog is a Vizsla, a rare Hungarian hound, and he takes me for long runs. I also picked up the guitar in my 50s and I’m very bad at it. However, I am enjoying it a great deal and it’s a lot of fun!
HPCwire: More on Andrew
Andrew Feldman is founder and CEO of Cerebras Systems, a startup dedicated to accelerating artificial intelligence (AI) compute. Cerebras is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types who have come together to build a new class of computer optimized for AI work. Prior to Cerebras, Andrew was founder and CEO of SeaMicro, a pioneer in energy-efficient computation that invented the microserver category and was acquired by AMD for $355 million. Prior to co-founding SeaMicro, Andrew was Vice President of Marketing and Product Management at Force10 Networks (acquired by Dell for $800 million) and before that was Vice President of Corporate Marketing and Corporate Development for Riverstone Networks (NASDAQ: RSTN) from inception through IPO.
Andrew is passionate about building teams that solve industry-transforming problems. He is a sought-after advisor to startups, and currently serves on the board of directors at Natron Energy and on the advisory boards of more than a dozen startups. Andrew is a frequent keynote speaker and guest lecturer at the Stanford Graduate School of Business. Andrew holds a bachelor’s degree and an MBA from Stanford University.
Amber Huffman is an Intel Fellow and Chief Technologist for Datacenter IP in the Silicon Engineering Group at Intel Corporation. She leads the definition of industry-leading IP building blocks (including memory and IO) for Intel’s datacenter products.
A respected authority on storage, memory and IO architecture, Huffman has used her expertise and influence to lead Intel and the storage industry toward the definition and adoption of fast, streamlined, highly power-managed and low-latency storage interfaces. She defined, created and drove the NVMe storage standard. This included forming and chairing the NVM Express (NVMe) Workgroup, a consortium of companies working to define a standardized interface for PCI Express-based solid-state drives. Huffman was lead author and editor on the NVMe specification. She continues to chair the board of directors for the NVMe Workgroup and the Open NAND Flash Interface (ONFI) Workgroup; both groups are coalitions of more than 90 technology companies.
Huffman has devoted her career to IO and memory interfaces since joining Intel in 1998. Her early work focused on Serial ATA (SATA) technology, the storage interface standard implemented in most PCs today. She developed prototypes and began leading and writing portions of the standard, earning a coveted Intel Achievement Award for her work. Huffman led the development of the Advanced Host Controller Interface (AHCI), which remains the standard programming interface for SATA today. Subsequently, she led the technical and industry development of ONFI, which standardizes the NAND Flash memory component interface and enables customers to use Flash from various hardware vendors. As with AHCI and NVMe, she served as lead author and editor on the ONFI industry specification.
Huffman earned a bachelor’s degree in computer engineering from the University of Michigan and a master’s degree in electrical engineering from Stanford University. She has been granted more than 20 patents in storage architecture. Huffman is known as a passionate mentor for technologists, including a strong track record of sponsoring numerous men and women to senior technologist positions within the company.
HPCwire: Hi Christine, congratulations on your selection as a 2020 HPCwire Person to Watch and, more importantly, as General Chair for SC20. Can you talk about your experience with the conference, where it’s been, and where you see it going?
Being a part of the SC planning committee has been an honor of mine for most of the past 20 years, starting with SC2001. Most of my SC positions have paralleled my career with the DoD—I volunteered largely in the technical program side for the first several years and then switched over to the more operational positions including infrastructure, communications, and exhibits. As we approach the conference’s 32nd year, the SC planning committee continues to focus on building a strong, relevant technical program, a vibrant exhibitor space, and a Students@SC program that will reach out to not only established HPC curricula but to the local Atlanta educational community as well. Our attendees often tell us that SC is the premier conference for networking with their peers in the HPC and HPC-related communities, and we are striving to attract a broad spectrum of professionals and students to the conference to ensure that there are quality opportunities to collaborate. Oh, and we plan to have food trucks nearby, too. Collaborating is hard work!
HPCwire: We’ve seen that the theme for SC20 is “More Than HPC.” Why was this theme chosen and what do you hope to inspire in the community?
For SC20 we want to focus on not just high performance computing itself but what drives it—the problems we’re trying to solve, the questions we’re hoping to answer, and the people and underpinnings that make every supercomputer accessible. We’re renewing a focus on the State of the Practice to give those who work behind the scenes the chance to share their wealth of knowledge with three full days of dedicated State of the Practice sessions. We’re also launching two initiatives to expose students to HPC and include them in the conference: the first is the HPC Immersion program which will provide a chance for conference engagement and involvement in HPC and HPC-related communities to undergraduate students who are traditionally underrepresented in these fields. The second is the HPC in the City program, which will engage leaders, decision-makers, educators, and students in the greater Atlanta area to work together and combine our collective talents and HPC capabilities to address problems that are relevant to the local community. More than HPC is meant to inspire cross-collaboration among all the communities involved in supercomputing, and remind us that our many viewpoints can combine to create some terrific advancements for the world around us.
HPCwire: Can you give us a peek into your world as the DoD Supercomputing Resource Center Director and what your role entails? How has the landscape changed for HPC since you’ve been there?
While I first began working in the DoD High Performance Computing Modernization Program (HPCMP) as a computational engineer in 1997, in 2013 I decided to take a leap into the operational and managerial side of the program as the HPCMP’s Associate Director for HPC Centers. In that role I led 350 people—none of whom worked directly for me—in the management, budgeting, and operation of five DoD Supercomputing Resource Centers (DSRCs) across the country. It was great preparation for the role of SC General Chair, in which I lead a team of over 600 people—none of whom work directly for me, either! In early 2019 I was able to come back to my ‘home’ center at the Navy DSRC as the director. This role has afforded me a bit more time to focus on strategic planning for the DSRC, which is challenging in a landscape of constrained budgets and merging US-based HPC vendors. I’m pleased to report that our most recent strategic decision landed us a Cray Shasta system that, at 290,304 cores and 12.8 PFLOPS theoretical peak, will be the largest system the HPCMP has procured to date, and the first HPCMP system to surpass the 10 PFLOPS threshold. None of this is possible without the extraordinary team at the Navy DSRC, whom I’m honored to lead.
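As a quick sanity check on those figures, theoretical peak decomposes into cores × clock × FLOPs per cycle. Only the core count and peak come from the interview; the per-core breakdown below is illustrative.

```python
# Relating the quoted core count to the quoted theoretical peak.
cores = 290_304
peak_pflops = 12.8

per_core_gflops = peak_pflops * 1e6 / cores     # 1 PFLOPS = 1e6 GFLOPS
print(f"{per_core_gflops:.1f} GFLOPS per core") # ≈ 44.1

# That per-core figure is consistent with, e.g., a core near 2.75 GHz
# retiring 16 double-precision FLOPs per cycle (illustrative numbers).
```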
HPCwire: Generally speaking, what trends and/or technologies in high-performance computing do you see as particularly relevant for the next five years? Also, what’s your take on near-term prospects for quantum computing and neuromorphic technologies?
It seems that HPC is always on the edge of an electrifying advancement, but we will soon enter a truly new era of computing. Exascale systems will arrive in the next year or two, opening the door for many innovations in materials science, precision medicine, additive manufacturing, and weather prediction, just to name a few. While exciting, we will also experience a notable increase in complexity, and more than a few anxious pangs as we consider industry’s plans to build the next era of systems. The eventual end of silicon CMOS scaling has prompted some to suggest quantum and neuromorphic computing as a logical next step. These technologies, however, only address a narrow set of problems, particularly over the near term.
HPCwire: What inspired you to pursue a career in STEM and what advice would you give to young people wishing to follow in your footsteps?
The short story is that I was exposed to STEM fields very early. The longer story is that my parents taught physics, computer science, and gifted classes in public school after both were in the co-op program with NASA during the moonshot; my father worked on the space shuttle structure and my mother was a computer programmer. In 1984 she bought us an Apple IIe and started teaching my brother and me how to program at the ages of 9 and 12, and well, here we are. For young people who are interested in STEM today, the standard advice about building a good, strong base of critical thinking skills and problem-solving abilities applies. But I most strongly advocate for learning how to communicate effectively to audiences of all types, because the better you are able to advocate for your project, your knowledge, and your abilities, the better chance you have of making a meaningful impact in your field.
HPCwire: Outside of the professional sphere, what can you tell us about yourself – personal life, family, background, hobbies, etc.? Is there anything about you your colleagues might be surprised to learn?
I grew up in an Army family and have lived in Mississippi most of my life, spending the last twenty years on the Mississippi Gulf Coast. Most colleagues would probably be surprised to know that I enjoy Cajun, Zydeco, and West Coast Swing dancing—they already know that I love a good Mardi Gras parade and am a huge Mississippi State University sports fan. I also enjoy being the wacky aunt to my two darling nephews and regaling friends with stories of the often colorful characters and situations I encounter while traveling.
HPCwire: More on Christine
Christine Cuicchi is director of the Navy Department of Defense Supercomputing Resource Center (Navy DSRC), operated by the Commander, Naval Oceanography and Meteorology Command (CNMOC) on behalf of the DoD High Performance Computing Modernization Program (DoD HPCMP). She leads the DSRC in the operation of $55M of supercomputing capability and the attendant storage, networks, and computational expertise which are available to over 2,500 DoD RDT&E, S&T, and acquisition professionals. Prior to her current position, Cuicchi was the Associate Director for HPC Centers for the DoD HPCMP, responsible for managing five DSRCs with an annual RDT&E budget of $100M and a government and contractor workforce composed of approximately 350 people. She was also responsible for an annual HPC system acquisition process of $50-60M. Cuicchi received both her bachelor’s degree in Aerospace Engineering and her master’s degree in Computational Engineering from Mississippi State University. In 2017 she received the U.S. Department of the Navy’s Meritorious Civilian Service Award.
HPCwire: What is AMD’s position going into 2020? With regard to AMD’s competitive standing in the datacenter server processor market, can you update us on AMD’s share of that market today vs. 12 months ago, and where you expect to be at the end of Q1 2021?
Since launching 2nd Gen EPYC™ CPUs in August 2019, AMD has made incredible strides in performance and workloads. We have achieved 140+ world records across HPC, Media & Entertainment, SDI & Enterprise, Big Data and Cloud workloads. We have an extremely strong ecosystem behind us. With our second generation we have doubled the number of EPYC-based platforms in market from numerous IHVs, ISVs and Cloud providers. We are on track to reach double-digit market share by mid-2020 and look forward to continued growth. We expect share will only get better with future generations of AMD EPYC CPUs. We are laser-focused on execution and we will continue to meet our commitments.
HPCwire: We’re closely following AMD’s lead in CPU process technology (AMD’s 7nm vs. Intel’s 10nm) – looking forward, what are some other future key areas of competition in server chip technology?
When it comes to CPU process technology, the game has changed. Starting with 7nm, instead of a lag in foundry transistor performance and density, there is now a level playing field. That puts even more emphasis on designing optimized solutions. To stay on a Moore’s Law pace and deliver performance and energy efficiency gains every 18-24 months moving forward, we must pursue additional innovations outside of process technology. We will continue to deliver performance and energy efficiency advances, and we have a very strong product roadmap for the future. AMD is driving innovations and advances across multi-chip architectures, scalable fabrics like our Infinity Architecture, memory, new packaging techniques, and, of course, software, including compilers, open source libraries and frameworks.
HPCwire: Within the datacenter server GPU market, please talk about AMD’s current position and ambitions/vision related to AI/ML/DL.
The data center GPU market is starving for competitive solutions and we are addressing this need. We are growing our Radeon Instinct™ product line and the enabling software. AMD recently released ROCm 3.0, the foundational open source software components for GPU compute, including support for new compilers and HPC applications. Moreover, AMD is optimizing CPU and GPU together for compelling data center performance. The Oak Ridge National Laboratory supercomputer Frontier, expected to be the world’s most powerful supercomputer when it arrives in 2021, features both AMD EPYC CPUs and Radeon Instinct™ GPUs. We are seeing emerging Machine Learning and analytics industry workloads requiring supercomputer-like configurations. This is a very exciting growth opportunity for AMD.
HPCwire: Generally speaking, what trends and/or technologies in high-performance computing do you see as particularly relevant for the next five years?
There is an insatiable demand for more compute capacity and efficiency. This trend will continue to drive innovative HPC solutions across the industry. The upcoming exascale-class computers are driving a new generation of computing capability critical for the advancement of any number of use cases, such as understanding the interactions underlying the science of weather, sub-atomic structures, genomics, physics, rapidly emerging artificial intelligence applications, and other important scientific fields. Pushing the bleeding edge of performance and efficiency requires new architectures and computing paradigms. We’ll see an increased mix of heterogeneous computing with accelerators, as portions of applications diverge from the processing strengths of traditional CPUs. GPUs, FPGAs, and purpose-built ASICs are ready examples. GPUs have been adapted with great success to execute neural networks, the deep learning algorithms behind so much of the recent progress in AI. Purpose-built ASICs carry this one step further; however, they are limited to specific use cases.
HPCwire: Outside of the professional sphere, what can you tell us about yourself – personal life, family, background, hobbies, etc.? Is there anything about you your colleagues might be surprised to learn?
My dad was a cancer researcher. He taught me to always ask why, and he exemplified how science can have a positive impact on society. That has always been my motivator, and I’ve been fortunate to work on very challenging problems and help create great products that have been impactful. It is easy to work hard when you love what you do. My work focus keeps me very busy, but my family keeps me balanced. I enjoy biking, skiing, and pretty much any family event.
I also look for opportunities to volunteer, and we are very engaged at AMD with the communities we live in. I’m a huge believer in education and am a long-time member of the University of Texas Cockrell School of Engineering Advisory Board and the Olin College Presidents Council, affording me the opportunity to mentor students, influence curricula and grow industry engagement.
All in all, I have always tried to work hard and have a lot of fun along the way.
HPCwire: More on Mark
Mark Papermaster is Chief Technology Officer and Executive Vice President of Technology and Engineering at AMD and is responsible for corporate technical direction, product development including system-on-chip (SOC) methodology, microprocessor design, I/O and memory, and advanced research. He led the re-design of engineering processes at AMD and the development of the award-winning “Zen” high-performance x86 CPU family, high-performance GPUs and the company’s modular design approach, Infinity Fabric. He also oversees Information Technology that delivers AMD’s compute infrastructure and services.
His more than 35 years of engineering experience includes significant leadership roles managing the development of a wide range of products, from microprocessors to mobile devices and high-performance servers. Before joining AMD in October 2011 as Chief Technology Officer and Senior Vice President, Papermaster was the leader of Cisco’s Silicon Engineering Group, the organization responsible for silicon strategy, architecture, and development for the company’s switching and routing businesses. He served as Senior Vice President of Devices Hardware Engineering at Apple, where he was responsible for iPod and iPhone hardware development. He also held several senior leadership positions at IBM overseeing development of the company’s key microprocessor and server technologies.
HPCwire: HPE’s acquisition of Cray was the biggest story in the HPC industry in 2019. So far, what’s been the impact on Cray operating within the HPE umbrella?
Overall, the integration process has been pretty seamless, which I feel really good about. We quickly aligned the teams and the road maps of the two companies, and have been working together as one team since day one. The biggest impact or surprise would be just how complementary the two companies were. Seeing as how we were competing in the same market, you would assume there’s a lot of overlap, but there really wasn’t much overlap for us. The typical 1 + 1 equals a lot more than 2 is definitely in play for us.
A big opportunity for us is tapping into the broader HPE technology portfolio, which includes Hewlett Packard Labs and other businesses as well as the broader coverage of customers and countries.
HPCwire: From your vantage points, what’s been the biggest advantage of the acquisition, and what’s been the biggest challenge?
The biggest advantage is very straightforward; it’s the scale of HPE. Everything from customer coverage, to supply chain, to technology, application breadth and industry knowledge. That’s something you can never have in a smaller company, so that has been a massive advantage. On the challenge side, Cray was a small and focused company and HPE is big and broad, so how to combine the two, especially from a process perspective and how we do business, is always going to be a challenge because of the breadth and the size.
HPCwire: Please discuss key areas of technology Cray has been working on in concert with HPE and how that will impact the resulting HPE/Cray HPC product strategy.
We have a number of super cool projects in progress that I’m pretty fired up about, but I’m not yet ready to talk about them publicly.
One example I can share is that we’re taking some key HPE technologies like the HPE Container Platform, HPE GreenLake Central and HPE BlueData and integrating them with Cray technologies to deliver HPC and AI solutions to customers as a service, and to deliver them in an open framework. I think that’s pretty exciting stuff, and over time, we’re going to have some really interesting things based on this.
HPCwire: Generally speaking, what trends and/or technologies in high-performance computing and related fields (e.g., AI) do you see as particularly relevant for the next five years?
Wow, how much time do you have here? To name just a couple: the integration of modeling and simulation with AI and analytics is clearly on everyone’s minds. That integration is going to happen everywhere, whether within a single workload, a workflow, or a single application, and we’re definitely seeing it in a huge way. There’s rarely a customer discussion I have where we just stay on the traditional modeling and simulation workloads and don’t venture out into AI and analytics.
It’s very top of mind with everybody, and they want to know:
How do we run these?
What platforms do we need to run these various algorithms and how can we run them all in the same place, on the same platform, with the same storage environment to make this seamless for our users?
How do we integrate data between these different areas?
That’s a really big thing that’s going on in the industry right now.
The second big thing is that companies all over the world – and this is especially true of commercial companies – are trying to deal with their data growing like crazy. A lot of that data is now being generated out on the edge, and with that, customers are dealing with a whole new environment that they aren’t ready for.
There are a lot of new applications for processing this data in different ways, from modeling and simulation to AI, IoT, cybersecurity and 5G. So as companies think about how they’re going to go about their digital transformation, which is what pretty much every company is working through in one way or another right now, they are realizing that these new applications and this plethora of data either don’t run well on their traditional infrastructures or can’t scale out with commodity cloud-like infrastructures.
To address this, we’re starting to see more people considering a role for supercomputing-style technologies in helping them deal with this data growth and these new application needs. We talk about this a lot in terms of Exascale Era technologies: the same technologies that we’re building into huge Exascale machines can also be leveraged by commercial corporations in a single cabinet, or even a single chassis, to help them deal with these new applications and algorithms. This is a really big thing that a lot of companies are talking about right now, because we know people are struggling with what they currently have to work with, and we’re excited to help because we know we can provide some part of the solutions that they need.
Another trend I’m seeing is that people, especially in commercial companies, are trying to provide their users with a cloud-like experience in their HPC and AI infrastructures. We’ve been working on this for a while, and one of our mottos is that with our technologies, we want to make systems perform like supercomputers and run like the cloud. This is something I think is going to be very important as we think about the broader set of users that will need to use these machines because of the growth of data and the breadth of applications that now need access to HPC capabilities.
I’m pretty excited about these trends and think we’re very aligned with bigger technology needs, because they’re playing into things that we’ve been working on for a number of years and are starting to come to fruition.
HPCwire: Outside the professional sphere, what activities, hobbies or travel destinations do you enjoy in your free time?
With everyone’s busy schedules, my biggest priority is to just spend time with my family, which includes my wife, three kids and our dog. I like to make time for them, especially on the weekends when I’m not on the road.
Whenever I do get some free time, I try to fight my age a little bit and do my best to exercise and keep active.
HPCwire: More on Peter
Peter Ungaro is Senior Vice President and General Manager of the High Performance Computing and Mission Critical Solutions business, which includes Artificial Intelligence and Edge, at Hewlett Packard Enterprise (HPE), a role he has held since HPE’s acquisition of Cray Inc. in September 2019. Prior to the acquisition, Ungaro had been President and Chief Executive Officer of Cray since 2005.
Ungaro was named CEO of the Year for 2006 by Seattle Business Monthly magazine for his leadership in turning around the company, and was named one of the “40 under 40” by Corporate Leader Magazine in 2008. In 2013, Bloomberg named him #4 on its list of “Top Tech Turnaround Artists” for generating a total shareholder return of over 361% since becoming CEO of Cray. Ungaro joined Cray in 2003 as the Vice President responsible for sales, marketing and service, and was later promoted to Senior Vice President.
Ungaro was appointed to the United States Department of Commerce’s Manufacturing Council and served from 2010 to 2012. The Council advises the Secretary of Commerce on matters relating to the competitiveness of the manufacturing sector and government policies and programs that affect U.S. manufacturers.
Before joining Cray in 2003, Ungaro served as Vice President of Sales for Worldwide Deep Computing at IBM. In that role, he led global sales of all server and storage products for high performance computing, life sciences, digital media and business intelligence markets. He held a variety of other sales leadership positions after joining IBM in 1991. Ungaro received a B.A. in business administration from Washington State University.
HPCwire: Long a leader in HPC storage, DDN has also become an aggressive acquirer of technology – Tintri, IntelliFlash (from Western Digital) and Nexenta – and pushed into broader enterprise and AI markets. How do the recent acquisitions fit into DDN’s long-term strategy, maybe summarize the strategy, and should we expect more such acquisitions?
Over the last two decades, DDN has developed a strong reputation and a significant global footprint in the areas of AI, Big Data, Multicloud, and High-Performance Computing. Our customers’ sustained trust in us and our products has helped us become and remain the largest privately-held data storage company in the world. With more than 10,000 customers, 20 exabytes of value-add storage solutions delivered to many of the most demanding data-centric organizations, government and research facilities in the world, 1,000 DDN employees, most of whom are in R&D and customer-facing technical areas, and 140 patents, we continue to push the boundaries of innovation to always exceed our customers’ needs and wants.

Our recent acquisitions of Tintri, IntelliFlash from Western Digital, Nexenta, and Intel’s Lustre file system division have added powerful virtualization, predictive data analytics, all-flash, container technology, file system, and software-defined storage to our product portfolio. These acquisitions have also significantly broadened our market reach into the Enterprise, multicloud data delivery, IoT, and Telco 5G, and very positively contributed to our sustained rapid and profitable growth. We can now bring to our customers the most comprehensive set of data-centric, AI-enabled solutions, a product portfolio that we have unified under the banner of Intelligent Infrastructure for a Changing World. What DDN’s Intelligent Infrastructure solutions deliver is the utmost flexibility, speed at any scale, and data insight on-premises and in multicloud settings. Think of it as freedom from vendor lock-in and from locality, the ability to effortlessly and reliably manage complex and distributed data sets, and AI enablement to maximize and accelerate business insight.

Organizationally, we have created two divisions – the DDN at Scale division and the Tintri Enterprise division. Both divisions roll up to DDN’s executive team and its co-founders, Alex Bouzari and Paul Bloch. Our DDN at Scale division focuses on developing and delivering the highest-performing and most innovative data-at-scale and high-performance solutions. We emphasize performance and cost optimization, the highest scalability across multiple dimensions, as well as software and analytics enablement. Our Tintri Enterprise division unifies the Tintri, IntelliFlash and Nexenta product portfolios and delivers the most fully-featured, AI-enabled, robust data solutions for the Enterprise. There is a special focus on offering the highest reliability and resilience, self-diagnostics and predictive data set management, as well as real-time application analytics. Both our DDN at Scale and Tintri Enterprise divisions are developing products and services which are highly enabling for transparent data management, effortless data movement between on-premises and public clouds in a true multicloud sense, and extremely flexible and efficient data movement.

With our Intelligent Infrastructure for a Changing World focus and some of the highest R&D investments in the industry, we aim to continue to bring to our customers the best solutions for their complex data challenges for many years to come. Innovation and best-in-class technology, outstanding go-to-market delivery and, most importantly, exceptional customer delight are at the forefront of our minds now and always.
HPCwire: Storage technology is always advancing but perhaps receives less attention than it should. Taking a slightly longer-term perspective (2-5 years): 1) what disruptive ‘storage’ technologies are you tracking and fully expect to show up in products, and 2) what technologies look intriguing to you but are perhaps more speculative and just bear watching at this point?
Data movement, automation, and management of the “data sphere” are getting more complicated. You have edge devices generating endless amounts of data, applications that demand both flexibility and high performance, along with a desire to realize the promise of third-party hosted cloud. We have been developing highly innovative technologies to address these challenges in the best way possible.
Our workflow automation technology, coupled with significant improvements in overall data portability across disparate environments, will help our customers manage their data in ways that are simply not possible today. Advances in storage architectures, networking, and multicloud management all play a role, but we believe the central pillar is flexible and transparent deployment models, enhanced and supported by intelligent data set infrastructures that abstract these complexities behind the scenes. When data platforms can anticipate customer needs, data is available when it is needed, on the most efficient platform, without administrative intervention. With all the talk in HPC about Exascale and how we get there, we’ve been thinking of Exascale as a “now” problem for IO. And it is not just for HPC; Enterprise customers will need Exascale solutions just as much as those in high-performance computing. That is the problem DDN’s Intelligent Infrastructure for a Changing World is solving!
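To make the idea of a platform that “anticipates need” concrete, here is a minimal, purely illustrative sketch of an automated placement policy: hot data sets are promoted to flash and cold ones demoted to object storage, with no administrator in the loop. The tier names, thresholds, and the move_dataset() helper are hypothetical stand-ins for this sketch, not a description of DDN’s actual software.

```python
# Illustrative only: a toy placement policy that "anticipates need" by
# promoting hot data sets to fast storage and demoting cold ones.
# Tiers, thresholds, and move_dataset() are hypothetical.
import time
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    tier: str                              # "nvme", "disk", or "cloud"
    last_access: float = field(default_factory=time.time)
    accesses_per_hour: float = 0.0

HOT_THRESHOLD = 100.0                      # promote above this access rate
COLD_SECONDS = 7 * 24 * 3600               # demote after a week of inactivity

def move_dataset(ds: Dataset, target_tier: str) -> None:
    """Stand-in for a real data mover (copy, verify, switch namespace entry)."""
    print(f"moving {ds.name}: {ds.tier} -> {target_tier}")
    ds.tier = target_tier

def rebalance(datasets: list[Dataset]) -> None:
    """One pass of the policy loop: no administrative intervention required."""
    now = time.time()
    for ds in datasets:
        if ds.accesses_per_hour > HOT_THRESHOLD and ds.tier != "nvme":
            move_dataset(ds, "nvme")       # hot: keep on flash
        elif now - ds.last_access > COLD_SECONDS and ds.tier != "cloud":
            move_dataset(ds, "cloud")      # cold: park in object storage

if __name__ == "__main__":
    data = [Dataset("training-set", "disk", accesses_per_hour=500.0)]
    rebalance(data)                        # -> moving training-set: disk -> nvme
```

A real system would of course drive such decisions from richer telemetry and verify every move, but the loop above captures the basic contract: placement follows observed access patterns rather than tickets filed with an administrator.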
HPCwire: The traditional HPC market is undergoing substantial change, most notably blending in AI technologies. This is further complicated by growing enterprise adoption of HPC/AI capabilities. What’s your take on the new HPC? How will HPC writ large, and HPC storage in particular, change given the new dynamics?
The “new HPC” is obviously a huge opportunity for DDN. We have already deployed AI-enabled DDN solutions at massive scale in a variety of markets and use cases one could characterize as very high performance. Think deep learning, image analysis, natural language processing, and other AI technologies at enormous scale and, most importantly, in production. Our Intelligent Infrastructure platforms are now significantly easier for the non-technical market to consume, which means solutions that are much easier to architect, deploy, and manage in Enterprise IT environments. It’s a true win/win: at-scale efficiency with Enterprise ease of use and reliability. We are making all this even easier by layering in data analytics software that makes the infrastructure itself more intelligent.
HPCwire: The exascale era is nigh, with an exaflops system expected in the next two years – what are the demands on high-performance storage? How is DDN enabling/leading this transition?
We see Exascale as a requirement we are addressing and delivering solutions for today. In DDN product development that started more than seven years ago, and which is now in production, we anticipated the need for massive IO flexibility: really wide combinations of small random IO and large streaming workloads. These products and solutions are being enhanced with platform flexibility, data portability and management, as well as collaboration. One example is our flash and SSD solutions integrating with GPU environments, such as our work with NVIDIA on Magnum IO and GPUDirect Storage, which significantly streamlines the data path from storage to compute. We are making the underlying infrastructure orders of magnitude faster and more efficient, while also solving the challenges of easily managing and processing data sets at Exascale size. Integrating sophisticated data management capabilities into our filesystem products is one additional step in this direction, and you will see new technologies from DDN that make administration and movement of data at scale much simpler.
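As a concrete illustration of that streamlined storage-to-GPU data path, here is a minimal sketch using kvikio, NVIDIA’s Python bindings for the cuFile API that underlies GPUDirect Storage. The file path is a hypothetical placeholder and this is not DDN-specific code; on a filesystem and driver stack that support the direct path, the read DMAs from storage straight into GPU memory, bypassing a CPU bounce buffer.

```python
# Minimal GPUDirect Storage sketch using kvikio (NVIDIA's Python
# bindings for the cuFile API). The file path below is a placeholder.
import cupy
import kvikio

# Destination buffer lives in GPU memory (a CuPy array), not host RAM.
buf = cupy.empty(1 << 20, dtype=cupy.uint8)   # 1 MiB

f = kvikio.CuFile("/mnt/fastdata/checkpoint.bin", "r")
nbytes = f.read(buf)   # on a supporting stack, data DMAs directly to the GPU
f.close()

print(f"read {nbytes} bytes directly into GPU memory")
```

The point of the design is that the application keeps an ordinary read call; the decision to skip the host bounce buffer is made underneath, in the filesystem and cuFile driver.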
HPCwire: Generally speaking, what trends and/or technologies in high-performance computing (and related fields, such as AI) do you see as particularly relevant for the next five years?
The needs of data-centric organizations continue to grow, and the number and types of companies that consider themselves data-centric are growing significantly. Companies are catching on to the massive data requirements of building AI-enabled models that better address their business and research needs. We are also seeing an increase in the adoption of multicloud architectures. Over the next few years, the key will be facilitating this data movement intelligently and really simplifying how data gets to end users when they need it, in the most efficient and reliable way possible.
HPCwire: Outside the professional sphere, what activities, hobbies or travel destinations do you enjoy in your free time?
I greatly enjoy travel, which coincides wonderfully with having a global customer base – everywhere, all the time!
HPCwire: More on Alex
Alex Bouzari is a visionary IT leader with over twenty-five years of experience founding and managing profitable, high-growth technology companies. Prior to co-founding DataDirect Networks, now the world’s largest privately held storage company, Alex served as CEO of Personal Writer, Inc., and was a co-founder of MegaDrive Systems, Inc. Alex holds Bachelor of Science degrees in Engineering and Economics from the California Institute of Technology (Caltech), with graduate studies in Engineering at the Massachusetts Institute of Technology (MIT) and Stanford University.