For nearly five years Lomonosov-2 has been Russia’s top-ranked supercomputer. In a new paper, a team of researchers from Lomonosov Moscow State University (MSU) – where Lomonosov-2 is hosted – elaborated on the supercomputer’s history and specifications, as well as the software tools they have developed.
MSU has been a hub for some of Russia’s most powerful computing systems since the installation of Strela, Russia’s first mass-production computer, in 1956. Over the intervening 63 years, MSU has seen a parade of new systems; in 1999, it deployed its first computing cluster, and a decade later, the first “Lomonosov” system (named after Russian polymath M. V. Lomonosov) was installed. The current iteration of Lomonosov “1” is benchmarked at 902 Linpack teraflops, owing to more than 12,000 Intel CPUs and 2,130 Nvidia Tesla X2070/X2090 GPUs.
In 2014, MSU saw the installation of the first of four stages of Lomonosov-2. By 2018, the fourth and final stage of the successor system had been installed.
Lomonosov-2 features Intel Xeon E5-2697v3 14-core and Gold 6126 12-core CPUs, along with Nvidia K40 and P100 GPU nodes, and 2.5 PB of system memory. Just last month, the system received an upgrade: an additional 0.5 PB of storage, bringing it to 3 PB.
Spanning 1,679 compute nodes, connected by InfiniBand FDR, Lomonosov-2 is hosted in seven racks: six racks with 256 nodes each, and a seventh rack with 160 nodes and the InfiniBand and Ethernet switch systems. The system is divided into three partitions: Compute, Test, and Pascal. The resources are distributed among the nodes as shown below:
Lomonosov-2 utilizes the CentOS 7 operating system and makes available several versions of Open MPI and CUDA, as well as a host of other utilities, as seen in the chart below:
From its initial installation in early 2014, when it delivered 320 Linpack teraflops, to the present, Lomonosov-2 has held a place on the Top500 list, earning its highest spot (#22) in late 2014 after its second stage was delivered. Lomonosov-2 now ranks at #93 with 2.5 Linpack petaflops.
Since it overtook Lomonosov in November 2014, Lomonosov-2 has been ranked as the most powerful supercomputer in Russia. The country’s second-fastest machine, a Cray XC40 installed at the Federal Service for Hydrometeorology and Environmental Monitoring, offers 1.2 Linpack petaflops and ranks #364 on the latest Top500 list. MSU’s “Lomonosov” supercomputer remains in third place with 902 teraflops, per Russia’s “Top50” list.
The paper, published in the Journal of Supercomputing Frontiers and Innovations, also reviews the software tools that MSU computational scientists have developed to manage the complexity of the center’s supercomputers. The main components are:
• Octoshell — HPC center management system
• DiMMoN — a system for deep monitoring of supercomputer parameters
• Octotron — a system to ensure reliable and autonomous functioning of supercomputers
• JobDigest — a visual tool to analyze the dynamic characteristics of parallel applications
• an expert software system to bring fine analytics on parallel applications and the entire supercomputer to users and sysadmins
The authors note that the software systems are actively used on the Lomonosov-2 supercomputer, providing operational data for users and administrators of the supercomputer center. “The increasing complexity of computer architecture and the growth of the degree of parallelism are characteristic features that are typical for all, without exception, modern large supercomputer systems,” they write.