Scientists engage 50 thousand “cloud-based” GPUs for an astrophysical experiment

HOSTKEY
Published Jan 27, 2020 · 5 min read

GPU cloud computing platforms may well compete with traditional supercomputers. This was demonstrated by a joint experiment run by the San Diego Supercomputer Center and the IceCube Neutrino Observatory.

The experiment harnessed more than 50 thousand available accelerators from cloud platforms across North America, Europe, and Asia.

Statistical Test Results: GPU types and performance growth dynamics

In total, about 80 thousand NVIDIA Tesla V100 accelerators are now available in the cloud. The experiment engaged essentially the entire pool of heterogeneous accelerators available for rent at the time: 51,500 units in all. They were combined into a single complex using HTCondor software.
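HTCondor pools like this are typically fed many small, independent jobs. The submit description below is a purely illustrative sketch of how one simulation segment might be described to HTCondor; the executable name, resource values, and segment list are assumptions, not details from the actual experiment.

```
# Hypothetical HTCondor submit description for one simulation
# segment; all names and values here are illustrative only.
universe       = vanilla
executable     = run_segment.sh
arguments      = $(segment_id)
request_gpus   = 1
request_cpus   = 1
request_memory = 4GB
output         = segment_$(segment_id).out
error          = segment_$(segment_id).err
log            = segments.log

# Queue one job per segment listed in a text file.
queue segment_id from segments.txt
```

Because each job is independent and short, a preempted cloud instance costs only a few minutes of lost work, which is exactly the property the experiment relied on.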

The experiment began on November 16th, 2019 and lasted about 200 minutes. It pursued three main goals: to find out whether raw computing power could be assembled this way; to measure the real extent of GPU availability in the clouds; and, finally, to solve a real scientific problem. The graph clearly shows how the power of the “cloud supercomputer” grew; it peaked at around 110 minutes at roughly 350 PFLOPS of FP32 performance. For comparison, the leader of the TOP500 list, the Summit supercomputer, delivers about 400 PFLOPS at the same precision.

All segments of the overall task were optimized for the specific features of each of the eight models of available NVIDIA accelerators. Each segment ran for no more than 15–30 minutes, minimizing the work lost if an instance was reclaimed by the provider due to a sudden spike in demand. The cost of an equivalent run is estimated in the range of $120 to $150 thousand for the first day of calculations.
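A back-of-the-envelope calculation shows how an estimate in that range might arise. The hourly prices below are assumptions for illustration, not figures from the experiment.

```python
# Back-of-the-envelope cloud cost estimate for a GPU burst run.
# The per-GPU-hour prices are illustrative assumptions, not
# figures published by the experiment.

def burst_cost(gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Total cost of renting the given number of GPU-hours."""
    return gpu_hours * price_per_gpu_hour

# Roughly 51,500 GPUs for about 2 hours, at an assumed average
# preemptible price of $1.2-1.5 per GPU-hour.
gpu_hours = 51_500 * 2  # 103,000 GPU-hours

low = burst_cost(gpu_hours, 1.2)   # ~ $123,600
high = burst_cost(gpu_hours, 1.5)  # ~ $154,500
print(f"${low:,.0f} - ${high:,.0f}")
```

At those assumed rates, a two-hour burst already lands near the quoted $120–150 thousand figure, which is why short, preemptible segments matter so much for cost control.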

The experimental calculations used data from the IceCube Neutrino Observatory. It is the world’s largest neutrino detector, located at the Amundsen-Scott station in Antarctica, with an array of 5,160 highly sensitive optical detectors deployed in boreholes at depths from 1,450 to 2,450 meters.

In 2017, the array made it possible for the first time to detect ultrahigh-energy cosmic neutrinos and trace them back to their source. Over the course of the experiment, the team ran a simulation on a volume of IceCube detector data that would normally take about a month to process.

The field of multi-messenger astronomy is currently developing rapidly. It comprehensively studies everything astronomical objects can emit, from electromagnetic radiation to gravitational waves and elementary particles. Such astronomy, however, requires processing huge data arrays.

The experiment showed that cloud systems are suitable for such purposes and allow serious capacity to be deployed very quickly, which is vital for projects with tight deadlines.

Although the initially planned 80 thousand NVIDIA Tesla V100 accelerators were not reached, invaluable experience was gained that should pave the way for the widespread use of GPU cloud services in other scientific projects.

But how can projects with a smaller budget get access to high-performance GPU servers and GPU clouds? There is an option that offers GPU computing at a significantly lower price than Google Cloud, AWS, or Microsoft Azure, with comparable performance.

HOSTKEY, a premium web service provider, offers some of the most affordable GPU servers for supercomputing, based on NVIDIA GTX 1080 / GTX 1080 Ti / RTX 2080 Ti video cards.

Solutions like these make it possible to offer supercomputer-class performance to scientific projects at a much lower rate than the AWS, Google Cloud, and Azure platforms.

We trained a popular cats-vs-dogs image classification model. This test demonstrated the differences in training time compared to other cloud providers.

Here’s a chart showing the results by instance with the specs of the server.

Let’s plot Cost per Hour against Time to Train.

The bottom left corner is where you want to be: an optimal ratio between cost and training time. HOSTKEY’s GTX 1080 Ti × 1 and RTX 2080 Ti × 1 are strong options.

If you are looking for higher speeds, HOSTKEY’s RTX 2080 Ti × 8 is the fastest of the options compared in the test, and even so its cost to train is not substantially higher.

If you are on a tight budget, the Azure K80 will do the job, just not as fast. Its price is comparable to HOSTKEY’s GTX 1080 × 1, keeping in mind that the GTX 1080 is 7.5 times faster!

Below is the comparison chart with the Cost to Train results. Azure offers the cheapest solution, but it is also the slowest. With a modest additional investment, HOSTKEY is by far the most cost-effective option.

All solutions offered by HOSTKEY show a significant speed advantage.

You can get results substantially faster on HOSTKEY GPU servers, with significant savings.

Furthermore, one of HOSTKEY’s advantages is that the platform offers not only virtual GPU servers with dedicated resources, but also dedicated servers and GPU private clouds of any complexity, tailored to the project’s workload.

HOSTKEY supports research projects and promising startups that contribute to the development of practical services. We offer free grants to projects and provide free GPU servers for solving specific problems. To apply for free computing power, request a free trial GPU server on hostkey.com or write to the marketing department at marketing@hostkey.com

Source 3dnews.ru
