Online Programming Learning Sites Can Be Manipulated By Hackers To Launch Cyberattacks

Published in ProferoSec · 7 min read · Jul 7, 2022

Introduction

Hackers commonly launch their attacks from compromised machines rather than directly from devices they own, which allows them to conceal their origin. In a recent incident response engagement, Profero’s Incident Response Team investigated a scenario in which we assumed that threat actors had used DataCamp’s online IDE to launch an attack against a cloud infrastructure. That assumption turned out to be a simple misattribution, a mix-up between DataCamp, the ISP, and the online IDE. Still, we were intrigued by the idea of using cloud IDEs to hide the origins of an attack and initiated a research project to explore this strategy.

Online IDEs

With some simple Python commands, it is straightforward to access public resources and use leaked credentials to download content or perform reconnaissance. Unfortunately, many users and organizations do not properly configure their resources and cloud environments, and malicious actors know how to find these misconfigurations and scrape them with scripts.

Easy subscription and quick code execution can appeal to attackers looking to cloak their identity. In addition, blue teams are more likely to overlook activity from the Amazon AWS address range than activity from less familiar services. Online IDEs provide these advantages, and most are great for basic Python commands, which do not require third-party modules. Usually, a user must install external modules with pip install or import them from external files. Neither is possible in a simple editor, as most online editors are basic and look something like this:

Source: https://www.programiz.com/python-programming/online-compiler/

Two high-quality examples of advanced online Python IDEs are DataCamp and Binder. They give the user more abilities than ordinary online IDEs because they allow files to be uploaded using a terminal.

Screen captures illustrate differences between Jupyter and DataCamp editors.

If you are unfamiliar with DataCamp, it is an online platform for learning data science. In DataCamp, one can find online courses and an advanced Python IDE. In addition, the platform allows users to start and follow code exercises and perform advanced commands quickly.

Users can use existing datasets, recipes, and templates or create blank workspaces. A Jupyter lab IDE opens when a new workspace is made.

Attack? Is it possible?

One of the usage examples on DataCamp’s website demonstrates how to connect to a PostgreSQL server.

So, if it is possible to connect to an SQL server in the wild, why wouldn’t it be possible to connect to any other service? What about cloud services or an AWS S3 bucket?

The terminal allowed us to install third-party modules that helped us interact with external resources.

To access AWS resources, it is necessary to install boto3:

Our next step was to connect to an S3 bucket to list and download all files from it:

We then downloaded these files to the workspace environment on the website.
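As a rough illustration of the kind of code involved, the sketch below lists every object in a bucket and downloads it with boto3. It is not the code used in the research. The function names `make_client` and `download_bucket` and the destination directory are hypothetical, and the sketch assumes credentials are already available through boto3's standard configuration and that the bucket is one you are authorized to access.

```python
import os


def make_client():
    """Create an S3 client; boto3 is imported lazily so the helper below stays testable."""
    import boto3  # third-party: pip install boto3
    return boto3.client("s3")


def download_bucket(s3_client, bucket, dest_dir):
    """List every object in `bucket` and download it under `dest_dir`.

    Returns the list of object keys that were downloaded.
    """
    downloaded = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page (or the bucket) has no objects.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            local_path = os.path.join(dest_dir, key)
            os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(key)
    return downloaded
```

A call such as `download_bucket(make_client(), "test_bucket", "downloads")` would produce one `GetObject` event per file on the bucket owner's side, assuming S3 data event logging is enabled in CloudTrail.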

From the defender side, the CloudTrail log entry for a file download resembles the truncated example below:

"eventSource": "s3.amazonaws.com",
"eventName": "GetObject",
"awsRegion": "eu-west-1",
"sourceIPAddress": "34.192.118.171",
"userAgent": "[Boto3/1.17.22 Python/3.8.10 Linux/5.4.176-91.338.amzn2.x86_64 Botocore/1.20.53 Resource]",
"requestParameters": {
    "bucketName": "test_bucket",
    "Host": "test_bucket.s3.eu-west-1.amazonaws.com",
    "key": "test.jpg"

For a file upload event:

"eventTime": "2022-05-13T17:12:21Z",
"eventSource": "s3.amazonaws.com",
"eventName": "PutObject",
"awsRegion": "eu-west-1",
"sourceIPAddress": "34.192.118.171",
"userAgent": "[Boto3/1.17.22 Python/3.8.10 Linux/5.4.176-91.338.amzn2.x86_64 Botocore/1.20.53 Resource]",
"requestParameters": {
    "bucketName": "test_bucket",
    "Host": "test_bucket.s3.eu-west-1.amazonaws.com",
    "key": "test.jpg"
},

In both cases, it is clear that the user agent is:

[Boto3/1.17.22 Python/3.8.10 Linux/5.4.176-91.338.amzn2.x86_64 Botocore/1.20.53 Resource]

This allows for quick identification of activity from a Python script using the Boto3 SDK.
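One way a defender might automate that check is sketched below: it filters CloudTrail records for S3 object events whose user agent carries a Boto3 token. The function name `boto3_s3_events` and the choice of event names are assumptions made for illustration. Because the user agent is set by the client, a match is a useful triage signal rather than proof.

```python
import re

# Matches the Boto3 token seen in the userAgent field, e.g. "Boto3/1.17.22".
BOTO3_UA = re.compile(r"\bBoto3/(\d+(?:\.\d+)*)")

# Object-level S3 events of interest; extend as needed.
S3_DATA_EVENTS = {"GetObject", "PutObject"}


def boto3_s3_events(records):
    """Yield (eventName, sourceIPAddress, boto3_version) for S3 object events made with Boto3."""
    for rec in records:
        if rec.get("eventSource") != "s3.amazonaws.com":
            continue
        if rec.get("eventName") not in S3_DATA_EVENTS:
            continue
        match = BOTO3_UA.search(rec.get("userAgent", ""))
        if match:
            yield rec["eventName"], rec.get("sourceIPAddress"), match.group(1)
```

The input is assumed to be the list found under the `Records` key of a CloudTrail log file, already parsed with `json.load`.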

In addition, a simple Google search for the source IP address shows that the traffic comes from an Amazon EC2 instance.
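The same lookup can be done programmatically, since AWS publishes its address ranges as a JSON file. The sketch below checks an address against that file's `prefixes` list. The function names are hypothetical, and the sample data in the usage note is illustrative only, not an authoritative copy of the published ranges.

```python
import ipaddress
import json
import urllib.request

# AWS publishes its current address ranges at this URL.
AWS_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_aws_ranges(url=AWS_RANGES_URL):
    """Download and parse the published AWS ranges (network access required)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def match_aws_prefixes(ip, ranges):
    """Return (prefix, region, service) for every IPv4 prefix containing `ip`."""
    addr = ipaddress.ip_address(ip)
    hits = []
    for entry in ranges.get("prefixes", []):
        if addr in ipaddress.ip_network(entry["ip_prefix"]):
            hits.append((entry["ip_prefix"], entry["region"], entry["service"]))
    return hits
```

For example, `match_aws_prefixes("34.192.118.171", fetch_aws_ranges())` would return the matching entries, if any. A match only says the address belongs to AWS; it does not say which AWS customer was using it, which is exactly the attribution gap discussed here.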

Since DataCamp uses AWS, this kind of activity can go undetected by some blue teams. Even those who inspect the connection further would hit a dead end, because there is no known definitive source listing DataCamp’s IP range.

These are basic examples of attacks that could be performed, but other scenarios, such as abusing the GitHub API, the Azure API, or any other online resource, are open to the same method.

What else?

We wanted to take a step further and see how far we could push these IDEs. So the next step was to try to import or install known network attack tools.

To simplify this step, we picked Nmap (https://nmap.org/) as our weapon of choice.

Before jumping into Nmap, we checked whether DataCamp limits any socket operations. To test this, we used a straightforward network scanning script written in Python.

The code worked without limitation.
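A script of this kind can be very small. The following is a minimal sketch of a TCP connect check, not the script used in the research; the function name `open_ports` and the timeout value are assumptions. It should only be run against hosts you are authorized to test.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

If an environment blocked outbound sockets, every call to `connect_ex` would return a nonzero error code and the result would be an empty list, which is what makes a script like this a quick test of egress restrictions.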

Nmap

Since DataCamp does not support apt-get commands, Nmap cannot be installed from a package, so we tried to compile it directly from source.

It is simple to compile Nmap on DataCamp, but due to permission limitations, we were not able to install it. However, we were able to execute the binary from the compilation directory.

Here is a simple execution of Nmap on DataCamp:

This demonstrates how easy it is to bring or compile any binary on DataCamp and use it for malicious purposes.

Malware hosting

As part of our examination, we wanted to check whether files uploaded to DataCamp are scanned for malware or viruses. To evaluate this, we used the EICAR Standard Anti-Virus Test File (www.eicar.org).

As previously noted, it is possible to upload files. This applies to any file, including malware, because no malware scanning takes place.

In a harmless example, we were able to show that the EICAR test file was uploaded successfully:

Since the file is now hosted on DataCamp, we wondered whether DataCamp could be used as a platform for malware distribution. Could there be a way to get a shareable link to download the file? Indeed there is:

The download link could be used by malware to download further stages onto an infected system using a simple web request. Hosting malware on third-party services is a common technique and has been seen with services such as Discord, GitHub, Pastebin, and others.

Not only DataCamp

We do not believe that DataCamp is the only IDE of its kind, and we do not mean to target or defame them. However, they are well known in the field and provide more capabilities and advanced services than their peers. Below you can find a table comparing other online IDEs we came across during our research.

Not only Python

Online IDEs for various languages have existed for a while, though only recently have their abilities expanded notably. In the past, only simple editors were available; now, advanced editors are appearing. We’ve seen examples in Python, but other languages are available, and all of the above applies to them. Paid plans expand the variety of advanced online IDEs, but paying makes the process a bit more complicated for attackers because it requires extra steps to cover their tracks.

Recommendations

Online IDEs can be used to launch many kinds of attacks, and threat actors who target cloud infrastructure are likely to find them especially useful. We highly recommend reviewing our “From the Trenches: Common-Sense Measures to Prevent Cloud Incidents” blog post, which offers a deep dive into similar attacks and how to prevent them.

Though it’s not currently possible to map out the full extent of DataCamp’s IP range, if you find malicious attacks against your infrastructure that originate from an AWS-owned IP address, we highly recommend reporting them to the AWS Abuse team. A detailed description of how to submit a report can be found here.

Profero also encourages DataCamp and other providers to keep a publicly accessible list of their outgoing customer traffic gateways and provide a safe and easy way for users to submit abuse reports.