Web Scraping MQL5 Signals With Python
write an easy python web scraping app from scratch
What is MQL5?
In a few words, it's a platform that centralizes signal providers: people who supposedly know how to trade the forex market and sell their calls to people who have the money to buy them. If the signal provider is indeed a great player, he's going to have great performance, and so will you.
In case you want more information, you can check it here.
The application: what is it going to do?
It's going to get the information from the website and write it to a csv file.
After that, you're free to upload that csv to Google Sheets or Excel to apply filters and find the signal that fits you best, given that MQL5 doesn't provide that functionality.
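You don't even need a spreadsheet for simple filters: once the csv exists, the standard library can filter it directly. A minimal sketch, assuming the column order we write later in this article (so "Weeks" is column index 6) and that the scraped values are plain digits; the function name and threshold are just illustrative:

```python
import csv

def filter_signals(path, min_weeks=52):
    # Keep only signals with at least `min_weeks` weeks of history.
    # Column 6 is "Weeks" in the csv this article produces.
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        return [row for row in reader
                if row[6].isdigit() and int(row[6]) >= min_weeks]
```

The same pattern works for any column: read the rows back, convert the cell you care about, and keep the ones that pass your test.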
Python
First things first, you need to have at least Python 3.7 installed. You can check this guide if you don’t feel comfortable doing it all by yourself.
Using global dependencies over your python projects is not a good practice.
Imagine having several projects depending on the same files — in our case the dependencies. Whenever one of those dependencies is updated it could break all your other projects.
So instead, create a python virtual environment; this way, all the dependencies will live locally in the project's folder.
To do this, go to your terminal, cd to the project’s root folder, and type the code below.
python3 -m venv env
The virtual environment named env is created. Now let’s activate it!
source env/bin/activate
By doing this, your terminal prompt should have an (env) prefix, indicating that you're now inside the virtual environment. (On Windows, run env\Scripts\activate instead of the source command.)
Let’s add some dependencies for the project.
pip install requests
pip install beautifulsoup4
Now we’re good!
Implementation
First of all, let's create our csv file. To do that, we write the code below.
import csv

filename = 'signals.csv'
f = csv.writer(open(filename, 'w', newline=''))
Now that we have created the file, let’s write our columns’ titles.
f.writerow(['Name', 'Price', 'Growth', 'Monthly Growth', 'Subs', 'Funds', 'Weeks', 'Trades', 'Winrate', 'DD'])
Great! Now we just need to find the information inside the web page and write the next row in the same order as seen above.
If you visit the website, you'll see something like this. All the information we need is there; we just need to extract it.
Let's now import requests to make an HTTP GET request to the website. Also, import BeautifulSoup to help us find information in the HTML.
import requests
from bs4 import BeautifulSoup
Learn more about BeautifulSoup here.
Alright. Time to make the get request and make our soup!
html = requests.get('https://www.mql5.com/en/signals/mt5/list/')
soup = BeautifulSoup(html.text, 'html.parser')
The html.text property holds all the HTML from the GET response, and we construct a BeautifulSoup object out of it so we can search for data in that response.
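To see what this buys us, here's a tiny, self-contained illustration of the same pattern on a made-up HTML fragment; the class names just mirror the real page's structure:

```python
from bs4 import BeautifulSoup

# A made-up fragment mimicking the page's structure.
sample = '<div class="signals-table"><div class="row signal">Alpha</div></div>'
soup = BeautifulSoup(sample, 'html.parser')

# find() returns the first element whose class matches.
table = soup.find(class_='signals-table')
first_signal = table.find(class_='row signal')
```

Here first_signal.text evaluates to 'Alpha': once the HTML is parsed into a tree, pulling out a value is just a matter of knowing which class to search for.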
Now, let's go back to the MQL5 website and use the Chrome developer tools to inspect the HTML. Press F12 in Chrome to open the developer tools, then press Cmd + Shift + C (Ctrl + Shift + C on Windows/Linux) to start inspecting by hovering the mouse over the elements of the website.
We are interested here in finding the div that contains all the signals.
By carefully looking into the div hierarchy, we can find the div whose class is “signals-table” and, in fact, it contains all the data we’re looking for.
But how do we translate that into code? Easy peasy… with BeautifulSoup :)
page = soup.find(class_="signals-table")
Now we need to get the information for every row in that table. For that, we’re going to use the inspector tool again to find which div represents the table’s row.
By searching the HTML hierarchy again, we found out that the div that represents the row is the one whose class is “row signal”. Now let’s bring it to the code.
rows = page.findAll(class_="row signal")
Note that we used the findAll method this time, which returns a list of all matching elements. (We name the variable rows rather than list, since list would shadow Python's built-in type.)
Now it’s time to get the name, price, growth, and all the other information from those rows. We will need to iterate over those rows, right? Let’s do it.
page = soup.find(class_="signals-table")
rows = page.findAll(class_="row signal")
for row in rows:
    # extract each signal's data here
Going back to the inspector tool, we’re going to inspect each element (name, price, growth, …) to find out about its properties.
Let’s go for the name first.
It's a span whose class is "name". We want its text. Or in coding words:
name = row.find('span', {'class': 'name'}).text
What about the price?
It's a div whose class is "col-price". We want its text. Or in coding words:
price = row.find('div', {'class': 'col-price'}).text
So it’s basically the same thing for every property. The final code for every property should be like the one below.
name = row.find('span', {'class': 'name'}).text
price = row.find('div', {'class': 'col-price'}).text
growth = row.find('div', {'class': 'col-growth'}).text
monthGrowth = row.find('div', {'class': 'col-growth'}).get('title', 'no title')
subs = row.find('div', {'class': 'col-subscribers'}).text
funds = row.find('div', {'class': 'col-facilities'}).text
weeks = row.find('div', {'class': 'col-weeks'}).text
trades = row.find('div', {'class': 'col-trades'}).text
winrate = row.find('div', {'class': 'col-plus'}).text
dd = row.find('div', {'class': 'col-drawdown'}).text
Now that we've gathered all the information, we write another row to our csv file.
f.writerow([name, price, growth, monthGrowth, subs, funds, weeks, trades, winrate, dd])
Done! Now you can run your code and see the csv file created in your project's root folder.
The code should look like this so far.
This isn’t over though. We have managed to get the information from only one page. Now we have to deal with pagination.
If you scroll down the page, you'll see the pagination section. Access any page and notice that the address now looks like the one below, where N is the page number you're accessing.
https://www.mql5.com/en/signals/mt5/list/pageN
So if we manage to find the last number from pagination, we can loop from 1 to N, and make requests to https://www.mql5.com/en/signals/mt5/list/page appending the index to the address.
That’s right! Let’s find the last number from pagination then. Looking at the inspector, we can see that we have a div whose class is “paginatorEx”. To code:
soup.find(class_="paginatorEx")
Now we just need to find all the “a” elements inside the div and get the text from the last element.
amountOfPages = int(soup.find(class_="paginatorEx").findAll('a')[-1].text)
You can get the last element of a list in Python by accessing index -1.
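A quick illustration with a hypothetical list of page labels:

```python
# Negative indices count back from the end of a sequence, so [-1] is
# always the last element, no matter how long the list is.
# (These labels are made up for the example.)
page_labels = ['1', '2', '3', '47']
amount_of_pages = int(page_labels[-1])
```

This is exactly what the one-liner above does with the pagination links.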
Now we loop through 1 to amountOfPages and search for the data for each page with the code that we have already written.
amountOfPages = int(soup.find(class_="paginatorEx").findAll('a')[-1].text)
for pageNumber in range(1, amountOfPages + 1):  # +1 so the last page is included
    address = 'https://www.mql5.com/en/signals/mt5/list/page' + str(pageNumber)
    requestedHtml = requests.get(address)
    pageSoup = BeautifulSoup(requestedHtml.text, 'html.parser')
    page = pageSoup.find(class_="signals-table")
    rows = page.findAll(class_="row signal")
    for row in rows:
        (...)
This is the full implementation.
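Putting all the pieces together, the whole script sketched under the assumptions made above (the class names and the pageN URL pattern observed in the inspector) looks roughly like this; the helper names are my own:

```python
import csv

import requests
from bs4 import BeautifulSoup

BASE_URL = 'https://www.mql5.com/en/signals/mt5/list/'
COLUMNS = ['Name', 'Price', 'Growth', 'Monthly Growth', 'Subs',
           'Funds', 'Weeks', 'Trades', 'Winrate', 'DD']

def parse_row(row):
    # Each cell lives in a div (or a span, for the name) with a known
    # class; the monthly growth sits in the col-growth div's title.
    return [
        row.find('span', {'class': 'name'}).text,
        row.find('div', {'class': 'col-price'}).text,
        row.find('div', {'class': 'col-growth'}).text,
        row.find('div', {'class': 'col-growth'}).get('title', 'no title'),
        row.find('div', {'class': 'col-subscribers'}).text,
        row.find('div', {'class': 'col-facilities'}).text,
        row.find('div', {'class': 'col-weeks'}).text,
        row.find('div', {'class': 'col-trades'}).text,
        row.find('div', {'class': 'col-plus'}).text,
        row.find('div', {'class': 'col-drawdown'}).text,
    ]

def main():
    # Read the pagination from the first page, then scrape every page.
    first = BeautifulSoup(requests.get(BASE_URL).text, 'html.parser')
    amount_of_pages = int(first.find(class_='paginatorEx').findAll('a')[-1].text)
    with open('signals.csv', 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerow(COLUMNS)
        for page_number in range(1, amount_of_pages + 1):
            html = requests.get(BASE_URL + 'page' + str(page_number)).text
            soup = BeautifulSoup(html, 'html.parser')
            for row in soup.find(class_='signals-table').findAll(class_='row signal'):
                writer.writerow(parse_row(row))

# Call main() to run the full scrape.
```

If MQL5 changes its markup, only parse_row and the two class names in main need to be updated.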
Now you can run your application and see that the csv file has way more information to filter! :)
You can find the code for this sample at this github repo.
Let me know what you think about this article in the comments section.