How to Create Tor Proxies

Jack Paczos
Published in Get that Data!
Jan 30, 2024

This post shows how you can create as many Tor proxies as you want for web scraping.

~ Bear in mind that a large number of proxies will drive up the memory and CPU usage of your machine.

~ Also, you should not make millions of requests per minute, since that would strain the Tor network.

Requirements:

  • Basic understanding of Python
  • Basic understanding of Docker
  • You need to have tor set up. If you don’t know how, here is a guide.
  • You need to have Docker installed as well. Guide.


Steps:

  1. Finding Image
  2. Docker Compose File
  3. Run Containers
  4. Testing + Script


1. Finding Image

The first time I tried this, I followed a post from Datawookie, and it worked perfectly. It is a well-written guide on how to set up Tor proxies with the pickapp image from Docker Hub.

However, it only worked fine while I had 12 or fewer Tor proxies. Beyond that, issues started to appear, and the more containers I created, the worse they got. The containers were still created, but curl requests through them just didn’t go through. I still don’t know with certainty what the issue was; it is likely due to the extensive memory usage of the pickapp image.

[Screenshot: Docker Desktop showing the pickapp proxy containers]

That’s why I recommend using the dockage image instead. It requires far less of your machine’s CPU power and memory.

[Screenshot: Docker Desktop showing the dockage proxy containers]

Still, to use the dockage image you need to have privoxy and tor installed and running.

brew install privoxy
brew services start privoxy

I assume you have tor already set up; if not, you can install and run it the same way as privoxy.

You can run:

brew services list

This should show that both tor and privoxy are running.

Name    Status  User  File
podman  none
privoxy started jacek ~/Library/LaunchAgents/homebrew.mxcl.privoxy.plist
tor     started jacek ~/Library/LaunchAgents/homebrew.mxcl.tor.plist
unbound none


2. Docker Compose File

The Docker Compose file is a convenient way to configure multiple containers in one place. I will go into detail in another blog post.

For the sake of this guide, all you need to know is that our file will look somewhat like this and will make each proxy change its exit node (your outward-facing IP) every 60 seconds.

version: '3'

services:
  tor-0:
    container_name: 'tor-0'
    image: 'dockage/tor-privoxy:latest'
    ports:
      - '25000:8118'
    environment:
      - IP_CHANGE_SECONDS=60
    restart: always
  tor-1:
    container_name: 'tor-1'
    image: 'dockage/tor-privoxy:latest'
    ports:
      - '25001:8118'
    environment:
      - IP_CHANGE_SECONDS=60
    restart: always

We could hardcode this file, but why? Datawookie created a script, which I have edited so that we can create as many dockage-image proxies as we want.

You can just copy this Python script into a file, enter the number of desired proxy containers in the range of the loop, and run the code.

WARNING = "# Generated by create-proxies script.\n\n"

# Generate docker-compose.yml.
#
with open("docker-compose.yml", "w") as f:
    f.write(WARNING)
    f.write("version: '3'\n\nservices:\n")

    for i in range(15):
        f.write(f"  tor-{i}:\n")
        f.write(f"    container_name: 'tor-{i}'\n")
        f.write("    image: 'dockage/tor-privoxy:latest'\n")
        f.write("    ports:\n")
        f.write(f"      - '{25000 + i}:8118'\n")
        f.write("    environment:\n")
        f.write("      - IP_CHANGE_SECONDS=60\n")
        f.write("    restart: always\n")

~ 20–50 proxies should be more than enough for all scraping purposes

After running the Python file (gen.py), a file called docker-compose.yml should appear.
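Before bringing the stack up, it can be worth sanity-checking the port scheme the generator relies on: each proxy i gets host port 25000 + i, all forwarding to privoxy’s port 8118 inside the container, and the host ports must be unique or Docker will refuse to bind them. A minimal sketch of that check (standalone, no Docker needed):

```python
# Sketch: verify the port-allocation scheme used by the generator script.
# Each proxy i gets a unique host port 25000 + i, all forwarding to 8118.
num_proxies = 15

ports = [f"{25000 + i}:8118" for i in range(num_proxies)]
host_ports = [p.split(":")[0] for p in ports]

# Host ports must be unique, or `docker compose up` fails with a bind error.
assert len(set(host_ports)) == num_proxies
print(ports[0], "...", ports[-1])  # 25000:8118 ... 25014:8118
```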


3. Run Containers

By now you should be ready to run the containers.

First, ensure nothing else is running on Docker:

docker ps

Start the containers. Be sure to be in the same directory as the docker-compose file.

docker compose up -d

Now your containers should be running. You can confirm this by running docker ps again or just going to the Docker app.
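If you would rather check the ports programmatically than eyeball docker ps, a plain TCP connect tells you whether each proxy is actually listening. A sketch, assuming the port range from the compose file; `port_open` is my own helper, and the throwaway local listener only exists so the snippet runs without Docker:

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# After `docker compose up -d`, each proxy port should accept connections:
# for p in range(25000, 25015):
#     print(p, port_open("127.0.0.1", p))

# Demo against a throwaway local listener so the sketch runs standalone:
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
demo_port = server.getsockname()[1]
print(port_open("127.0.0.1", demo_port))  # True
server.close()
```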


4. Testing + Script

First, make a curl request to check your IP address:

curl http://httpbin.org/ip
{
  "origin": "192.42.116.175"
}

You can then test each proxy by making a few curl requests to httpbin.org (which returns the IP it sees) through the proxy.

If it shows a different IP than your own in the previous request, you’re most likely good to go.

curl --proxy http://127.0.0.1:25000 http://httpbin.org/ip
{
  "origin": "185.241.208.232"
}

Be sure to only make requests to ports for which you have a container.
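Once the proxies respond, using them from a scraper is mostly a matter of rotating through the ports. A sketch using itertools.cycle; `next_proxies` is my own helper, and the actual requests.get call is shown commented out so the snippet runs without the containers:

```python
from itertools import cycle

# Host ports published by the compose file (one per container).
ports = [25000 + i for i in range(15)]
proxy_pool = cycle(f"http://127.0.0.1:{p}" for p in ports)

def next_proxies():
    """Return a requests-style proxies dict for the next proxy in the pool."""
    proxy = next(proxy_pool)
    return {"http": proxy, "https": proxy}

# Usage (requires the containers to be running):
# requests.get("http://httpbin.org/ip", proxies=next_proxies(), timeout=10)
print(next_proxies()["http"])  # http://127.0.0.1:25000
print(next_proxies()["http"])  # http://127.0.0.1:25001
```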

If you have more than 10 proxies, this testing can get a bit mundane.

That’s why it’s useful to use a script to test these proxies. The one below simply checks whether the string “origin” appears in the returned response. It is by no means perfect.

Still, it is enough for our purpose: if a proxy did not work, the request would raise an error and the string “origin” would not appear.

import requests
from rich import print

num_proxies = int(input("Enter the number of proxies: "))

proxies = []
for i in range(num_proxies):
    proxies.append('http://127.0.0.1:{}'.format(25000 + i))

successful_proxies = []
failed_proxies = []

for proxy in proxies:
    try:
        # Timeout so a dead proxy fails fast instead of hanging the loop.
        response = requests.get('http://httpbin.org/ip',
                                proxies={'http': proxy}, timeout=10)
        if 'origin' in response.json():
            successful_proxies.append(proxy)
        else:
            failed_proxies.append(proxy)
    except requests.exceptions.RequestException:
        failed_proxies.append(proxy)


# Printing the results
print(f"Working Proxies: {len(successful_proxies)} of {num_proxies}")
print("Failed Proxies:")
for proxy in failed_proxies:
    print(proxy)

This can take a few seconds or a few minutes, depending on the number of proxy containers you have created. Sometimes not all proxies work. However, in my experience even the hardest websites can be scraped with 20–50 working Tor proxies.

Enter the number of proxies: 15
Working Proxies: 15 of 15
Failed Proxies:

Hooray! All of our proxies are working!
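Since each check spends most of its time waiting on the network, the test loop above parallelizes well. A sketch with concurrent.futures; `check_all` is my own helper, and the lambda below is a stand-in for the real requests.get check so the structure can run without the containers:

```python
from concurrent.futures import ThreadPoolExecutor

def check_all(proxies, check):
    """Run check(proxy) -> bool for every proxy in parallel."""
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(check, proxies))
    ok = [p for p, r in zip(proxies, results) if r]
    failed = [p for p, r in zip(proxies, results) if not r]
    return ok, failed

# Stand-in check: pretend even-numbered ports work.
proxies = [f"http://127.0.0.1:{25000 + i}" for i in range(4)]
ok, failed = check_all(proxies, lambda p: int(p.rsplit(":", 1)[1]) % 2 == 0)
print(len(ok), len(failed))  # 2 2
```

In the real version, `check` would wrap the requests.get call from the script above and return whether “origin” appeared in the response.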

PS: I recommend always importing print from the rich library; it makes things easier to spot when testing or debugging.
