Automating and Scraping Like a Growth Hacker

Julien Le Coupanec
Published in Hackisition
Feb 5, 2020 · 14 min read


If I had to name one unfair advantage that a growth hacker has to master, the capacity to automate processes would be among my first thoughts. Growth hackers have the imagination to design experiments that they successively test and improve; automating the ones that have been validated enables them to grow faster and more efficiently.

For that, there are two methods: using automation tools (iMacros, Phantom, Murgaa Recorder…) or, when human intelligence is needed, calling on Mechanical Turk or virtual assistants on Upwork. Of course, do not forget to keep a playbook up to date.

Automation is a genuine mindset that can sometimes make all the difference between one entrepreneur’s execution and another’s. To illustrate this point, my favorite anecdote is about Neil Patel, the founder of KissMetrics, Quicksprout and CrazyEgg. Neil Patel is probably one of the best figures to follow on the topic of content marketing. Given that content strategies are based on the 10% writing/90% distribution rule, I kept asking myself what these entrepreneurs had for breakfast to be so productive and write a quality article every day (sometimes even with infographics) on each of their sites. After much research, I concluded: “Ghost writing is the way to go”. Over time, Neil Patel had built an extensive network of professional ghost writers he found through TextBroker, Upwork and Probloggers. Of course, he’s the one managing the editorial line and determining the relevant subjects, but the content creation (which is very time consuming) is entirely outsourced. Relying on his data, he knew the exact amount he needed to put on the table for each article. It was on this model, combined with intelligent SEO strategies and great social media management, that Neil Patel managed to automate his content strategy and benefit from profitable organic growth.

“ The main thing to understand is that successfully automating an experiment allows you both to focus on the new strategy and to optimize your most essential resource: your time. ”

#1. Detecting a Growth Driver.

There is an increasing number of ways to attract a user’s attention. Our world is undergoing a permanent transformation: the digital realm continually connects people to each other. While the most popular platforms keep growing (today Facebook has over 1.4 billion active users each month), new platforms come out each year and grow at exponential speeds. Just five years ago, there was no Pinterest, no Vine and definitely no Snapchat.

“The stakes are huge because of “superplatforms” giving access to 100M+ consumers.”

— Andrew Chen, Supply Growth at Uber

A growth hacker is fully aware of this situation, and this is where the enthusiasm for the discipline comes from: we feed on all this information, and the beauty of it is that a large majority is publicly accessible. The ability to retrieve, aggregate and analyze this data is a fantastic power that allows a growth hacker to target potential clients in an extremely relevant manner. There is a multitude of ways to find out whether somebody is a Lamborghini enthusiast: a like on Facebook (which you can retrieve from the Facebook Graph), a follow on Twitter, a pin of the Murciélago on a Pinterest board, or the use of the #Lamborghini hashtag on Instagram. The list is long, and one thing is for sure: it will get longer and longer in the future. We definitely live in an exciting time.

“ Now, we could stop there, but you’d only be left with one part of the equation. While it’s true that those resources allow us to easily create prospect groups, the strongest ability of a growth hacker is putting all these resources together and extracting the most value possible. In my experience, when I look at potential growth drivers, I always act with the same mindset: target, interact and, finally, drive. ”

Remember this: in growth hacking, when you’re trying to benefit from the growth of another platform, the priority is still to target intelligently. And that’s the first block on our diagram: having a well-defined target, someone that you can activate on your application. The goal here is to determine what these external networks have to offer and how you can segment them in order to focus on individual profiles who might be interested in what you are doing. In other words, the output may be a list of first and last names, accompanied by other important personal information (like the city, job, current business, a Twitter handle…).
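To make the idea of a well-defined target list more tangible, here is a minimal Python sketch of a prospect record and a segmentation filter. The `Prospect` fields mirror the attributes listed above; the class name, the `segment` helper and the sample values are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prospect:
    """One row of the target list (field names are illustrative)."""
    first_name: str
    last_name: str
    city: str
    job: str
    company: str
    twitter_handle: Optional[str] = None


def segment(prospects: List[Prospect],
            city: Optional[str] = None,
            job_keyword: Optional[str] = None) -> List[Prospect]:
    """Keep only the prospects matching every criterion that was provided."""
    selected = []
    for p in prospects:
        if city is not None and p.city.lower() != city.lower():
            continue
        if job_keyword is not None and job_keyword.lower() not in p.job.lower():
            continue
        selected.append(p)
    return selected


if __name__ == "__main__":
    people = [
        Prospect("Ada", "Lovelace", "London", "Software Engineer", "Example Ltd", "@ada"),
        Prospect("Jean", "Dupont", "Paris", "Sales Manager", "Exemple SA"),
    ]
    for p in segment(people, city="paris", job_keyword="sales"):
        print(p.first_name, p.last_name)
```

Keeping the list in a structured form like this makes it easier to try several segments against the same raw data before deciding which one deserves an interaction.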

Once this list of prospects is well-defined, the main focus lies in making yourself stand out during the interactions you create with each of them. And whoever says interaction says moment and timing. As growth hackers, we can debate the best acquisition channels for hours, but what needs to remain at the heart of our concerns is very often timing. You can’t sell a how-to romance book to somebody who’s been in a relationship for the past five years. Conversely, suppose you learn that a person has just signed up to a dating site after going through a bad breakup: your argument immediately becomes more convincing, and it becomes infinitely simpler to close the sale. The aim is now to get this information as soon as possible (and by all means available). 😉

Timing is always closely related to the performance of an action.

The main thing you need to understand about this notion of interaction is that you must try to make yourself stand out. When you favorite a tweet, include a particular hashtag or visit a LinkedIn profile, your targets automatically receive a notification. It will catch their eye, and some of them will visit you in return. By the same token, by writing a relevant subject line for your email (which must be personalized), you gain clicks, and the content spreads (GrowthGeeks does a pretty good job at this; you should register for their service and check their emailing tactics). Still following this logic of interaction, another hack is to share a Google doc with all the email addresses you have. That way, they receive an email from Google, which works great for open rates: Google is a name people trust to a certain degree, and the message won’t end up in the spam folder since it is sent from Google’s SMTP servers.

OK, so interaction has taken place. Now that we have their attention, it’s time to drive them towards our objective. In the majority of cases, it’s about redirecting them towards our site or making them download our application. Here, it’s a question of being an excellent salesperson, and here lies one of the secrets: a growth hacker needs to have the skills and permanent mindset of a seller. For this, you need to convince and argue (very often in written form, but this may also take place over the phone if you have a cold calling setup, especially for B2B companies) in order to explain why your target needs to look at your solution and should agree to get a demo from you. This is the entire art of copywriting and storytelling. In the examples above, the content of a Google doc, of our Twitter timeline, of our future tweets in the case of a follow, and of our LinkedIn banner are the principal elements of redirection. Obviously, these things, for the most part, do not scale, but they can be enough to get a steady stream of highly targeted users that we will use to iterate on our product.

How I growth hacked a dating website by sticking to this pattern.

As you can see, by following this pattern (target, interact, drive), it’s possible to identify several traction hacks. The main idea is to redirect a first group of quality users (often as a continuous flow) towards your product so that they help you in this quest for value. Train your eye to spot those small elements that indicate whether a specific profile might be a worthy target for your business. Seek out points of interaction to get yourself noticed, and turn them into redirection points towards your product and what you’re doing.

“ But be careful: Keep in mind that you must always manually determine the profitability of an experiment before you begin automating it. Only automate when you’re certain of the impact it will have. ”

This is why it’s crucial to carefully measure and record the results so you can understand and optimize what has worked in the past. And this is the entire reason for adding a fourth block to this diagram: measure. Sometimes, this can be a delicate act, as you work through external platforms which typically don’t allow even the slightest access to their code. On the other hand, a few small tricks give you a first glance at effectiveness: tracking links (bit.ly & co.), getsidekick for cold emailing (tracking open rates and clicks), analyzing referrers in Google Analytics, and using built-in analytics dashboards when available, such as on LinkedIn (Premium), YouTube or Slideshare.
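As one way to picture the tracking-link trick, the Python sketch below tags a destination URL with the standard `utm_source`, `utm_medium` and `utm_campaign` parameters, so that each experiment can be told apart in an analytics tool such as Google Analytics. The `tag_link` helper, the example URL and the campaign name are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def tag_link(url: str, source: str, medium: str, campaign: str) -> str:
    """Return `url` with UTM parameters added, keeping its existing query string."""
    parts = urlparse(url)
    # Preserve any parameters the URL already carries.
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,      # where the click comes from, e.g. "twitter"
        "utm_medium": medium,      # type of channel, e.g. "social" or "email"
        "utm_campaign": campaign,  # name of the experiment being measured
    })
    return urlunparse(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(tag_link("https://example.com/signup?ref=home",
                   "twitter", "social", "favorite-experiment"))
```

Using one campaign name per experiment keeps the results separable when several interaction hacks run at the same time.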

YouTube and its player also indirectly use this pattern as a foundation. It’s through the possibility of easily embedding a video on any page that the world’s second largest search engine has obtained a huge amount of visibility on the web (largely on the back of Myspace). This small feature has played an essential role in the history of YouTube and has been an incredible experiment for effectively growing through pre-existing networks. It allows a service to display its brand on a growing number of web pages and redirect millions of Internet users to itself (mainly through clicks on the video player). About.me also made use of this mindset by encouraging users to add their profile page to their Twitter bio.

#2. The automation tools you need to know (and master).

Automation tools are particularly numerous. As for me, I only use three of them on a daily basis. These three allow me to have the flexibility and agility required for 100% of the situations I come across. I have never been blocked, and I’m still able to automate what I initially had in mind.

#2.1. Murgaa Recorder.

Only available on Mac, Murgaa Recorder allows you to record your mouse movements as well as your keystrokes. You will then have the option to continuously repeat them and, most importantly, at a high speed. This tool is perfect for gaining execution speed without a single line of code and for automating processes when JavaScript prevents you from performing actions with one of the solutions listed below.

If you don’t have a Mac, take a look at AutoIt. Note that, unlike Phantom and iMacros, Murgaa does not allow you to scrape data (or it’s a painful process).

#2.2. iMacros.

iMacros is an extension for Firefox, Chrome and Internet Explorer. This tool enables you to build and automate, in a snap, the vast majority of actions you take every day in your browser. It is the perfect solution for validating assumptions and semi-automating a growth engine to multiply its firepower (the aim also being to collect more data or to perform actions at a larger scale in order to verify a linear impact).

The main drawback of iMacros (which you must remember) is its semi-automation aspect. The overarching problem here is that you need to restart the macro by yourself once it’s completed. But in reality, you have a great alternative with the Enterprise version ($995), which allows you to manipulate iMacros with its API, thus fully automating a process with the help of a script (e.g. Bash).

We will have the opportunity to intensively cover this tool in our upcoming articles. Subscribe to my newsletter at the bottom of the page so you don’t miss a thing.

“ I recommend you use iMacros on Firefox only. This version is better updated, its interfaces are much more intuitive, and you will gain the ability to code your macros in JavaScript (which is useful for programming loops or complicated conditions). It’s also possible to use iMacros without any code, but I recommend learning the language and not being afraid to tweak things around so you can move forward. ”

Also have a look at this cheatsheet to get an overview of the language’s features.

#2.3. Phantom, Casper, Spooky & Nightmare.

Phantom and its additional libraries, which we will introduce later on in this article, allow you to automate and manage a process with great flexibility. You will be able to imagine and plan a series of complex actions, to run several processes in parallel and to have them communicate with each other via a database (MySQL, MongoDB, PostgreSQL, etc.). What Phantom allows you to do in the end is simulate a web browser and control it using an API in JavaScript.
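The idea of several bots coordinating through a database can be sketched as a shared task table. The example below is only an illustration in Python: it swaps in SQLite (from the standard library) for the MySQL, MongoDB or PostgreSQL servers mentioned above so that it stays self-contained, and the table layout and the `init_queue`, `enqueue` and `claim_next` helpers are hypothetical names.

```python
import sqlite3


def init_queue(conn: sqlite3.Connection) -> None:
    """Create the shared task table if it does not exist yet."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            " id INTEGER PRIMARY KEY,"
            " profile_url TEXT NOT NULL,"
            " status TEXT NOT NULL DEFAULT 'pending',"
            " worker TEXT)"
        )


def enqueue(conn: sqlite3.Connection, profile_url: str) -> None:
    """Add one page for a bot to visit."""
    with conn:
        conn.execute("INSERT INTO tasks (profile_url) VALUES (?)", (profile_url,))


def claim_next(conn: sqlite3.Connection, worker: str):
    """Reserve the oldest pending task for `worker`; return (id, url) or None."""
    while True:
        row = conn.execute(
            "SELECT id, profile_url FROM tasks"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:
            # The status check makes the claim safe if another bot was faster.
            cur = conn.execute(
                "UPDATE tasks SET status = 'claimed', worker = ?"
                " WHERE id = ? AND status = 'pending'",
                (worker, row[0]),
            )
        if cur.rowcount == 1:
            return row


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    init_queue(db)
    enqueue(db, "https://example.com/profile/1")
    print(claim_next(db, "bot-1"))
```

Each bot process would call `claim_next` with its own name, so two bots never work on the same page, and the `worker` column records who did what.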

As a general rule, before turning to Phantom, I recommend checking the efficiency of your assumptions beforehand. It’s always quicker to test the impact of an experiment or the limits of a system either manually or with iMacros/Murgaa. Furthermore, using these solutions, you will be able to watch the chain run live on your screen. This will make it easier to realize when you’re about to take an action that you shouldn’t (and get banned).

Here’s a little diagram to conclude this part and represent all of this:

Now let’s take the time to elaborate on Phantom’s various complementary libraries:

Casper and Nightmare allow you to dramatically simplify how you write your bots. What you need to know about Casper is that it adds features on top of Phantom (for example, waiting until a popup is opened before performing an action). This cheatsheet should give you a complete overview of its multitude of functions. If you are already using Casper, you might need to plug in a database and deploy your bots from a Node application. For this, Spooky is the best solution.

Selenium is also an interesting alternative for automating any action you need to take in a web browser. I’ve never really used this one in depth since Phantom meets my needs, but don’t let that stop you from taking a look.

#2.4. A Word on Official APIs.

Before beginning to simulate actions or to scrape data on an external network, first check whether an API for that is already available. Facebook, Twitter, Meetup, GitHub and many others open and maintain APIs that allow you to build applications on top of them. Using those remains the most durable solution for building applications that work over a long period of time.

But as is often the case, restrictions get in the way (such as LinkedIn recently closing the main functions of its API), and Phantom may then be the ultimate weapon to work around the problem.

#3. Find the Limits of a System.

When you need to, there is a very effective way to determine the limits of a system from the outside: brute force. This is about acting on a server until you reach its limits (functional restrictions, account suspension, IP ban). By carefully measuring when the service detects you and gradually winding down, you end up in a situation where you are able to work under the radar. And that’s the whole point of iMacros or Phantom: as you refine and multiply your strike force, you can reliably know that you haven’t exceeded the allowed number of requests.
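The "measure, then gradually wind down" loop can be sketched as a pacing object that lengthens the pause between two actions whenever the service pushes back and shortens it slowly while everything goes through. This is a minimal Python illustration under stated assumptions: the `AdaptiveDelay` class, its default factors and the `perform` callback (which would wrap an iMacros or Phantom action and report whether it was accepted) are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class AdaptiveDelay:
    """Tracks how long to wait between two actions (values in seconds)."""
    delay: float = 1.0
    min_delay: float = 0.5
    max_delay: float = 300.0
    backoff: float = 2.0    # multiply the delay after a rejected action
    recovery: float = 0.9   # multiply the delay after an accepted action

    def record(self, accepted: bool) -> float:
        """Update the delay from the last outcome and return the new value."""
        factor = self.recovery if accepted else self.backoff
        self.delay = min(self.max_delay, max(self.min_delay, self.delay * factor))
        return self.delay


def run_paced(actions: Iterable[str],
              perform: Callable[[str], bool],
              pacer: AdaptiveDelay,
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Run each action, then pause according to how the service reacted."""
    for action in actions:
        accepted = perform(action)
        sleep(pacer.record(accepted))
```

Logging the delay at which rejections stop gives a rough estimate of the limit, which can then be compared with any limits other people have already documented.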

Before measuring these limits, at least consult Google to make sure that someone hasn’t already found them (here are all the limits on Twitter). To get an idea of which excesses to avoid, you can also take a look at the official API documentation and the general terms of service. Now, a very important point to understand is that this little game is not without risk. Scraping or acting through brute force can end with a permanent ban of your IP address. Since prevention is better than cure, your insurance is to use proxies.

“ When you run your tests, the number one rule is to never use your own personal IP. Even if, in my experience, sending an apologetic email was enough to get my IP unblocked and my account reactivated, I noticed that the monitoring algorithms tend to be less tolerant of a reactivated account or IP address. ”

If you’re looking for a private proxy provider that’s a good value for the money, I suggest you take a look at Instant Proxies. For $10 per month, they allow you to obtain 10 private proxies. As for myself, I’m particularly pleased with the bandwidth, and I’ve yet to come across a downtime on the service.

As an alternative, if you do not want to pay, remember that there are thousands of public proxies available on HideMyAss (you can also attempt to go through the Tor network). But sometimes it’s just better to invest in suitable bandwidth. Be advised that SSL isn’t available on all public proxies.
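Once you have a handful of proxies, a simple way to use them is to rotate through the list so that consecutive requests leave from different addresses. The Python sketch below relies only on the standard library (`urllib.request.ProxyHandler` and `build_opener`); the `proxy_openers` helper and the proxy addresses are placeholders, not real endpoints.

```python
import itertools
import urllib.request
from typing import Iterator, List, Tuple


def proxy_openers(proxy_urls: List[str]) -> Iterator[Tuple[str, urllib.request.OpenerDirector]]:
    """Yield (proxy, opener) pairs, cycling through `proxy_urls` forever."""
    for proxy in itertools.cycle(proxy_urls):
        # Route both plain and TLS traffic through the same proxy.
        handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        yield proxy, urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Placeholder addresses: replace them with the proxies from your provider.
    rotation = proxy_openers(["http://10.0.0.1:8080", "http://10.0.0.2:8080"])
    for _ in range(3):
        proxy, opener = next(rotation)
        print("next request would go through", proxy)
        # opener.open("https://example.com/", timeout=10) would send the request.
```

Building the opener per request keeps the sketch short; a longer-running bot would probably cache one opener per proxy instead.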

#4. Where Can One Go with Automation?

Very far. Very, very far. In the introduction, I discussed the case of Neil Patel and his ghost writers. Automation needs to be the focus of your daily thinking. What I automate today is one less thing I need to do tomorrow. The impact is often tenfold: I gain in execution speed, I improve the quality of my service, and I have time to innovate and concentrate on the tasks that matter.

What is fascinating is that it is now possible to automate tasks that need human intelligence. Amazon Mechanical Turk for instance allows you to hire workers who will continually perform your repetitive processes. And tomorrow, with the exponential growth we’re witnessing in algorithms, it will also be possible to automate an increasingly complex range of tasks. Also imagine for a second the benefits and the impact that a database can have when you wish to target a network, carry out a calculation and then interact with another. Target using the Facebook Graph, interact from LinkedIn, and then guess and send an email. The combinations are endless.
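The "guess an email" step usually means generating the most common address patterns from a name and a company domain, then checking which one is valid with a separate tool. Here is a minimal Python sketch of the generation part only; the `candidate_emails` helper and the list of patterns are assumptions chosen for illustration, not an exhaustive or authoritative set.

```python
from typing import List


def candidate_emails(first: str, last: str, domain: str) -> List[str]:
    """Return common address patterns for a person at `domain`, most usual first."""
    f, l, d = first.strip().lower(), last.strip().lower(), domain.strip().lower()
    if not f or not l or not d:
        raise ValueError("first name, last name and domain are all required")
    local_parts = [
        f,               # ada
        f"{f}.{l}",      # ada.lovelace
        f"{f}{l}",       # adalovelace
        f"{f[0]}{l}",    # alovelace
        f"{f[0]}.{l}",   # a.lovelace
        f"{l}.{f}",      # lovelace.ada
    ]
    return [f"{local}@{d}" for local in local_parts]


if __name__ == "__main__":
    for address in candidate_emails("Ada", "Lovelace", "example.com"):
        print(address)
```

Names with accents, spaces or hyphens would need extra normalization, which this sketch deliberately leaves out.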

For those who want to automate the deployment of Phantom robots, the most exciting solution in my opinion is to use a DigitalOcean droplet. For exactly $5 per month, you get access to an Ubuntu instance in the cloud, around which all your processes will revolve. You can then run all of this independently, outside of your local machine.

When you want to iterate on a product, the most important thing is to get a continuous stream of testers. The 3-stage pattern presented above aims to provide a base for reflection on this point. In today’s world, we are constantly connected to the Internet, and aiming to grow through an external network is an effective method to get traction. This traction is invaluable because it allows you to understand what has value and what does not. Obviously, and this is one of the big traps, I cannot stress enough the importance of working on retention and on the engagement around your service above all else. My finding is that we can very quickly end up conducting experiments that are a little too focused on acquisition. Don’t pour water into a bucket that’s condemned to stay empty. Retention is king.

