Hunting the “Automated Pentesting” Unicorn
There seem to be some murmurings around the corporate water cooler at the moment that are somewhat troubling. In the last few weeks I’ve had the misfortune of being invited to a conversation to discuss the idea of “automated pentesting” on three separate occasions. I can only assume that someone, somewhere, who has absolutely no idea what a pentest is, thinks that because we can teach a computer to win a game of Jeopardy, we should somehow abolish one of the last remaining remnants of intelligence from the Information Security industry and replace them with an upgraded coffee machine. My money says that this idea spawned somewhere inside the infested steaming pile of bad ideas known as the RSA Conference.
I want to tackle this idea head-on and present just a few truths that you’re going to have to accept before you consider whether or not this is even remotely possible. I’ll do my best to keep this as simple and non-technical as possible.
For the purposes of this post I’m going to assume that this “automated pentest” machine will be a piece of software…probably Java!
Before I begin, there’s one very important fundamental that non-technical people really need to understand. This is an over-simplified introduction as to how software works.
An application is just a set of instructions, written by a developer, that the system follows to perform calculations. These instructions are normally a collection of functions that make decisions, then react depending on what the answer is. Computers can perform this logic with statements such as “if x happens, do y, if not, do z”. If you push the elevator call button, the controller will send the elevator to you, otherwise, it’ll send the elevator somewhere else. This can rapidly increase in complexity when additional factors are introduced such as “are you going up, or down” or “which car is closest to you” or “which car is the closest to you, that is heading in your desired direction and doesn’t already have 14 people in it.”
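For the curious, here is roughly what that last elevator question looks like once it is written down as instructions. This is only a sketch in Python: the Car fields, the pick_car name and the three example cars are all made up for illustration, and a real controller would weigh far more than this.

```python
from dataclasses import dataclass

MAX_PASSENGERS = 14  # the "14 people" limit from the example above


@dataclass
class Car:
    name: str
    floor: int
    direction: str  # "up", "down" or "idle"
    passengers: int


def pick_car(cars, call_floor, wanted_direction):
    """Return the closest car that is not full and is heading the right way."""
    candidates = []
    for car in cars:
        if car.passengers >= MAX_PASSENGERS:
            continue  # "if the car is full, ignore it"
        if car.direction not in ("idle", wanted_direction):
            continue  # "if it is heading the wrong way, ignore it"
        candidates.append(car)
    if not candidates:
        return None  # nobody can take the call right now
    # "otherwise, send whichever remaining car is closest"
    return min(candidates, key=lambda car: abs(car.floor - call_floor))


cars = [
    Car("A", floor=4, direction="down", passengers=3),  # closest, wrong way
    Car("B", floor=2, direction="up", passengers=14),   # right way, but full
    Car("C", floor=1, direction="up", passengers=5),
]
chosen = pick_car(cars, call_floor=3, wanted_direction="up")
print(chosen.name)
```

Every extra condition is another branch somebody had to think of in advance, which is the point to hold on to for the rest of this post.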
With this little gem of knowledge fresh in your mind, let me share with you something very few people outside of Information Security get to see. I’m going to share with you a very small overview of an average pentest engagement my team conducted last week. The team was tasked with breaking into a Survey system hosted on a web server belonging to one of our clients. We were given no information about the system, no login details, no user manuals, no diagrams, nothing more than the URL. “Can you get in, if so, tell us how to fix it.” This is a very simple web application test on the lower end of the complexity scale.
The engagement begins like any other, discovering the components of the application. The home page welcomes us with a “Foo Inc. Survey System” followed by a login form. There are “username” and “password” fields, a button named “Login” and an ironic picture of a quill on some paper. The URL ends with a “.aspx” extension so it’s a pretty safe assumption that this is running on a Windows machine. A quick banner grab confirms this with “Server: Microsoft-IIS/7.5”. We’re now sure that this is a Windows machine, most likely Windows Server 2008 R2 which gives us some idea of the capabilities and protections we could likely encounter. Looking at the source code of the page we find out it’s running ColdFusion.
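The banner grab is one of the few parts of this story that is genuinely mechanical. A minimal sketch of the idea, assuming we already hold the raw text of an HTTP response: the sample response, the function names and the lookup table below are illustrative, and a banner can be faked or removed by an administrator, so treat the result as a hint rather than proof.

```python
# Assumed lookup table: IIS versions and the Windows releases they shipped with.
IIS_TO_WINDOWS = {
    "7.0": "Windows Server 2008 / Windows Vista",
    "7.5": "Windows Server 2008 R2 / Windows 7",
    "8.0": "Windows Server 2012 / Windows 8",
    "8.5": "Windows Server 2012 R2 / Windows 8.1",
}


def server_banner(raw_response):
    """Pull the value of the Server header out of a raw HTTP response."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "server":
            return value.strip()
    return None


def guess_windows(banner):
    """Map a 'Microsoft-IIS/x.y' banner to a likely Windows release."""
    if not banner or not banner.startswith("Microsoft-IIS/"):
        return None
    version = banner.split("/", 1)[1]
    return IIS_TO_WINDOWS.get(version)


# Canned response for illustration only; no network traffic is involved.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Server: Microsoft-IIS/7.5\r\n"
    "\r\n"
    "<html>...</html>"
)
banner = server_banner(sample)
print(banner, "->", guess_windows(banner))
```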
For those who don’t know what ColdFusion is, it’s the ugly ginger-haired step-child Adobe inherited when it married his hot mom, Macromedia. ColdFusion’s main function has been to provide backdoors into web servers for many years, thanks to its terrible coding and non-existent security practices. If you currently possess the authority to fire people, find anyone who runs ColdFusion in your environment and chase them out the building with a squad of Rottweilers! I digress.
Fortunately for Foo Inc., this ColdFusion deployment is actually up to date (the only one ever) and patched. Looking deeper into the code some interesting looking paths appear, hinting that the system administrator for this machine has read “The Dummies Guide to Securing ColdFusion” which means there is no way to remotely access the back-end of the system without some chronic cross-drive directory traversal bug.
The team grabs a sharp stick and pokes the login box for a few minutes but nothing interesting happens. The only observation we can find is that it is possible to find out which usernames are actually valid as the system responds differently with incorrect usernames. It would be helpful if the team could see what surveys are available but none are listed on the home page. Let’s ask Google!
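That username observation is easier to see in code than in words. The sketch below uses two pretend login functions instead of a real site: leaky_login answers differently for unknown users, uniform_login does not. The account, the messages and every function name are invented for illustration; the real system's wording was not recorded here.

```python
KNOWN_USERS = {"tony": "correct-horse"}  # made-up account for the demo


def leaky_login(username, password):
    """Behaves like the survey system: the error reveals whether the user exists."""
    if username not in KNOWN_USERS:
        return "Unknown user"
    if KNOWN_USERS[username] != password:
        return "Incorrect password"
    return "Welcome"


def uniform_login(username, password):
    """The usual fix: one message for every kind of failure."""
    if KNOWN_USERS.get(username) != password:
        return "Invalid username or password"
    return "Welcome"


def valid_usernames(login, candidates):
    """Flag candidates whose failure response differs from a surely-invalid one."""
    baseline = login("definitely-not-a-user-7f3a", "x")
    return [name for name in candidates if login(name, "x") != baseline]


print(valid_usernames(leaky_login, ["tony", "bruce"]))
print(valid_usernames(uniform_login, ["tony", "bruce"]))
```

In practice the difference is not always a message; it can be a status code, a redirect or a response time, which is part of why a human tends to notice it before a script does.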
Here’s a free tip, when you want to find all publicly listed pages on any website, open Google and type “site:surveys.fooincfakename.com” and all shall be revealed. Pasting the URL of this survey system into Google presented us with some helpful results. Several dozen surveys were discovered as well as the “Administrators Manual” in a handy PDF download. Snatching the Admin Guide, I started trawling through it looking for anything of interest, passwords, registration links, phone numbers, whatever. The smart admins at Foo Inc. were not sloppy enough to leave login details or default passwords in the document, but several screenshots did show the email address of one of the admins in the upper right corner, let’s call him Tony Stark. Trying an incorrect login as Tony, we noted that “email@example.com” was indeed a valid user of the system and his password is not “P@ssword1”. Searching the Pastebins for his password turned up no other results either.
The Google results did disclose something of value in addition to the manual. For each survey there was a parameter in the URL that the system used to identify the survey number. This parameter was “surveyId=3356”. Changing the surveyId value to “3357” displayed a different survey. Trying other numbers in a sequence worked as expected and we now knew this value was most likely being used by the system to retrieve a survey from the database. The team threw a bunch of SQL Injection escape characters…sorry, non-technical…”evil database breaking code” at the parameter but they had no effect. This parameter did not appear vulnerable so we had to keep looking.
We continued cycling through the numbers until something interesting happened. When we typed in a survey number that did not exist, the URL got bigger, way bigger! Now the URL looked like fooincfakename.com/surveys/survey.aspx?surveyId=9999&languageId=1&testing=0&response=main&preview=1. This gave us some new parameters to give to the evil database breaking code. We set the evil hacker code scanner against the machine and several minutes later, the words “The parameter ‘preview’ appears to be injectable” appeared on the screen.
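Spotting the extra parameters in that longer URL is simple to script once you know to look for them. A small sketch using Python's standard urllib.parse module; the new_parameters helper is a hypothetical name, and the URL is the one quoted above.

```python
from urllib.parse import parse_qs, urlsplit


def new_parameters(url, already_known):
    """List query-string parameter names we have not seen before."""
    query = urlsplit(url).query
    params = parse_qs(query, keep_blank_values=True)
    return [name for name in params if name not in already_known]


error_url = (
    "fooincfakename.com/surveys/survey.aspx"
    "?surveyId=9999&languageId=1&testing=0&response=main&preview=1"
)
print(new_parameters(error_url, {"surveyId"}))
```

The hard part was never the parsing. It was realising that asking for a survey that does not exist would make the application show its hand.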
Finding SQL Injection on a Windows box in 2016 is like winning $100 off a $2 scratchy card, satisfying, but not worthy of smooching the lady behind the counter. SQL Injection is a weakness which allows a troubled Russian teenager to speak directly to the database behind the web application. It can yield a lot of juicy information about the application, however, on Windows machines it’s very unlikely that it’ll give you exactly what you want without an expensive dinner and a movie first. To make matters more complicated, this injection was ‘Boolean-based blind’, meaning it does eventually give you the information you requested, but it’s painfully slow. It’s comparable to when you ask your SO “What’s wrong?” and they respond with “Nothing!” You step methodically through a list of things that you possibly did wrong, each one met with a “No” until eventually you get silence and then you know what they’re mad at you about. Blind SQL Injection is just like that.
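To show why “blind” means “slow”, here is a toy version of the guessing game. Nothing in it touches a network or a database: the Oracle class is a stand-in I made up that only ever answers yes or no, the way the injectable page did, and it counts how many questions it gets asked. The halving strategy (binary search) is one common way to keep the question count down; I am not claiming it is exactly what our scanner did.

```python
SECRET = "FooInc360ClientSurvey"  # stands in for a value only the database knows


class Oracle:
    """Simulated target: answers only True or False and counts the questions."""

    def __init__(self, secret):
        self._secret = secret
        self.questions = 0

    def length_greater_than(self, n):
        self.questions += 1
        return len(self._secret) > n

    def char_greater_than(self, position, code):
        self.questions += 1
        return ord(self._secret[position]) > code


def binary_search(is_greater_than, low, high):
    """Find a hidden integer in [low, high] using only 'is it > n?' questions."""
    while low < high:
        mid = (low + high) // 2
        if is_greater_than(mid):
            low = mid + 1
        else:
            high = mid
    return low


def extract(oracle, max_length=64):
    """Recover the secret one character at a time (printable ASCII assumed)."""
    length = binary_search(oracle.length_greater_than, 0, max_length)
    chars = []
    for position in range(length):
        code = binary_search(
            lambda n: oracle.char_greater_than(position, n), 32, 126
        )
        chars.append(chr(code))
    return "".join(chars)


oracle = Oracle(SECRET)
recovered = extract(oracle)
print(recovered, "took", oracle.questions, "yes/no questions")
```

Each of those questions is a full web request in real life, and that is for a single short name. Multiply it by every table, column and row you care about and the slowness becomes obvious.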
With this newly discovered access to the Foo Inc. database, it was time to start snooping around. SQL Injection lets you play 20 questions with the database so we started the game:
What is the current database I’m in? FooInc360ClientSurvey.
What user are we running as in the database? ColdFusion.
Will you run Windows commands for me? No! Get Out! Creep!
How many tables are there in this database? 481. (That’s way too many to try and extract blindly, it would literally take decades, which is expensive)
Is there a table called “Users”? No.
Is there a table called “Logins”? No.
Is there a table called “Passwords”? No.
Is there a table called anything resembling the word Users, Logins or Passwords? Lol, No.
Things are not going well and I’m seriously questioning if this is even a real database. We were trying to locate a table that contains all the user accounts who are allowed to log into the Survey system from the home page. Normal developers make this easy by naming these tables “Users”, however, in this case they decided that would be way too easy for us. We needed to find another way to locate the table that contained the list of login details.
That’s when I remembered my good friend Tony Stark, from the Admin Manual.
How many cells contain the contents “firstname.lastname@example.org”? 5.
Ah ha, now we’re getting somewhere.
What are the 5 column names of those values? “360usernm” & “360email”.
What tables contain Tony’s email and have the column “360usernm”? “FooInc360Usrs”.
Show me the columns in table “FooInc360Usrs”.
The database spits out a bunch of column names and one that looks like it might contain passwords, “360psswd”.
Show me the value of “360psswd” in the same row as Tony’s email. The database responds with “P@ssword2”
Damn, so close, oh well, we have his password now, and it wasn’t even encrypted. We log into the system as Tony and start looking around. Ideally we’re looking for somewhere to upload or download files which will enable us to start learning more about the system or begin running commands. Several hours later and the closest thing we have is in the “Template Manager” which allows you to change the icon in the top left of a survey template.
The team initially overlooked this as the system would reject anything uploaded which was not an image file, until we noticed something odd. It wasn’t possible to upload non-image files, but it was possible to upload image files that don’t contain a recognised extension. JPEG, GIF and PNG are all accepted file types, however if we uploaded a JPEG but renamed it to “logo.jpg.aspx” the image would upload just fine. This was going to be our way onto the machine.
File types can usually be identified by a few bytes in the beginning (header) and the end (footer) of a file, with the middle generally containing the data. We opened an image in a text editor, pasted some of our code into the middle of the file, saved and re-uploaded it to the server. “File upload successful.” Awesome, the image seemed to upload fine, it just didn’t display the actual image because we broke it after shoving our own code into it. A prompt right-click and “Open image in new tab” later, we could see a box asking us to enter in a command. We mashed the “ipconfig” command into the box and it happily spat out the system’s network configuration to the screen. We now had code execution on the server and could escalate the attack to the next level.
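Here is a sketch of why that upload slipped through. It assumes the server only looked at the first and last few bytes of the file, which matches the behaviour we saw but is my guess at the cause, not something we confirmed in their code. The function names and the harmless placeholder payload are invented for illustration.

```python
import os

JPEG_HEADER = b"\xff\xd8\xff"  # bytes every JPEG starts with
JPEG_FOOTER = b"\xff\xd9"      # bytes every JPEG ends with
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def looks_like_jpeg(data):
    """Check only the header and footer, ignoring everything in between."""
    return data.startswith(JPEG_HEADER) and data.endswith(JPEG_FOOTER)


def has_allowed_extension(filename):
    """Only the final extension decides how the web server treats the file."""
    return os.path.splitext(filename)[1].lower() in ALLOWED_EXTENSIONS


def naive_check(filename, data):
    """Assumed behaviour of the vulnerable upload: content check only."""
    return looks_like_jpeg(data)


def stricter_check(filename, data):
    """Content check plus an allowlist on the final extension."""
    return looks_like_jpeg(data) and has_allowed_extension(filename)


# A "JPEG" with something else stuffed into the middle (harmless placeholder).
tampered = JPEG_HEADER + b"\xe0" + b"<placeholder for injected code>" + JPEG_FOOTER

print(naive_check("logo.jpg.aspx", tampered))
print(stricter_check("logo.jpg.aspx", tampered))
```

The stricter check is still only part of a fix. Re-encoding uploaded images and storing them somewhere the server will never execute them closes the gap more reliably than any filename rule.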
Looking back through the above wall of text, something becomes immediately apparent. Technical people and developers would have picked up on some of the challenges, but let me break it down and see what we can start to code into the new “Asimo Hacker 9000 Automated Pentest Machine”.
Step 1: Find the SQL Injection bug.
None of the client-provided information about the site revealed any links to vulnerable pages nor any of the parameters. We had to utilise search engines to find anything beyond the front page. Once we had identified some content, we had to generate an error to get to the error page and then test each parameter individually until we found one that had a flaw. To automate this step alone, while not impossible, is massively improbable and will be fraught with false positives, re-scans, search engine blacklisting and months’ worth of scanning on each and every parameter.
Step 2: Find the table that had user credentials.
This step is normally very easy and has largely been automated in the past. The wildcard here is that developers can name their tables anything they want. When we checked this database against the top 3000 most common table names we got 4 hits. It wasn’t until we fed in the email address from a screenshot into a custom database query that we got a result, 5 results actually. Filtering those down to the correct table was yet another step. This was then followed by hunting for the column which contains the passwords. I’ll concede that this step might be possible to automate, but it’ll be incredibly unreliable and the time taken to sift through that much blind data isn’t feasible within modern testing timeframes.
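The difference between the two approaches in this step can be sketched against a throwaway database. Everything below is an assumption for illustration: it uses an in-memory SQLite database rather than the client's real database engine, direct queries rather than blind injection, a five-word wordlist instead of 3000, and a made-up email address. Only the odd table and column names come from the engagement.

```python
import sqlite3

COMMON_TABLE_NAMES = ["users", "logins", "passwords", "accounts", "members"]
KNOWN_EMAIL = "tony.stark@example.test"  # made-up placeholder address


def build_demo_db():
    """Create a tiny database that mimics the awkward naming we ran into."""
    db = sqlite3.connect(":memory:")
    db.execute(
        'CREATE TABLE "FooInc360Usrs" '
        '("360usernm" TEXT, "360email" TEXT, "360psswd" TEXT)'
    )
    db.execute('CREATE TABLE "SurveyResults" ("answer" TEXT, "score" INTEGER)')
    db.execute(
        'INSERT INTO "FooInc360Usrs" VALUES (?, ?, ?)',
        (KNOWN_EMAIL, KNOWN_EMAIL, "not-a-real-password"),
    )
    db.execute('INSERT INTO "SurveyResults" VALUES (?, ?)', ("fine", 4))
    return db


def table_names(db):
    rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def wordlist_hits(db, wordlist):
    """The automated approach: compare table names against a list of guesses."""
    wanted = {word.lower() for word in wordlist}
    return [name for name in table_names(db) if name.lower() in wanted]


def columns_containing(db, value):
    """The human approach: look for a value we already know, wherever it lives."""
    hits = []
    for table in table_names(db):
        for column in db.execute(f'PRAGMA table_info("{table}")'):
            name = column[1]  # second field of table_info is the column name
            query = f'SELECT COUNT(*) FROM "{table}" WHERE "{name}" = ?'
            if db.execute(query, (value,)).fetchone()[0] > 0:
                hits.append((table, name))
    return hits


db = build_demo_db()
print(wordlist_hits(db, COMMON_TABLE_NAMES))
print(columns_containing(db, KNOWN_EMAIL))
```

The second function is trivial to write. Knowing that a screenshot in a PDF manual held the value worth searching for is the bit nobody has scripted yet.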
Step 3: Upload the shell.
Once you have found credentials and logged into the system, finding a location to upload anything is trivial. Most tools used by testers nowadays can discover every page on a site within minutes and modern attack tools will point out all the pages which allow file uploads. Discovering what restrictions apply to uploads is a little more challenging but still well within the realm of automation. These steps have already been scripted into tools like Burp, ZAP and a few others.
In certain controlled environments, it may be possible to automate all the steps outlined above, but this is still nowhere near to automating pentesting. It’s purely automating one tiny step of a much larger project. Consider that this is only one possible technique of one small part of one system of a dozen targets from one single engagement. The sheer number of options that a tester has, at each step, with information gathered, in a relevant order that leads to a successful compromise is just biblical.
Additionally, pentesters use information gathered from multiple sources, chaining together several seemingly harmless weaknesses to construct a sledge hammer of an exploit with which to beat the target into submission. This is the ‘art’ of pentesting. It’s not purely technical, it’s not a “do this, do that” methodology, it’s a way of thinking, an in-depth understanding of the intricacies happening behind the scenes.
Discussing “automated pentesting” in 2016 is within the same realm as talking about Artificial Intelligence. AI is something the best minds in the world have been working on for decades and we’re still no closer than a chess match and Cleverbot. It’s theoretically possible that eventually, it may one day become possible.
This industry is made up of exceptionally smart people, many of whom are consistently outsmarting the best countermeasures designed by the largest organisations. I genuinely think a lot of people have forgotten that. Reading research papers, blogs, attending conferences or looking at the code in kernel exploits, there is a level of comprehension so far beyond what most people are able to understand.
With so many security researchers out there building tools, writing scripts and compiling password lists, the speed and accuracy at which a modern pentester can get through an engagement is several orders of magnitude faster than just a few years ago. Trust me, if there is a faster, easier way discovered to do anything related to security testing, it immediately gets built, distributed and continuously improved upon by the creator and the community.
There really is no point asking if pentesting can be automated, because it already has been. What you’re asking for is whether skill, knowledge and intelligence can be automated, and that’s just not yet possible.