PHP Dawg by Fabricio Rosa Marques

PHP SSRF Techniques

How to bypass filter_var(), preg_match() and parse_url()

Mar 1, 2018 · 7 min read

A few days ago I read two great papers. The first, “A New Era of SSRF”, covers SSRF across different programming languages. The second, “PHP Wrappers” by Positive Technologies, shows how PHP wrappers can be used to bypass filters and input sanitization in many different ways (both are listed below).

In this article I want to go deep on a few SSRF techniques that you can use against a PHP script that validates its input with filter_var() or preg_match() and then fetches HTTP content using curl, file() or file_get_contents().

According to OWASP:

In a Server-Side Request Forgery (SSRF) attack, the attacker can abuse functionality on the server to read or update internal resources. The attacker can supply or modify a URL which the code running on the server will read or submit data to, and by carefully selecting the URLs, the attacker may be able to read server configuration such as AWS metadata, connect to internal services like HTTP-enabled databases or perform POST requests towards internal services which are not intended to be exposed.

PHP vulnerable code

All my tests were done using PHP 7.0.25 (it may be outdated by the time you read this post, but all the techniques described should still work):

Here is the PHP script that I’ll use for the tests:

As you can see, the script takes a URL from the first argument (in a web application it could come from $_GET or $_POST) and checks it with filter_var() to validate the URL format. If that passes, it parses the URL with parse_url() and uses preg_match() with a regular expression to check that the requested hostname ends with the allowed domain.

If everything is OK, the script makes an HTTP request with curl to fetch the target web page and uses print_r() to show the response body.
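The script itself did not survive in this copy of the article, so what follows is a minimal reconstruction from the description above, not the original code. The domain allowed.example, the helper name validate_target() and the curl flags are placeholders chosen for illustration.

```php
<?php
// Reconstruction sketch of the checker described above (not the original script).
// ASSUMPTIONS: 'allowed.example' stands in for the whitelisted domain and
// validate_target() is a hypothetical helper name.

const ALLOWED_HOST_REGEX = '/allowed\.example$/';

/**
 * Runs the three checks from the article: filter_var(), parse_url(), preg_match().
 * Returns the parsed host on success, or null when the URL is rejected.
 */
function validate_target(string $url)
{
    // 1. Format check only: it says nothing about where the URL points.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    // 2. Split the URL into components; parse_url() does not validate them.
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }

    // 3. Suffix check: the host only has to END with the allowed domain.
    if (preg_match(ALLOWED_HOST_REGEX, $parts['host']) !== 1) {
        return null;
    }

    return $parts['host'];
}

// Command-line entry point, mirroring "URL from the first argument".
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $host = validate_target($argv[1]);
    if ($host === null) {
        echo "Error: invalid URL or host not allowed\n";
        exit(1);
    }
    // Intentionally unsafe, as in the article: the host is handed to a shell.
    exec('curl -s "' . $host . '"', $body);
    print_r($body);
}
```

Saved under a name of your choice, it would be run as php yourscript.php "<url>". Everything that follows in the article is about getting a string through validate_target() that curl then interprets differently.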

Expected behaviour

This PHP script should accept requests for the allowed hostname only and reject all other targets. Let’s give it a try:

So far, so good. The first request, to the allowed host, was accepted and the second one, to a different host, was refused. Security level: 1337+ :)

Bypass URL Validation and Regular Expression

As you can see in my ugly PHP code, the regular expression checks that the requested hostname ends with the allowed domain. This may seem hard to evade, but if you know the URI RFC syntax well, you know that the semicolon and the comma can be your secret weapons for exploiting an SSRF on the remote target.

Many URL schemes reserve certain characters for a special meaning: their appearance in the scheme-specific part of the URL has a designated semantics. If the character corresponding to an octet is reserved in a scheme, the octet must be encoded. The characters “;”, “/”, “?”, “:”, “@”, “=” and “&” may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.

Aside from dot-segments in hierarchical paths, a path segment is considered opaque by the generic syntax. URI producing applications often use the reserved characters allowed in a segment to delimit scheme-specific or dereference-handler-specific subcomponents. For example, the semicolon (“;”) and equals (“=”) reserved characters are often used to delimit parameters and parameter values applicable to that segment. The comma (“,”) reserved character is often used for similar purposes.

For example, one URI producer might use a segment such as name;v=1.1 to indicate a reference to version 1.1 of “name”, whereas another might use a segment such as “name,1.1” to indicate the same. Parameter types may be defined by scheme-specific semantics, but in most cases the syntax of a parameter is specific to the implementation of the URI’s dereferencing algorithm.

If a semicolon is used in the host part of a URL, curl or wget could parse the text before it as the hostname and the rest as the query string. Let’s try:

The filter_var() function can parse many types of URL scheme. As you can see, filter_var() refuses to validate my requested URL with a semicolon in the hostname and “http” as the scheme. But what if I change the scheme from http:// to something else?

Yeah! Both filter_var() and preg_match() are bypassed, but curl still can’t get the page. Why? Let’s try a syntax that keeps the ; from being parsed as part of the hostname, for example by specifying the destination port:

Bingo! As you can see, curl now tries to fetch the page from the host I chose! The same behavior occurs using a comma , instead of a semicolon ;:
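To see where each layer draws the host boundary on your own PHP version, a small harness like the one below can help. It is only a sketch: inspect_url() is a hypothetical helper, allowed.example and evil.example are placeholder domains, and foo is an arbitrary non-HTTP scheme picked for the example. The results for the odd-looking URLs differ between PHP releases, so no output is claimed for them here.

```php
<?php
// Sketch: report how filter_var(), parse_url() and the suffix regex each
// see the same candidate URL. All hosts below are placeholders.

function inspect_url(string $url): array
{
    $parts = parse_url($url);
    $host  = is_array($parts) && isset($parts['host']) ? $parts['host'] : null;

    return [
        'url'          => $url,
        // true when filter_var() accepts the string as a URL
        'valid'        => filter_var($url, FILTER_VALIDATE_URL) !== false,
        // host exactly as parse_url() reports it, or null
        'host'         => $host,
        // true when that host ends with the allowed domain
        'host_allowed' => $host !== null
            && preg_match('/allowed\.example$/', $host) === 1,
    ];
}

$candidates = [
    'http://allowed.example/',                    // baseline
    'http://evil.example;allowed.example/',       // separator in the host, http scheme
    'foo://evil.example;allowed.example/',        // same host, non-HTTP scheme
    'foo://evil.example:80;allowed.example:80/',  // with explicit ports
    'foo://evil.example:80,allowed.example:80/',  // comma instead of semicolon
];

foreach ($candidates as $candidate) {
    print_r(inspect_url($candidate));
}
```

The rows to look for are those where valid and host_allowed are both true even though the string names a host other than the allowed one. The regular expression is anchored only at the end, so whatever precedes the allowed domain inside the parsed host is never examined.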

Elude URL Parsing function and SSRF

parse_url() is a PHP function that parses a URL and returns an associative array containing any of the various components of the URL that are present. This function is not meant to validate the given URL; it only breaks it up into its component parts. Partial URLs are also accepted, and parse_url() tries its best to parse them correctly.
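As a quick illustration of that behaviour, here is parse_url() applied to a complete URL and to a partial one. Both URLs are made up for the example.

```php
<?php
// Sketch: parse_url() splits a URL into components without validating it.
// The URLs are made-up examples.
$parts = parse_url('https://user:secret@host.example:8080/path/page?x=1#top');
print_r($parts);

// A partial URL is accepted too: only the components present are returned.
$partial = parse_url('/path/page?x=1');
print_r($partial);
```

The first call returns the keys scheme, host, port, user, pass, path, query and fragment, with port as an integer. The second returns only path and query.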

One of my favorite techniques for bypassing a regular expression in a scenario like this is to turn part of the string into a shell variable. This works when the result is evaluated by Bash. For example:

With this technique I made Bash expand $google as an empty variable, so curl requests evil<empty>.com. Cool, isn’t it? :)

This happens only in the curl command line. In fact, the hostname parsed by parse_url() is still evil$google.com: the $google variable has not been interpreted yet. Only when the script passes $r['host'] to exec() to make the curl HTTP request does Bash expand it to an empty string.

Obviously, this works only if the PHP script uses exec() or system() to call a system command like curl, wget or something similar.
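The difference between what PHP sees and what the shell sees can be reproduced without touching the network. The sketch below assumes a POSIX shell behind exec() and no environment variable named google. The domain and the foo scheme are placeholders, and echo stands in for curl.

```php
<?php
// Sketch: the same string as seen by parse_url() and by the shell.
// ASSUMPTIONS: placeholder domain, a POSIX shell behind exec(), and no
// environment variable named "google" defined.

// Single quotes, so PHP itself does not interpolate $google.
$url  = 'foo://evil$google.example/';
$host = parse_url($url, PHP_URL_HOST);

echo "parse_url() host: {$host}\n";   // still contains the literal $google

// Double quotes inside the shell command allow variable expansion.
// echo is used instead of curl so nothing is sent over the network.
$output = [];
exec('echo "' . $host . '"', $output);
$seen = $output[0] ?? '(no output)';

echo "shell sees:       {$seen}\n";
```

The regular expression is applied to the first value, while the command actually runs with the second.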

Wrapper data:// and XSS for the win

Here is another PHP example, using file_get_contents() instead of calling curl through system() or exec():

As you can see, file_get_contents() uses the raw argument after validating it with the same checks described before. Let’s try to modify the response body by injecting some text like “I Love PHP”:

Not allowed :( parse_url() sets text as the request host, and the script correctly rejects it with “not allowed host”. Don’t despair! There is something we can do: we can try to “inject” something into the mime-type part of the URI, because in this case PHP doesn’t care about the mime type. Who cares?
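The idea can be sketched as follows, with allowed.example as a placeholder for the whitelisted domain and assuming allow_url_fopen is enabled. The payload is built with base64_encode() so nothing is hard-coded.

```php
<?php
// Sketch: a data: URI whose mime-type slot carries an allowed-looking host.
// ASSUMPTIONS: 'allowed.example' is a placeholder domain and allow_url_fopen
// is enabled.

$payload = base64_encode('I Love PHP');

// Conventional form: parse_url() reports "text" as the host.
$plain   = 'data://text/plain;base64,' . $payload;

// Same data, but the part before the first slash now looks like a hostname.
$spoofed = 'data://allowed.example/plain;base64,' . $payload;

foreach ([$plain, $spoofed] as $url) {
    $host = parse_url($url, PHP_URL_HOST);
    $ok   = preg_match('/allowed\.example$/', (string) $host) === 1;
    echo $url, "\n";
    echo "  host seen by parse_url(): ", $host, "\n";
    echo "  passes suffix check:      ", $ok ? 'yes' : 'no', "\n";
    echo "  file_get_contents():      ", file_get_contents($url), "\n";
}
```

Both URLs decode to the same body. Only the second one gives parse_url() a host that ends with the allowed domain.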

From here to XSS is a piece of cake…

That’s all folks, so long and thanks for all the fish!



Positive Technologies: “PHP Wrappers”
Orange Tsai: “A new era of SSRF”
