GSOC 2017 with Nmap Security Scanner

Before we go any further, I thank each and everyone who helped me in my way to achieve this. But for this document I would like to limit them to my mentors Daniel Miller and Fyodor from Nmap for choosing me over several other applicants from all over the globe and for guiding me through the whole process, providing me invaluable resources at times of need and for being so supportive throughout my internship.

This article is divided into following sub-sections.

  1. Hitting if off with Nmap.
  2. Application period poding.
  3. Official coding starts.
  4. My GSOC codebase. (Open this to directly access my contributions list)
  5. Things I learned through GSOC.

Hitting it off with Nmap

It was in March 2017, I was told by one of my seniors regarding Google Summer of Code. I’m a penetration tester and security enthusiast (Web application security). So, without any delay I started to see the list of organizations which work on security. There were only a total of 6 organizations and 5 of them caught my interest. They were Metasploit, Nmap, Tor, Honeypot and Radare2. I thought of applying for at least 3 of them and started to do my research on each of them. After a day or two, my inner soul started saying, “You were using Nmap for the past 2 years and its now the time for you to show tribute to Nmap.” Without any second thought I made my mind to contribute to Nmap leaving all the other options behind.


Application period coding

I realized that I’m far away from my peers in terms of knowledge with Nmap codebase and other things related to Nmap. Initially I was a bit worried when I went over to the IRC channel(#Nmap) and found that many of them started their GSOC preparation from Dec’16 and I just started 10–15 days before the deadline. Nmap has its own Scripting Engine known as NSE(Nmap Scripting Engine) for writing the scripts which is written in LUA. I never heard of this language before and on top of that I was left with very few days, still I didn’t loose hope and since I’m a quick learner I was able to learn the language faster than anyone can. But just going through a pair of tutorial websites quickly doesn’t mean you are good with the language. So, to convince myself that I’m good at LUA, I developed a small project in LUA within a day. My LUA project automatically converts your Mozilla browser into a hackers tool kit by installing all the required add-ons.

Once I was familiar with LUA, I started to read the Nmap codebase. Nmap isn’t a single tool by itself. It has several other tools like Ncat, Nping, Ndiff and Zenmap. One can contribute to any of these 5 tools through GSOC.

The Nmap developers on IRC channel were very helpful. Once I found myself comfortable with the codebase I checked the issues page and picked an issue which matched my interest and I made my first PR. Later on, I felt that going only through the Github Issue Tracker will not help me to write great scripts. Since I’m familiar with Nmap usage, I knew the pros and cons of the available scripts. While I was doing my pentest during my internship, I needed few scripts and those were missing in Nmap. I felt like implementing them would help other hackers during their pentests. This way I started to contribute to Nmap.

Few of my PR’s were merged with Nmap very quickly. Yayyy…. felt very happy from inside since I made Nmap better. This incident increased my respect towards Open Source and Nmap.

I started being active on the IRC channels and started making PR’s on Github. I submitted my GSOC application two minutes before the deadline. Ufff….. felt like I was just re-born because Google doesn’t extend the deadline at any cost.

The results were compiled in a month, in the meanwhile I got myself familiar with Nmap completely and started contributing. In that one month I got invaluable help from Open Source community and Nmap developers which improved my coding abilities and familiarity with the Nmap codebase.

To be frank, I totally forgot that I applied for GSOC and I was immersed in contributing to Nmap and I felt proud from inside since I was able to help other hackers by improving Nmap source code. I was submerged in contributing to Nmap, I got a call from my senior at around 23:00 on May 4 saying I got selected into Nmap. Hurrah…..!! I was on cloud9 since I got the chance to work with Nmap through Google.


Official coding starts

Me, screwed up with decoding SMB responses (Binary data).

I got myself active on GSOC’17 Whatsapp groups, Facebook groups, Telegram channels, Slack channels, IRC’s and so on. I got a chance to meet the best talent from all over the world. I now have connections all over the globe.

I got some resources from my mentor which I had to go through before I started actual coding. Since I was anxious, without asking my mentor’s permission I took the chance to work on the ideas I submitted in my GSOC proposal which I later on found was one of the biggest mistakes I had done. Read the next two sections to know why.

$ autocomplete feature for nmap

Previously, all the developers have to type the commands completely to run their scripts. There are hundreds of scripts and each script has tens of arguments which no one can ever master. So, what happened was every hacker has to check the script file or documentation for using Nmap. I felt like it would be cool if we can provide a feature which can autocomplete the args upon double tapping [TAB]. I noticed that my mentor has a private repo which provides this functionality but it fails to autocomplete the inner arguments which a script expects the user to provide. So, I thought adding that feature would be cool and I started coding it right away without his permission.

I made a PR #4 which closed the issue #1 as well in the private repo.

$ colored output for nmap

Screenshots of colored output for easy debugging purposes.

There are several options in Nmap for debugging purposes. But these debugging messages can’t be easily identified by a n00b. So, enabling this feature will increase the user readability by making it easier to examine the detailed output along with debug info.

I was very happy thinking that both the above PRs will be merged with Nmap master branch. But sadly, both of them were turned down by my mentor, saying that this autocomplete feature and colored output will work great on Linux(bash) only, and might fail on other shells like zsh, ksh and so on. Most importantly both the above mentioned features will not work on Windows due to lack of support. I felt very bad that I should have discussed about implementing these ideas with my mentor on before hand. So, these PRs were rejected from being merged into the master branch but can be maintained in a private repo which I’m planning to release soon.

Until now I was trying to make enhancements which would help a n00b hacker but then I felt like making something useful for everyone.

$ refactoring http-enum script

The http-enum script is structured in a very old fashion. I had discussions with my mentor for more than 2 weeks regarding the changes to be made. I created a report which explains the shortcomings, new features and the improvements that can be made to optimize the script.

Report is available on Google docs,
https://goo.gl/ubkV2E

Since we were hit with some other important tasks to complete and issues on the tracker were increasing rapidly, we thought of continuing this enhancement later on.

$ added missing ip protocols to netutil.cc

Some important IP protocols were missing from the list in netutil.cc which are being widely used. I added the important, missing protocols to the existing file. Nmap already has a mapping of protocol numbers to names, based on IANA’s assignment registry, namely the nmap-protocols file.

Merged proto2ascii_case and nexthdrtoa functions which returns protocol name. Removed proto2ascii_lowercase function by writing a modular code for one of the existing function.

I wrote a shell script to generate the code from the nmap-protocols file. This will be more efficient than reading and parsing the file at run time and will also make libnetutil not dependent on an external file at run time. This is important for things like Ncat and Nping that are not usually packaged with the nmap-protocols file.

$ fixed issues related to cve-2014–3704 nse script

There was an issue #902 related to malfunctioning of the cve2014–3704 script. That was a very typical issue. When this script was executed against a vulnerable server, a false output was generated. I tried exploiting the vulnerable server using Metasploit, it was successful but when I tried to exploit it using the Nmap script there was a failure. I compared the code of Metasploit and Nmap scripts and both looked like same.

I thought there might be an issue with the way the packets were sent by the POST request in Nmap. I started to intercept the packets of both Metasploit and Nmap separately while exploiting the server using Wireshark. I observed that Nmap packets were crafted in a different format compared to Metasploit. I tried very hard for one complete week trying to figure out why the post data sent by both the tools are different. After a week, I got a hacker on Metasploit IRC channel answering my question and I resolved the issue with sending request.

Later on, I found that Drupal is having two login pages and only one of them is vulnerable and both the URL’s are different by just a single character and I felt bad that I should have observed that in the first place. I changed the end points of the target server in Nmap and everything was in place.

$ enhancement made to cve-2014–3704 nse script

After we attack the server, we have to check for traces of compromised instances but this is not properly done by existing Nmap script. I don’t say its entirely wrong but it just doesn’t work in all cases. For example, if the vulnerable website is present in Spanish or French then the existing script cannot return correct output regarding the vulnerability. I added new conditions which checks for multiple traces of compromise and then confirms the vulnerability.

$ wrote script for OpenWebNet protocol discovery

OpenWebNet Protocol is a communications protocol developed by Bticino since 2000. This protocol is mainly used for Home Automation purposes. This protocol is widely used yet there is a scarcity of good documentation for its usage. There are only 2 websites existing, which helped me in writing this script.

This new script can get the information like IP Address, Net Mask, MAC address, Device Type, Firmware Version, Server Uptime, Date and Time, Kernel Version, Distribution Version from the target. Apart from that it can retrieve number of automated devices, number of lights, number of burglar alarms, number of heating devices and so on.

So, it will be a very useful script during enumeration of home automated appliances as it is capable of fetching so many details from the target.

$ removed redundant parsing functions by making enhancements

Previously two functions from two different libraries were used for parsing the websites. Each of the functions had some extra functionality wrt others but shared code to some extent. I combined them into one function. This particular commit is a good cleanup of redundant code.

$ developed punycode and idna libraries for nmap

The existing Nmap crawler can send requests to domain names only in English.

If you try to crawl websites like “http://點看.com” or “http://योगा.भारत” you get an error saying the URL cannot be decoded.

I created Punycode and IDNA libraries with functions which let you encode and decode these kind of stuff. For example,

“http://點看.com” gets decodes into “http://xn — c1yn36f.com”.

http://योगा.भारत” gets decodes intohttp://xn--31b1c3b9b.xn--h2brj9c”.

Crawlers can now crawl the website by using the decoded URL’s.

Developing these libraries had been very challenging part. I never had the habit of reading technical papers from the first line to the bottom line. I had to read the papers thoroughly to write proper functions.

Completing this task requires high dedication because I had to read papers like UTS #46, RFC 3490, RFC 3491, RFC 3492, RFC 3493, TR#46, TR #9 and then analyze the data provided in all the above technical papers and then create the respective functions.

The real part comes here, I completely coded as mentioned in the technical standards and the code doesn’t work properly as expected. All cases were successful expect one or two and then comes the highest level of depression. I didn’t have the balls to find the bug in the code because I wasn’t ready to go through 1500 lines of code, I had to go through the research papers again and I had to cross check the code with the procedure as mentioned in the papers. Finally, after so much struggle I successfully developed the libraries.

At this point, I was fatigued working only on web related stuff and felt like needing a change of pace. I thought of working on something which has nothing to do with web and following are my achievements in other parts of Nmap.

$ ncat enhancement - limit data using a delimiter

Ncat is a reimplementation of the currently splintered and reasonably unmaintained Netcat family. Ncat can act as either a client or server, using TCP or UDP over IPv4 or IPv6. SSL support is provided for both the client and server mode.

Till now I was working on LUA since the starting of the GSOC and for the first time I was hit with an idea of working with something different apart from Web and LUA. This enhancement was purely implemented on C language.

This will be a good enhancement to the existing Ncat once the PR gets merged. Functions were added to accept delimiter as a parameter through command line for delimiting the data before sending it.

At this stage I was like, my GSOC is about to complete and I haven’t done anything new. I was just getting better at what I already knew. I wanted to learn and contribute more. I wanted to work on something totally new to me.
I selected to work on Windows SMB related stuff. I didn’t even know what SMB meant until this point. I requested my mentor to allow me to work on this new area. I was left with only 10 days before I started to work on SMB.

$ script to fetch smb enum services from remote windows machine

Random capture of response sent by Windows

SMB protcol is used by Windows. I had written script that fetches the list of services running on a remote windows system along with their service status message. This has been the most challenging part in my entire GSOC stage.

I had less than 2 weeks to complete this task and I just started learning the definitions of SMB, CIFS and so on. Once I was done with understanding the definitions, I tried to write the code. But I wasn’t familiar at all with protocols like SMB, DCERPC, SVCCTL and so on. We can’t send a GET/POST request directly as we do on protocols like HTTP/HTTPS. Its a new game.

I found a software created by windows to list the services, psservices.exe. I intercepted the packets sent by psservices.exe. I thought I could easily replicate the requests being sent based on this data.

After seeing the capture I was like WTF is going on. I understood how the requests were being sent but I didn’t know how to code it. I got some useful resources from my mentor, then I figured out a way to establish connection. Finally, wow.. I made the connection and the response looked something like this.

Debugging output which shows a part of request and response sent to the server
Piece of response data received from the Windows server after making a svcctl request

I was so happy as I got the response from server and I thought its almost done.

The response is binary data and unmarshalling or decoding the binary data was very tricky and challenging task in my whole GSOC period.

After a long struggle for two days, the binary data decoded was unmarshalled and it displayed all the services with their service name, display name and service status respectively.

Taking up this very challenging task with a narrow deadline gave me immense pleasure of participating in GSOC and I did learn many new things related to Windows protocol and decoding binary data.


My GSOC codebase

This section contains the links to the work I have done so far through GSOC.

The below data is taken on Sep 28, 2017.

Merged commits: Total 23

Opened PRs: Total 12

My Closed PRs: Total 21


Things I learned through GSOC

  • Learnt the kind of background work needed to be done before starting to write the main code.
  • Collaborate with highly experienced people remotely.
  • Breaking the code into components at each stage. In other words, learned writing modular code effectively.
  • Effectively reading technical and research papers.
  • Techniques to be followed while refactoring existing code.
  • Writing scalable code. Experience with openwebnet-discovery protocol script taught this point very well.
  • Improved my exploitation script writing skills.
  • Quickly switch the tasks over and over with ease.
  • Gained knowledge on the SMB protocol and unmarshalling hexdump.
  • Analyzing transfer of packets over wireshark to find the errors.
  • Networking.
  • Don’t dive into new areas directly without any prior knowledge.

Finally, a quote from The Pragmatic Programmer book that inspired me -

“Writing code that writes code” differentiates the best from rest and I did this throughout my internship period ;)