Database Lookup Services & Magical Girls

obnoxious · Apr 26, 2017

Many of you may be familiar with LeakedSource, LeakBase, Snusbase and many others. LeakedSource was for the longest time the gold standard for database lookup services, and I wanted to beat it. Unfortunately, it was no longer up at this point, so I couldn't get any timings on how long its lookups took. So I made LeakBase the gold standard for my project.

https://twitter.com/publicdbhost

These stats are provided by PublicDBHost / cuck @ greysec. Snusbase seemed to offer the fastest lookups. He is doing what I do, with a different DBMS and a much, much larger server. So with my small 16 GB RAM, 8-core server, we're going to recreate LeakBase. I currently have about 500 million records in it and serve it entirely for free, at about 300 ms at most for a no-limit record search. I also don't currently support wildcard searches, due to an issue I didn't care enough to finish. But let's get into the magical girls project.

Below you can see a 188 ms lookup. This is faster than all the results above, but then again I only have 500M records; I can see it being around 200–250 ms with 2–3.2B.

188 ms lookup

DBMS — Database Management Systems

Selection

NoSQL is by nature built to scale, and personally I find MySQL, MariaDB and whatever other forks disgusting, so I went with MongoDB. It also provides extremely versatile indexing. Documentation is available here.

Structure

I decided against putting every DB into its own collection, because of the nightmare of indexing it all, and because it makes find() harder; the only advantage I would gain is being able to remove a database from my results instantly instead of having to find the records first. So all records live in a single collection, data.
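For a concrete picture, here is a minimal sketch of what one of those records might look like. The username, email and ip fields match the indexes created later in the post; the password and source field names are assumptions of mine, not confirmed by the post:

from pymongo import MongoClient

db = MongoClient()["leaks"]  # hypothetical database name

# One record in the single `data` collection. `source` tracks which
# leaked DB the record came from (assumed field name).
db.data.insert_one({
    "username": "magicalgirl",
    "email": "magicalgirl@example.com",
    "ip": "203.0.113.7",
    "password": "hunter2",
    "source": "exampleforum.com",
})

Something like that source field is what lets you drop one leak from the results with a plain filter, instead of dropping a whole collection.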

Web App

I'm not sharing my source for this because it's beyond simple and I don't want to spoon-feed too hard. But I use the PyMongo library, authenticate, and proceed to take the contents of the GET parameters. With that information I do:

data.find({lookup['type']: lookup['data']}).max_time_ms(10000).collation(Collation(locale='en', strength=1))
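To make that concrete, here is a minimal sketch of how that lookup could sit inside a Flask handler. This is not the author's source (which isn't shared); the route, the parameter names (type and query), and the connection details are all assumptions:

from flask import Flask, request
from pymongo import MongoClient
from pymongo.collation import Collation

app = Flask(__name__)

# Hypothetical connection details; the post only says PyMongo is used
# and that the app authenticates first.
data = MongoClient("mongodb://localhost:27017")["leaks"]["data"]

ALLOWED_TYPES = {"username", "email", "ip"}  # the three indexed fields

@app.route("/search")
def search():
    lookup = {"type": request.args.get("type", "email"),
              "data": request.args.get("query", "")}
    # Whitelist the field name, since it comes straight from the query string.
    if lookup["type"] not in ALLOWED_TYPES:
        return "bad lookup type", 400
    # The query from above: capped at 10 s, with the case-insensitive
    # collation so the collation indexes are actually used.
    cursor = (data.find({lookup["type"]: lookup["data"]})
                  .max_time_ms(10000)
                  .collation(Collation(locale="en", strength=1)))
    rows = ["{} {} {}".format(doc.get("username"), doc.get("email"),
                              doc.get("ip")) for doc in cursor]
    return "\n".join(rows), 200, {"Content-Type": "text/plain"}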

Displaying Results

I generate basic tables using Python and display them through Flask / Jinja2.

This can be done better, but I just hacked it together.
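A hacked-together version in that spirit might look like the following; the template and column choice are my own sketch, not the author's code:

from flask import Flask, render_template_string

app = Flask(__name__)

# A bare-bones Jinja2 template: one row per matching record.
# Jinja2 escapes values by default, which matters when you are printing
# strings pulled straight out of random leaked databases.
RESULTS_TEMPLATE = """
<table>
  <tr><th>Username</th><th>Email</th><th>IP</th></tr>
  {% for row in rows %}
  <tr>
    <td>{{ row.get('username', '') }}</td>
    <td>{{ row.get('email', '') }}</td>
    <td>{{ row.get('ip', '') }}</td>
  </tr>
  {% endfor %}
</table>
"""

@app.route("/demo")
def demo():
    # In the real handler, rows would come from the MongoDB cursor.
    rows = [{"username": "magicalgirl",
             "email": "magicalgirl@example.com",
             "ip": "203.0.113.7"}]
    return render_template_string(RESULTS_TEMPLATE, rows=rows)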

Querying and result speed

There is a good chance you're here wondering how I got results to be as fast as I did. It's simple and done with indexes. I researched full-text indexes and normal indexes for a while to figure out what I needed: a full-text index for the username and email, and just a normal index for IPs.

I somehow ended up with invalid UTF-8 in my records, and I didn't want to reimport 500M records, so I just went with collation indexes for the username and email instead.

db.data.createIndex({"username": 1}, {collation: {locale: "en", strength: 1}})
db.data.createIndex({"email": 1}, {collation: {locale: "en", strength: 1}})
db.data.createIndex({"ip": 1})
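If you would rather create the same indexes from Python instead of the mongo shell, a PyMongo equivalent might look like this (with the db handle assumed, as above). The important detail is that a query only uses a collation index when it passes the identical collation, which is why the find() call earlier also specifies locale 'en' and strength 1:

from pymongo import MongoClient
from pymongo.collation import Collation

db = MongoClient()["leaks"]  # hypothetical database name

# strength=1 compares base characters only, so lookups are case- and
# diacritic-insensitive without touching the stored data.
ci = Collation(locale="en", strength=1)
db.data.create_index([("username", 1)], collation=ci)
db.data.create_index([("email", 1)], collation=ci)
db.data.create_index([("ip", 1)])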

Conclusion

TL;DR: use indexes. This is nowhere near a problem for people who even kind of know how a DBMS works. Thanks for reading!
