We’re delighted to finally share with you a project we’ve been working on in secret for quite some time: hCaptcha.com.
What is hCaptcha.com?
hCaptcha is a way for every website in the world to create a new revenue stream from the work their visitors already do. Simply put, when website users prove their humanity via a captcha (to keep out bots and spam), website owners get paid.
Machine learning companies ask simple questions they’d like humans to answer. The answers become labeled data that makes their models better. Websites show the question to their visitors and check the answers to see whether they’re human.
Everyone benefits: websites get a new revenue stream, visitors get to use websites with less spam and bot traffic, and ML companies get faster and cheaper labels.
Over 100 work-years of effort are spent on captchas every single day.
A captcha typically takes 3–5 seconds to solve. More than a billion captchas per day are answered around the world, and much of that effort is entirely unproductive.
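The figures above can be sanity-checked with a quick back-of-the-envelope calculation. The sketch below takes the captcha count and solve time from the text. The 2,000-hour work-year and the function name `daily_effort` are our own assumptions for illustration, not measured values.

```python
# Back-of-the-envelope estimate of the human time spent on captchas each day.
CAPTCHAS_PER_DAY = 1_000_000_000    # "more than a billion" in the text, so a lower bound
SECONDS_PER_CAPTCHA = (3, 5)        # typical solve time range from the text
HOURS_PER_CALENDAR_YEAR = 365 * 24  # assumption: a year of round-the-clock time
HOURS_PER_WORK_YEAR = 2_000         # assumption: roughly 50 weeks x 40 hours


def daily_effort(captchas_per_day: int, seconds_each: float) -> dict:
    """Convert a daily captcha count into hours and years of human effort."""
    hours = captchas_per_day * seconds_each / 3600
    return {
        "hours": hours,
        "calendar_years": hours / HOURS_PER_CALENDAR_YEAR,
        "work_years": hours / HOURS_PER_WORK_YEAR,
    }


for seconds in SECONDS_PER_CAPTCHA:
    effort = daily_effort(CAPTCHAS_PER_DAY, seconds)
    print(
        f"{seconds}s per captcha: {effort['hours']:,.0f} hours/day, "
        f"{effort['calendar_years']:,.0f} calendar years, "
        f"{effort['work_years']:,.0f} work-years"
    )
```

Under these assumptions, a billion captchas works out to roughly 0.8 to 1.4 million hours per day, which is several hundred 2,000-hour work-years and comfortably above the 100 quoted above.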
The options today are to either waste those millions of human hours or give them away for free to Google’s machine learning efforts via reCAPTCHA. Did you know those “click on the car” images are all frames of video from Waymo self-driving cars?
hCaptcha democratizes this ability to ask questions by turning it into a marketplace that any machine learning company or website can join.
Where did this come from?
Why did we build hCaptcha? Simple: we needed it for our own use. At IM, we do applied machine learning at massive scale: thousands of nodes processing data for billions of videos.
Our pipelines are designed to mix machine and human intelligence, with the ratio adjusted over time as models improve. This means that in the early stages of deploying a new pipeline, we prefer to consume millions of human labels if they are available at reasonable cost and volume.
hCaptcha is a solution to a problem we and many others have, and we’re delighted to share it with everyone who wants to make the world better by improving machine intelligence.
Who built hCaptcha?
You can read more on our About page, but in short:
We’re all veteran scientists and engineers. We’ve built and sold companies, worked at companies like Dropbox, Facebook, VMware, and Cloudera, and studied at institutions like Stanford and MIT. We’ve spent decades working in machine learning, distributed systems, information security, and scaling some of the largest sites in the world.
We’re a privately held company committed to fundamental research, open source, and continued investment in the larger mission. Our team has commit privileges on dozens of open source projects, and we’re proud to have contributed significantly to codebases like Hadoop and HBase among many others.
Powered by the HUMAN Protocol
hCaptcha is pretty cool, but it’s only the first piece in the puzzle of a grand vision: enabling human beings to aid in improving machine intelligence far faster and more efficiently than ever before.
Behind hCaptcha lies the HUMAN Protocol, an open decentralized protocol for human labor that runs on the Ethereum blockchain. This has many advantages: “open books” that let us prove we’re distributing bounties fairly, efficient micro-payments via Human Tokens (an EIP-20-compatible token with a custom Bulk API), a novel mechanism for scaling a two-sided market in a capital-efficient way, and more.
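To give a feel for why a bulk operation matters for micro-payments, here is a toy, in-memory sketch. It is not the HUMAN Protocol's contract code or its Bulk API, neither of which is described in this post. `ToyTokenLedger`, `transfer`, and `bulk_transfer` are hypothetical names, and nothing here touches Ethereum. The point is only that paying many recipients in one all-or-nothing transaction avoids submitting one transaction per tiny payment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyTokenLedger:
    """In-memory stand-in for a token balance table (illustrative only)."""

    balances: Dict[str, int] = field(default_factory=dict)
    transactions: int = 0  # number of separate "transactions" submitted

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        """Single payment: costs one transaction per recipient."""
        self.bulk_transfer(sender, [(recipient, amount)])

    def bulk_transfer(self, sender: str, payouts: List[Tuple[str, int]]) -> None:
        """Pay many recipients in one transaction; all succeed or none do."""
        if any(amount <= 0 for _, amount in payouts):
            raise ValueError("amounts must be positive")
        total = sum(amount for _, amount in payouts)
        # Validate before mutating anything so a failure leaves balances intact.
        if self.balances.get(sender, 0) < total:
            raise ValueError("insufficient balance")
        self.balances[sender] -= total
        for recipient, amount in payouts:
            self.balances[recipient] = self.balances.get(recipient, 0) + amount
        self.transactions += 1


# Hypothetical usage: one escrow account pays three websites at once.
ledger = ToyTokenLedger(balances={"escrow": 1_000})
ledger.bulk_transfer("escrow", [("site-a", 300), ("site-b", 200), ("site-c", 100)])
print(ledger.balances, ledger.transactions)
```

In this toy model the three payouts count as a single transaction, whereas three calls to `transfer` would count as three. On a real blockchain the saving would show up as fewer transactions to submit and pay fees for, though the actual costs depend on the contract implementation.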
Our goal is to publish a complete open source reference implementation of all components in the HUMAN Protocol this year.
We’ll be publishing in-depth technical articles over the coming weeks and months as we open-source more components of the system for the developer community to use, and we’re excited to help you build!
We’ve been working quietly for many months on building the software, validating the ideas, and testing the infrastructure. The system is live in beta right now, and we have great customers, partners, and websites already signed up.
The hCaptcha service has already been tested at several billion requests per day, and we’re confident we match or exceed reCAPTCHA’s anti-bot protection capabilities. While we’re keeping the beta label for now, hCaptcha is already in use on real websites and labeling real customer data today.
If you’ve got a website, it’s time to stop giving your visitors away for free.
If you’re a company with labeling needs, just send us an email: we’d be delighted to work together to help you shorten turnaround, increase quality, and reduce costs in dataset creation.
Let’s Get Connected
This is an exciting day for all of us, and we can’t wait for you to try out the service, start building on top of the protocol and platform, and join us on this mission!
Let us know what you think on Telegram, Twitter, Facebook, or email. We’ll be finding ways to reward early members of our community for their enthusiasm and faith in us, and we’d love to keep in touch with you.