Why Nerds Rule: Luis Von Ahn and reCAPTCHA
By Twisted Sifter on Tuesday, June 16, 2009, filed under TECH & GADGETS VIDEOS.
I stumbled across this video over the weekend and was floored by the idea presented. It’s brilliant, simple, and effective. A true testament to the notion that small contributions can make a BIG difference.
*UPDATE: September 16, 2009 @ 9:20am Google announces they have acquired reCAPTCHA. Official Google Blog post here
For those who can’t spare 12 minutes for the brilliant and entertaining talk above, here’s the gist:
THE PROBLEM: Spammers
Free email services from providers like Google, Yahoo!, and Microsoft were suffering attacks from hackers/spammers who had written programs to obtain millions of email addresses every day. Why did they need so many email addresses? Because these free services only allowed each account to send a limited number of emails per day (e.g., Yahoo! only allowed 100), so in order to ‘spam’ effectively they needed numerous addresses.
THE SOLUTION: CAPTCHA
Develop a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. For example, humans can read distorted text such as that shown below, but current computer programs can’t.

This is an example of a typical CAPTCHA
In 2000, Luis von Ahn and Manuel Blum coined the term ‘CAPTCHA’. They invented multiple examples of CAPTCHAs, including the first CAPTCHAs to be widely used, which were those adopted by Yahoo!.
THE REVELATION
- Approximately 200 million CAPTCHAs are typed every day around the world
- Each CAPTCHA takes nearly 10 seconds to solve, and thus:
- 500,000 hours of human time are wasted every day typing CAPTCHAs
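The arithmetic behind that figure is easy to check. The short sketch below is only an illustration: the 200 million count comes from the list above, while the function name and the 9-second value (one reading of "nearly 10 seconds") are assumptions made for the example.

```python
# Figure quoted in the post.
CAPTCHAS_PER_DAY = 200_000_000


def wasted_hours(captchas_per_day: int, seconds_each: float) -> float:
    """Convert a daily CAPTCHA count and a per-CAPTCHA time into hours per day."""
    return captchas_per_day * seconds_each / 3600  # 3600 seconds in an hour


# A full 10 seconds each gives a bit more than the quoted figure.
print(round(wasted_hours(CAPTCHAS_PER_DAY, 10)))  # 555556

# Assumed value: 9 seconds each lands on the quoted 500,000 hours.
print(round(wasted_hours(CAPTCHAS_PER_DAY, 9)))   # 500000
```

Either way, the total is on the order of half a million hours of human effort every single day.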
THE CHALLENGE
Is there any way this human effort can be used for the greater good of humanity?
THE SOLUTION REVISITED: reCAPTCHA
- Digitizing books one word at a time. reCAPTCHA is a free CAPTCHA service that helps to digitize books, newspapers and old time radio shows

How it works
In an effort to make information more accessible, book pages are being photographically scanned, and then transformed into text using “Optical Character Recognition” (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. Each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.
But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle?
Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct. BRILLIANT!
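The two-word check described above can be sketched in a few lines of Python. This is a simplified illustration, not reCAPTCHA's actual code: the names `UnknownWord`, `grade` and `resolved`, the case-insensitive comparison, and the three-vote agreement threshold are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnknownWord:
    """A scanned word that OCR could not read (hypothetical structure)."""
    image_id: str
    votes: Counter = field(default_factory=Counter)


def normalize(text: str) -> str:
    # Assumption: answers are compared ignoring case and surrounding spaces.
    return text.strip().lower()


def grade(control_answer: str, control_guess: str,
          unknown: UnknownWord, unknown_guess: str) -> bool:
    """Pass the user if the known (control) word is right.

    The guess for the unknown word is recorded as a vote only when the
    control word was solved, since that is the evidence of a human reader.
    """
    passed = normalize(control_guess) == normalize(control_answer)
    if passed:
        unknown.votes[normalize(unknown_guess)] += 1
    return passed


def resolved(unknown: UnknownWord, min_votes: int = 3) -> Optional[str]:
    """Return the transcription once enough users agree, else None."""
    if not unknown.votes:
        return None
    answer, count = unknown.votes.most_common(1)[0]
    return answer if count >= min_votes else None


word = UnknownWord("scan-0001")
grade("morning", "Morning", word, "upon")    # passes, vote counted
grade("morning", "mourning", word, "open")   # fails, vote ignored
print(resolved(word))                        # None: only one vote so far
grade("morning", "morning", word, "upon")
grade("morning", "morning ", word, "upon")
print(resolved(word))                        # upon
```

The key design point is that the user never learns which word is the control, so a guess on the unknown word carries the same care as the graded one.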
FYI: With the assistance of reCAPTCHA, the entire New York Times archive from the 1850s to the 1980s will have been completely transcribed in less than 12 months.

AP Photo/Gene J. Puskar
Luis Von Ahn
After graduating from Carnegie Mellon with a Ph.D. in Computer Science in 2005, Von Ahn is now a professor at his alma mater. When he’s not lecturing about the Science of the Web, he’s working on Human Computation, which harnesses the combined computational power of humans and computers to solve large-scale problems. Some call this “crowdsourcing.”
His 8-page C.V. is quite impressive, and his list of accomplishments will only grow as he continues his research. Some of his ‘selected honours’ include:
- MacArthur Fellow, 2006-2011.
- Discover Magazine: 50 Best Brains in Science, 2008.
- Silicon.com: 50 Most Influential People in Technology, 2007.
- Microsoft New Faculty Fellow, 2007.
- Sloan Fellow, 2009.
- Smithsonian Magazine: America’s Top Young Innovators in the Arts and Sciences, 2007.
- Technology Review’s TR35: Young Innovators Under 35, 2007.
- IEEE Intelligent Systems “Ten to Watch for the Future of AI,” 2008.
- Popular Science Magazine Brilliant 10 Scientists of 2006.
You can find his personal blog here, and his university page here

Sources:
- http://scienceoftheweb.org/
- http://www.cs.cmu.edu/~biglou/CV.pdf
- http://recaptcha.net/
- http://www.captcha.net/
- http://recaptcha.net/reCAPTCHA_Science.pdf
- http://en.wikipedia.org/wiki/Captcha
- http://www.cra.org/ccc/index.php
THE POWER OF PEOPLE




first
June 23rd, 2009 at 1:02 am
First!………but really, this is an awesome idea!
Twisted_Sifter
June 23rd, 2009 at 1:55 am
More awesome than this comic strip? Barely…
<img src="http://myapokalips.com/public/cartoons/021_Robot_Tattoo.png">
jeans
June 28th, 2009 at 9:16 pm
yeah, it's great technology, but is it really best to use it to chronicle such a biased, fact-bending, mind-control rag like the NY Times?
Twisted_Sifter
June 28th, 2009 at 9:36 pm
Lol.
Besides the New York Times, reCAPTCHA seems to also help digitize the Internet Archive so it's not strictly for mind-control purposes
pants
July 20th, 2009 at 6:50 pm
mind control? WTF are you talking about? please provide some of your conspiracy theory ideas here so we can judge….
Mackeran
August 20th, 2009 at 3:05 pm
Thank you! You often write very interesting articles. You improved my mood.
Mackeran
August 21st, 2009 at 3:48 pm
Interesting and informative. But will you write about this one more?
Cornelius
August 23rd, 2009 at 5:07 am
Are you a professional journalist? You write very well.
Twisted_Sifter
September 16th, 2009 at 9:47 pm
UPDATE: On September 16, 2009 at 9:20am, Google announced that they had just acquired reCAPTCHA. It will be interesting to see what they plan on doing with this acquisition, but surely it will help with their current initiative to digitize books.
Official statement can be found here: http://googleblog.blogspot.com/2009/09/teaching-c...
HeatherO
September 20th, 2009 at 6:33 pm
wow! very cool! I thought the only "other thing" captcha's were being used for were new and creative names for pornbot's on twitter! LOL
Twisted_Sifter
September 20th, 2009 at 7:30 pm
haha. What's a funny one you have heard?
Gil_Uli
September 21st, 2009 at 3:55 am
This guy make me proud of be called guatemalan (or chapin)…
radialmonster
October 21st, 2009 at 10:46 pm
Sorry Pants, but reCAPTCHA is not contributing to the Internet Archive at all.
hyle
December 18th, 2009 at 10:10 pm
simple and brilliant at the same time.
the karin
January 6th, 2010 at 10:12 pm
Wow, that is… amazing. Humanity is not yet lost, after all. Now, onto foreign languages. The world probably does not revolve around the New York Times and the Western cannon.
(Just had to add the extra task.)
loopylove
January 26th, 2010 at 1:37 pm
That was a really interesting post, I enjoyed reading it. You are dead right!
rich-man
February 9th, 2010 at 1:00 pm
Fab article, some very interesting posts
divorced-man
February 12th, 2010 at 12:45 pm
Great post, You are so right!