As an avid reader of HAD I was intrigued by this post explaining how someone had broken MintEye’s audio based CAPTCHA. The image version of the CAPTCHA looked interesting and so I thought it might be fun to try and break it.
For those unfamiliar with MintEye, the image based CAPTCHAs look as follows:
You must adjust a slider to select the undistorted version of the image. Several somewhat naive approaches (in my opinion) were proposed in the HAD comments to solve this captcha, based on looking for straight lines. However, such solutions are likely to fall down for images containing few straight lines (e.g. the CAPTCHA above).
After a little thought (and unfruitful musings with optical flow) I found a good, robust and remarkably simple solution. Here it is:
import cv2
import sys
import numpy as np
import os
import matplotlib.pyplot as plt

if __name__ == '__main__':
    for dir in range(1,14):
        dir = str(dir)
        total_images = len(os.listdir(dir))+1
        points_sob = []
        for i in range(1,total_images):
            img = cv2.imread(dir+'/'+str(i)+'.jpg')
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            sob = cv2.Sobel(gray, -1, 1, 1)
            points_sob.append(np.sum(sob))
        x = range(1,total_images)
        res = np.argmin(points_sob)+1
        print(res)
        plt.plot(res,points_sob[res-1], marker='o', color='r', ls='')
        plt.plot(x, points_sob)
        plt.savefig(dir+'.png')
        plt.show()
(Note that the majority of the code is image loading and graph plotting. Automatically fetching the images and returning the answer is left as an exercise for the dirty spammers.)
The theory is this: the more you ‘swirl’ up the image, the longer the edges in the image become. You can see this in the example above, but a simpler example is more obvious:
See how the length of the square box's edges clearly increases? To exploit this, we want to sum the length of the edges in the picture. The simplest way of doing this is to take the derivative of the image (in the Python above, by using the Sobel operator) and sum the result (take a look at the Wikipedia article to see how Sobel picks out the edges). We then select the image with the lowest ‘sum of edges’ as the correct answer.
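As a rough illustration of the idea above, here is a small self-contained sketch that needs only NumPy. It is not the script used for the results: `swirl` is a hypothetical stand-in for MintEye's distortion (a rotation that fades linearly to zero at `radius`), and `edge_sum` uses NumPy's central-difference gradient in place of OpenCV's Sobel call. The strengths and image size are arbitrary choices.

```python
import numpy as np

def swirl(img, strength, radius):
    """Hypothetical distortion: rotate pixels about the centre by an
    angle that fades linearly from `strength` (radians) to 0 at `radius`."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    y, x = np.mgrid[0:h, 0:w]
    dy, dx = y - cy, x - cx
    r = np.hypot(dx, dy)
    angle = strength * np.clip(1.0 - r / radius, 0.0, None)
    theta = np.arctan2(dy, dx) + angle
    # Nearest-neighbour lookup of the source pixel for every output pixel.
    src_x = np.clip(np.rint(cx + r * np.cos(theta)), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(cy + r * np.sin(theta)), 0, h - 1).astype(int)
    return img[src_y, src_x]

def edge_sum(img):
    """'Sum of edges': total gradient magnitude over the whole image."""
    gy, gx = np.gradient(img.astype(float))
    return float(np.sum(np.hypot(gx, gy)))

# A white square on a black background, as in the simple example.
square = np.zeros((100, 100))
square[30:70, 30:70] = 1.0

# One candidate per slider position; strength 0.0 is the undistorted image.
strengths = [-6.0, -3.0, 0.0, 3.0, 6.0]
candidates = [swirl(square, s, radius=50.0) for s in strengths]
scores = [edge_sum(c) for c in candidates]

best = int(np.argmin(scores))  # index of the lowest 'sum of edges'
print(strengths[best])
```

The candidate with the lowest score is taken as the answer, which mirrors the `np.argmin` step in the script.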
The results are excellent as you can see below. 100% of the 13 test CAPTCHAs I downloaded were successfully solved. The following graphs show image number on the x axis and ‘sum of edges’ on the y. The red dot is the selected answer:
An interesting feature is that the completely undistorted image is often a peak in the graphs. This means we usually select one image to the right or left of the correct image (which is still happily accepted as the correct answer by MintEye). This seems to be because the undistorted image is somewhat sharper than the distorted images and hence has sharper gradients, resulting in larger derivative values. Obviously it would be trivial to do a local search for this peak, but it isn’t required to break this CAPTCHA.
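For the curious, one possible way to do that local search is sketched below. The function name `refine_to_peak` and the `window` parameter are my own assumptions, not part of the script above, and the example scores are made up to show the shape of the graphs (a valley with a small spike in it). Indices here are 0-based, unlike the 1-based image numbers in the script.

```python
import numpy as np

def refine_to_peak(scores, window=1):
    """Return the index of the highest strict local peak within `window`
    positions of the global minimum, or the minimum itself if none exists."""
    scores = np.asarray(scores, dtype=float)
    lowest = int(np.argmin(scores))
    best, best_score = lowest, None
    # A peak needs a neighbour on both sides, so skip the two end points.
    lo = max(1, lowest - window)
    hi = min(len(scores) - 2, lowest + window)
    for i in range(lo, hi + 1):
        is_peak = scores[i] > scores[i - 1] and scores[i] > scores[i + 1]
        if is_peak and (best_score is None or scores[i] > best_score):
            best, best_score = i, scores[i]
    return best

# Made-up scores: a valley whose true bottom (index 4) spikes upwards.
spiky = [90, 70, 50, 30, 45, 28, 55, 75, 95]
smooth = [5, 3, 1, 2, 4]

print(refine_to_peak(spiky))   # moves from the minimum to the nearby spike
print(refine_to_peak(smooth))  # no spike nearby, so the minimum is kept
```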
In conclusion, it would seem this method of image based CAPTCHA is fundamentally flawed. A simple ‘swirl’ operation will always be detectable by this method, no matter the image being swirled. The increased sharpness also gives the game away – an FFT or autocorrelation could easily be used to detect this change in sharpness, just like autofocus algorithms.
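To make the sharpness idea a little more concrete, here is a minimal sketch of one FFT-based measure. I have not run this against MintEye; the function name `high_freq_energy`, the `cutoff` value and the noise test image are all assumptions for illustration. It measures the fraction of spectral energy above a cutoff frequency, which should drop when an image is softened.

```python
import numpy as np

def high_freq_energy(img, cutoff=0.25):
    """Fraction of (mean-removed) spectral energy above `cutoff`
    cycles per pixel. Higher means sharper."""
    img = np.asarray(img, dtype=float)
    img = img - img.mean()  # drop the DC term so it cannot dominate
    power = np.abs(np.fft.fft2(img)) ** 2
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    radial = np.hypot(fx, fy)
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[radial > cutoff].sum() / total)

def box_blur(img):
    """3x3 box blur with wrap-around borders, used only to make a soft test image."""
    acc = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return acc / 9.0

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))  # stand-in for the undistorted image
soft = box_blur(sharp)        # stand-in for a softened, distorted image

print(high_freq_energy(sharp) > high_freq_energy(soft))
```

With a measure like this you would pick the candidate with the highest score rather than the lowest.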
Please post the Visual Basic codes for this. The language you post in the article is not Visual Basic.
Thank you.
Why do you want it in VB? VB is not the right language for this kind of thing, and the article says “in python”, so I think he’s aware it’s not VB.
yes, RTFT (last T for title!)
You’re Visual Basic.
Best comment ever!
Agree!
+1
As far as I know, this would not be possible to do in VB ‘this’ way, as the author is using OpenCV (import cv) and there is no VB port AFAIK.
I’m not saying it would be impossible in VB, but it would be a bit complicated and resulting code would be quite slow.
/sarcasm
Zhou is either very misguided or a comic genius. Either way I approved his comment ‘for the lolz’!
Don’t know if troll.
Apparently the spammers want their attack code in VB these days. Who knew?
I hope no one rewrites this in VB
What a comedic genius. I don’t think I’ve laughed this much at a blog comment in a long while. Thank you, Zhou.
I’m wondering, can you link us a few of your other comments or some of your work or your website? I hope they’ll be of a similar calibre!
I would love to see VB or C++ code myself. As a beginner programmer, it would be easier for me to understand.
You can learn enough python to understand what he’s doing here in about an hour if you go slow. It’s just a bunch of calls to OpenCV.
You also get the added benefit of knowing python, hint hint.
No, it wouldn’t be easier to understand. In VB it would be thousands and thousands of lines of code. This is why some things are easier to do in one language (with its libraries) than in another.
It wouldn’t be thousands of lines of code…. it would be a DLL…. with one single method that would take 250 parameters and settings, only fail miserably after upgrading, downgrading and re-installing SQL server on a remote host.
Actually, this could be achieved in vb.net in about the same number of lines, using the opencv .net port emgu (google it), or one of the other excellent .net image and machine learning libs (aforge, accord etc).
I am not a vb.net fan, but it is in no way an underpowered language, running on .NET as it does.
Hi,
I am interested in reproducing your results on my local machine, using your published Python script above. I wonder if it is possible to release the rest of the test images (and that code snippet above) to GitHub or some public link?
Great work, thanks.
Hi Lee,
I downloaded the demo images from here: http://www.minteye.com/products.aspx
I don’t have access to the files at the moment, so can’t give them to you, sorry.
There’s 2 spaces between ‘excellent’ and ‘as’: “The results are excellent_ _as you can see below.”
…and 2 spaces after that sentence too.
That must be shocking! O_O
Great article btw =)
This is a quite clever technique. And I have to thank you for making me discover Sobel!
However, as Zhou suggested, this is not Visual Basic.
Very clever! I enjoyed reading this article.
If anyone’s having trouble getting opencv installed, this is the code above, but I’ve altered it to only need scipy/numpy/matplotlib.
(http://codepad.org/Qj2A0Dm0)
import sys
import os
import numpy as np
import scipy, scipy.misc
from scipy import ndimage
import matplotlib.pyplot as plt

if __name__ == '__main__':
    #for dir in range(1,14):
    for dir in range(1,2):
        dir = str(dir)
        total_images = len(os.listdir(dir))+1
        points_sob = []
        for i in range(1,total_images):
            im = scipy.misc.imread(dir+'/'+str(i)+'.jpg')
            im = im.astype('int32')
            dx = ndimage.sobel(im, 0) # horizontal derivative
            dy = ndimage.sobel(im, 1) # vertical derivative
            mag = np.hypot(dx, dy) # magnitude
            mag *= 255.0 / np.max(mag) # normalize (Q&D)
            points_sob.append(np.sum(mag))
        x = range(1,total_images)
        res = np.argmin(points_sob)+1
        print(res)
        plt.plot(res,points_sob[res-1], marker='o', color='r', ls='')
        plt.plot(x, points_sob)
        plt.savefig(dir+'.png')
        plt.show()
Oops. Without the 2 image restriction;
http://codepad.org/8rosDZzT
THIS IS NOT VISUAL BASIC!
Unbelievable that a company that made a captcha service their core business thinks this kind of captcha would be too hard to crack for a computer.
Nice, but please release the code in AMIGA BASIC. No-one uses Pythons to do codes.
Kthxbai