Fighting Image SPAM: FuzzyOCR Resources
30 Jan
I posted recently regarding the battle against image SPAM. One of the comments pointed out that defeating image SPAM is actually fairly straightforward if OCR software is used to scan image attachments. I agree, but the real problem is that very few people have access to an Optical Character Recognition (OCR) based scanning solution. Either you pay the BIG bucks for a commercial SPAM filtering solution, or you get an uber-geek to integrate FuzzyOCR into your existing SpamAssassin-based solution. I would guess that 95%+ of email inboxes are not currently protected by OCR, so it appears the image spammers aren’t yet too concerned about fooling OCR-based anti-SPAM solutions.
I admit that I haven’t gotten around to implementing FuzzyOCR yet. I guess I just assumed (wrongly?) that the spammers would test their messages against FuzzyOCR before sending, and that this would make them difficult to catch. Apparently that is not necessarily the case, so I started searching around the web for resources on implementing FuzzyOCR. My preferred platform is FreeBSD, but I also run several Ubuntu servers, so that is an option as well. Here’s what I came up with:
- FuzzyOCR Overview
- FuzzyOCR Installation Instructions
- FuzzyOCR Operating System Specific Notes
- A great FreeBSD Tutorial
Tags: anti-SPAM, fuzzyOCR, ocr-software, spam, spamassassin