Ask HN: Image recognition books and resources
13 points by matt1 on Oct 3, 2009 | 11 comments
I'm interested in learning some basic image recognition and analysis techniques. The ideal resource would be one that starts from scratch and includes lots of examples and code.

There are several expensive books on Amazon and the web turns up plenty of technical papers, but I'm not sure where to begin.

Any recommendations?



Here's my list:

For a general introduction:

* "Digital Image Processing" by Gonzalez and Woods

* "Image Processing, Analysis, and Machine Vision" by Sonka, Hlavac, and Boyle

After you learn the basics, some more advanced and specialized books I'd recommend:

* "Morphological Image Analysis" by Soille.

* "Insight into Images" by Yoo (explains many of the algorithms in the ITK library).

I hesitate to recommend this book because it is so dated (1992), but I refer to "Computer and Robot Vision" by Haralick and Shapiro quite often.


Image recognition in what domain? Do you want to be able to look at a photo and determine what objects are represented, or do you want to be able to do OCR or do you want to filter biometric data or something else entirely?


Good question --

I'm interested in detecting objects in images. For example, whether or not a car is present in a photo.


For that I suggest "Feature Extraction and Image Processing" 2.ed by Mark Nixon and Alberto S Aguado. But "Digital Image Processing" (DIP) by Gonzalez and Woods as suggested by monk_the_dog is pretty much the standard prereq. Actually DIP may get you what you need depending on how deep you need to go.


Starting with some basic knowledge of machine learning (clustering, NN, Bayesian inference, etc.) and some basic computer vision / processing (edge detection, color, basic shapes), how much theory is needed to achieve that objective? (Recognizing vehicles in photos, and more interesting objectives such as extracting 3D structure from a single 2D image.)


"How much theory for recognizing objects in images?": Some pattern recognition, lots of image processing.

For the most part, it doesn't matter what classifier you use: k nearest neighbor, support vector machine, random forest, neural nets. They'll all give about the same performance. You should have a general idea of what they do, but I don't think it's worth the effort to become a "neural net expert". You should know enough pattern recognition so you don't fool yourself (by over-training, for example), and have an idea of how to choose the right features.
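To make the "don't fool yourself" point concrete, here is a minimal sketch of how over-training shows up. It is an illustration only, not anything from the books above: the two-blob toy data, the 300/100 split, and the helper names are all made up, and 1-nearest-neighbor is used simply because it memorizes its training set. The thing to look at is the gap between accuracy on the data the classifier was built from and accuracy on data it has never seen.

```python
import random

def nearest_neighbor_predict(train, point):
    """Return the label of the training sample closest to `point` (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda sample: sq_dist(sample[0], point))
    return label

def accuracy(train, samples):
    """Fraction of `samples` whose label the 1-NN classifier gets right."""
    hits = sum(nearest_neighbor_predict(train, f) == y for f, y in samples)
    return hits / len(samples)

def make_samples(n, rng):
    """Toy two-class data: two overlapping 2-D Gaussian blobs (hypothetical)."""
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 1.5
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        samples.append((point, label))
    return samples

rng = random.Random(0)
data = make_samples(400, rng)
train, held_out = data[:300], data[300:]

train_acc = accuracy(train, train)        # scored on memorized data
held_out_acc = accuracy(train, held_out)  # scored on unseen data
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

The training number looks perfect because every training point is its own nearest neighbor. Only the held-out number says anything about how the classifier would do on new images, which is why you always keep some labeled data aside.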

Where should you put your effort? Into finding useful features for the object you want to classify. And the more image processing you know the more useful features you'll be able to try. How much do you need to know? Depends on the problem. If you're finding cars in the desert then not so much. Your feature set might be "has long straight lines and is not sand colored". If you're trying to tell American made cars from Japanese then it's harder (unless they are moving, in which case it can't be American).
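As a rough sketch of what a hand-built feature set like "has long straight lines and is not sand colored" could look like in code: the sand RGB value, the tolerances, and the function names below are all assumptions picked for illustration, images are plain nested lists of RGB tuples, and only horizontal edges are measured (a real system would more likely use something like a Hough transform for lines).

```python
SAND = (194, 178, 128)  # assumed "sand" color, chosen for illustration
CAR = (30, 30, 120)     # assumed dark-blue car color

def is_sand(pixel, tolerance=40):
    """True if every RGB channel is within `tolerance` of the assumed sand color."""
    return all(abs(c - s) <= tolerance for c, s in zip(pixel, SAND))

def non_sand_fraction(image):
    """Fraction of pixels that are not sand colored."""
    pixels = [p for row in image for p in row]
    return sum(not is_sand(p) for p in pixels) / len(pixels)

def longest_horizontal_edge(image, threshold=60):
    """Longest horizontal run of pixels whose brightness differs sharply
    from the pixel directly below (a crude straight-line detector)."""
    def brightness(p):
        return sum(p) / 3
    best = 0
    for upper, lower in zip(image, image[1:]):
        run = 0
        for a, b in zip(upper, lower):
            if abs(brightness(a) - brightness(b)) > threshold:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best

def features(image):
    """Feature vector: (non-sand fraction, longest horizontal edge in pixels)."""
    return (non_sand_fraction(image), longest_horizontal_edge(image))

# An 8x6 patch of empty desert, and the same patch with a 6x2 "car" in it.
desert = [[SAND] * 8 for _ in range(6)]
with_car = [row[:] for row in desert]
for y in range(2, 4):
    for x in range(1, 7):
        with_car[y][x] = CAR

print(features(desert))
print(features(with_car))
```

The two numbers are what you would hand to whichever classifier you picked. In the desert case even a simple threshold on either one would separate the two patches, which is the point: good features make the choice of classifier matter much less.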


Thanks!! Very interesting explanation!

And what is needed for face recognition (face matching, not just detection), in the same terms? Are the same kind of tools enough for this? (from what I read, it seems so, but so far I couldn't completely believe it)


Feature Extraction and Image Processing looks pretty good. Do you think a beginner would be able to follow it without reading Digital Image Processing first?


That's a tough call. It's hard for me to tell whether I could understand the concepts from that book alone, since I had already read and studied DIP. You may want to pick up the 2.ed of DIP if your concern is price. I learned from the 2.ed of DIP, and I'm not sure the 3.ed is worth $60+ more.


I was in your boat several months ago when I had to learn image processing from scratch to implement a requested feature.

"Digital Image Processing Using MATLAB" ended up being the most useful, along with some basic MATLAB tutorials.

Here's what I ended up with: http://yaroslavvb.blogspot.com/2009/08/robust-ocr-in-video.h...


There is a draft of Computer Vision: Algorithms and Applications by Richard Szeliski available for free online http://research.microsoft.com/en-us/um/people/szeliski/Book/. It's good overall, but some sections are still incomplete (missing diagrams, for example).



