Friday, February 29, 2008

Fingerprints for recommendations

A number of audio fingerprinting technologies exist today and are used for things like radio monitoring, copyright filtering, and content identification (e.g. Shazam). To oversimplify, these technologies work by generating a fingerprint from the acoustic information in a track, and then checking whether any fingerprint of known content is similar enough, within a confidence threshold, for the track to be reliably identified. This is probably already being done (I haven't read up on the literature), but it seems to me that you might end up with a half-decent recommendation engine if you ignore the high-confidence hits and instead look at the medium-to-low-confidence hits (aka near misses). Because recommendations can be generated ahead of time, speed is not a problem, so you could afford to be picky and pair only songs that had multiple "near misses" at different points in time.
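
To make the idea a little more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: it supposes some fingerprint matcher has already produced tuples of (track, other track, time offset in seconds, confidence between 0 and 1), and the function name `recommend_pairs`, the thresholds `low` and `high`, and the `min_hits` and `min_gap` parameters are all made up. It is not how any real fingerprinting product works, just one way the "multiple near misses at different points in time" filter could look.

```python
from collections import defaultdict


def recommend_pairs(matches, low=0.4, high=0.8, min_hits=2, min_gap=10.0):
    """Pair tracks that have several near-miss matches spread over time.

    matches: iterable of (track_a, track_b, offset_seconds, confidence).
    All thresholds are hypothetical and would need tuning on real data.
    """
    near_misses = defaultdict(list)
    for track_a, track_b, offset, confidence in matches:
        if track_a == track_b:
            continue  # a track matching itself tells us nothing
        # Keep only medium-to-low confidence hits; high-confidence hits
        # are probably the same recording, not a similar one.
        if low <= confidence < high:
            pair = tuple(sorted((track_a, track_b)))
            near_misses[pair].append(offset)

    results = []
    for pair, offsets in near_misses.items():
        offsets.sort()
        # Count only hits that are at least min_gap seconds apart, so a
        # burst of hits around one moment counts once.
        distinct = []
        for offset in offsets:
            if not distinct or offset - distinct[-1] >= min_gap:
                distinct.append(offset)
        if len(distinct) >= min_hits:
            results.append((pair, len(distinct)))

    # Most near misses first, then alphabetical for stable output.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results
```

Since this would run as an offline batch job, the whole catalog could be compared pairwise without worrying about lookup latency, and the thresholds could be tightened until the pairings start to look sensible.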
