A Critical Guide to InterPro

This Critical Guide in the Introduction to Bioinformatics series introduces the InterPro database, the world’s largest and most comprehensive integrated protein family database. The rationale for creating the resource, the nature of its contributing databases and the kinds of information they provide are discussed, and the role of InterPro in protein classification and function-annotation projects is outlined.

Specifically, this Guide introduces the principal components of the InterPro database, the differences between them, and how their integration creates a resource whose diagnostic power is greater than the sum of its parts. On reading this Guide, users will be able to: i) explain how protein family databases are used to help annotate uncharacterised protein sequences; ii) identify InterPro’s constituent data resources and explain the main methods that underpin them; iii) search InterPro using keywords and full sequences; iv) analyse and interpret search results in terms of protein family hierarchies, their structural domains and functional features; and v) track the provenance of InterPro’s annotations.