Introduction: UniProt is a comprehensive, freely accessible resource for protein sequence and functional information. It is maintained by the UniProt Consortium, a collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource (PIR). Its mission is to integrate, annotate, and provide detailed protein sequence and functional data.
Main Sections of UniProt:
UniProtKB: The core component of UniProt, divided into two parts:
– Reviewed (Swiss-Prot): Expertly curated entries featuring high-quality protein data, including detailed function descriptions, domain structures, variant information, and literature references.
– Unreviewed (TrEMBL): Automatically annotated protein sequences, derived largely from translations of coding sequences in the international nucleotide databases (GenBank/DDBJ/EMBL), that have not yet been manually reviewed.
Proteomes: This section presents the complete set of proteins for organisms with whole-genome sequencing data. It provides a comprehensive proteomic map by cataloging and annotating protein products from all coding genes.
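UniRef: A clustering database that groups similar protein sequences into representative sets, which reduces redundancy and speeds up similarity searches. It is subdivided into UniRef100, UniRef90, and UniRef50, corresponding to sequence identity thresholds of 100%, 90%, and 50% respectively. UniRef100 merges identical sequences and sequence fragments, while UniRef90 and UniRef50 are built by clustering at progressively lower identity.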
UniParc: A comprehensive repository consolidating all protein sequences from various sources—such as UniProtKB, PIR, PRF, and NCBI RefSeq—to ensure each unique sequence is stored only once, thereby eliminating redundancy.
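The "store each unique sequence only once" idea behind UniParc can be illustrated with a small data structure. The following is a minimal sketch, not UniParc's actual implementation: the class name, the use of SHA-256 as the sequence key, and the example identifiers are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class NonRedundantArchive:
    """Toy archive that keeps one record per unique sequence (hypothetical class)."""

    def __init__(self):
        # checksum -> sequence, stored exactly once
        self._sequences = {}
        # checksum -> set of (source database, source identifier) cross-references
        self._xrefs = defaultdict(set)

    @staticmethod
    def checksum(sequence):
        # SHA-256 is an assumption for this sketch, chosen only as a stable key
        return hashlib.sha256(sequence.upper().encode("ascii")).hexdigest()

    def add(self, sequence, database, identifier):
        """Register a sequence; duplicates only add a new cross-reference."""
        key = self.checksum(sequence)
        self._sequences.setdefault(key, sequence.upper())
        self._xrefs[key].add((database, identifier))
        return key

    def sources(self, sequence):
        """Return the sorted cross-references recorded for a sequence."""
        return sorted(self._xrefs.get(self.checksum(sequence), set()))

    def __len__(self):
        return len(self._sequences)


# Made-up sequences and identifiers, for illustration only
archive = NonRedundantArchive()
archive.add("MKTAYIAKQR", "UniProtKB", "ID_A")
archive.add("mktayiakqr", "RefSeq", "ID_B")   # same sequence, different source
archive.add("MSTNPKPQRK", "PRF", "ID_C")
```

Here the archive ends up holding two sequences even though three records were submitted, and the first sequence carries cross-references to both of its source databases.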
Practical Example: Using UniProt with Human IL-6
UniProt supports two search modes:
Keyword Search: Enter a protein name, ID, organism, or functional description in the search bar.
Advanced Search: Click the “Advanced” link to construct complex queries using logical operators (AND, OR, NOT) and specific fields (e.g., gene, protein name, organism) for precise results.
To find human IL-6 with a keyword search:
1. Visit the homepage at www.uniprot.org, enter “IL-6” in the search bar, and click Search or press Enter. Then select “Human” in the filter sidebar on the left to restrict the results by organism.
2. Locate the desired protein entry and click it to view detailed information, including its unique accession (Entry), Entry Name, Protein Names, Gene Names, Organism, and protein length.
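The same search can also be scripted. The sketch below assumes UniProt's public REST endpoint at rest.uniprot.org and its fielded query syntax (gene, organism_id, reviewed). The helper function names are hypothetical, and 9606 is the NCBI taxonomy ID for human. It builds an AND query like the one the Advanced form produces and parses the tab-separated response into dictionaries.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"

# Return fields mirroring the columns mentioned in the steps above
FIELDS = ["accession", "id", "protein_name", "gene_names", "organism_name", "length"]


def build_query(gene, organism_id, reviewed_only=True):
    """Combine fielded terms with AND, as in the Advanced search form."""
    terms = [f"gene:{gene}", f"organism_id:{organism_id}"]
    if reviewed_only:
        terms.append("reviewed:true")  # restrict to Swiss-Prot entries
    return " AND ".join(f"({term})" for term in terms)


def build_url(query, fields, size=5):
    """Build the full search URL requesting TSV output."""
    params = {"query": query, "fields": ",".join(fields), "format": "tsv", "size": size}
    return f"{SEARCH_URL}?{urlencode(params)}"


def parse_tsv(text):
    """Turn a TSV response (header row plus data rows) into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


def search(query, fields, size=5, timeout=30):
    """Run the search over the network and return the parsed rows."""
    with urlopen(build_url(query, fields, size), timeout=timeout) as response:
        return parse_tsv(response.read().decode("utf-8"))
```

Calling `search(build_query("IL6", 9606), FIELDS)` requires network access and should return the reviewed human IL-6 entry among its rows. The gene symbol is written as IL6 here because the gene field expects the gene name rather than the hyphenated protein abbreviation.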
Key Information Provided by UniProt:
Essential Tools Offered by UniProt:
BLAST: Optimized for comparing query sequences against curated protein data, helping to identify similar sequences, predict functions, analyze evolutionary relationships, and support structural modeling.
Align: Align multiple sequences to identify conserved regions that suggest shared functions, structures, or evolutionary relationships.
Search with a List / Map IDs: Submit a list of identifiers to quickly retrieve the corresponding UniProt entries, or map identifiers between UniProt and external databases. Each entry provides direct links to external resources such as GenBank, PubMed, KEGG, and GO.
Search Peptides: Enter short peptide sequences (minimum of three residues) to find all matching UniProtKB sequences. This tool is especially useful for precise peptide mapping and validation.
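To make the peptide-matching idea concrete, here is a minimal local sketch. It assumes exact substring matching against sequences you already hold in memory. The function name and the example sequences are hypothetical, and the code does not call UniProt's own service.

```python
def find_peptide(peptide, proteins):
    """Return {accession: [1-based start positions]} for exact peptide matches.

    `proteins` maps an identifier to its amino-acid sequence.
    """
    peptide = peptide.strip().upper()
    if len(peptide) < 3:
        raise ValueError("peptide must have at least three residues")

    hits = {}
    for accession, sequence in proteins.items():
        sequence = sequence.upper()
        positions = []
        start = sequence.find(peptide)
        while start != -1:
            positions.append(start + 1)                # report 1-based positions
            start = sequence.find(peptide, start + 1)  # step by one to catch overlaps
        if positions:
            hits[accession] = positions
    return hits


# Made-up sequences, for illustration only
example_proteins = {
    "X1": "MKTAYIAKQRAYIA",
    "X2": "MSTNPKPQRK",
}
matches = find_peptide("AYIA", example_proteins)
```

With these inputs, `matches` contains only `X1`, with the peptide found at positions 4 and 11. Reporting positions as well as identifiers is what makes this kind of search useful for validating where a peptide maps within each protein.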
Conclusion: By integrating detailed annotations on protein function, taxonomy, subcellular localization, interactions, structure, and more, UniProt has established itself as an indispensable bioinformatics resource. Whether you are investigating protein evolution, structure, or post-translational modifications, UniProt offers user-friendly tools and comprehensive data to support your research.