
Comprehensive Protein Database Resources: Explore UniProt for Research Excellence

Release date: 2025-02-26

Introduction: UniProt is a comprehensive resource for protein sequence and functional information. It is maintained by the UniProt Consortium, a collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource (PIR). Its mission is to integrate, annotate, and provide detailed protein sequence and functional data.

Main Sections of UniProt:

UniProtKB: The core component of UniProt, divided into two parts:

Reviewed (Swiss-Prot): Expertly curated entries featuring high-quality protein data, including detailed function descriptions, domain structures, variant information, and literature references.

Unreviewed (TrEMBL): Automatically annotated protein sequences, derived largely from translated coding sequences in the international nucleotide databases (EMBL/GenBank/DDBJ), that have not yet been manually reviewed.

Proteomes: This section presents the complete set of proteins for organisms with whole-genome sequencing data. It provides a comprehensive proteomic map by cataloging and annotating protein products from all coding genes.

UniRef: A clustering database that groups similar protein sequences into representative sets, which reduces redundancy and speeds up similarity searches. It is subdivided into UniRef100, UniRef90, and UniRef50, corresponding to sequence identity thresholds of 100%, 90%, and 50% respectively.

UniParc: A comprehensive repository consolidating all protein sequences from various sources—such as UniProtKB, PIR, PRF, and NCBI RefSeq—to ensure each unique sequence is stored only once, thereby eliminating redundancy.
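The store-once idea can be illustrated with a small sketch. This is not UniParc's implementation: the class name NonRedundantStore and the choice of an MD5 digest as the lookup key are assumptions made only for illustration.

```python
import hashlib


class NonRedundantStore:
    """Toy archive that keeps each distinct sequence once (hypothetical, not UniParc code)."""

    def __init__(self):
        # digest -> {"sequence": str, "sources": set of source database names}
        self._by_digest = {}

    def add(self, sequence, source):
        """Store the sequence if it is new and record which source supplied it."""
        seq = sequence.strip().upper()
        digest = hashlib.md5(seq.encode("ascii")).hexdigest()
        record = self._by_digest.setdefault(digest, {"sequence": seq, "sources": set()})
        record["sources"].add(source)
        return digest

    def sources(self, digest):
        """Return the sorted list of sources that submitted this sequence."""
        return sorted(self._by_digest[digest]["sources"])

    def __len__(self):
        return len(self._by_digest)
```

Adding the same sequence from two sources leaves one stored record with two source labels, which is the behavior described above.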

Practical Example: Using UniProt with Human IL-6

To search for human IL-6:

Keyword Search: Enter the protein name, ID, organism, or functional description in the search bar.

Advanced Search: Click the “Advanced” link to construct complex queries using logical operators (AND, OR, NOT) and specific fields (e.g., gene, protein name, organism) for precise results.
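The way fields and logical operators combine can be sketched in a few lines of Python. The helper names (field, all_of, any_of, none_of) are hypothetical, and the field names used in the usage note (gene, organism_id, reviewed) are assumptions about the query syntax accepted by the current UniProt website; check them against the Advanced search form before relying on them.

```python
def field(name, value):
    """Return a field:value term, quoting values that contain spaces."""
    text = str(value)
    if " " in text:
        text = f'"{text}"'
    return f"{name}:{text}"


def all_of(*terms):
    """Combine terms so that every one must match (AND)."""
    return "(" + " AND ".join(terms) + ")"


def any_of(*terms):
    """Combine terms so that at least one must match (OR)."""
    return "(" + " OR ".join(terms) + ")"


def none_of(term):
    """Exclude entries matching the term (NOT)."""
    return f"NOT {term}"
```

For example, all_of(field("gene", "IL6"), field("organism_id", 9606), field("reviewed", "true")) yields the string (gene:IL6 AND organism_id:9606 AND reviewed:true), which can be pasted into the search bar.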

1. Visit the homepage at www.uniprot.org, enter “IL-6” in the search bar, and click Search or press Enter. Then select “Human” from the left sidebar.

2. Locate your desired protein entry and click to view detailed information, including its unique ID (Entry), Entry Name, Protein Names, Gene Names, Organism, and protein length.
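The same lookup can also be scripted. The sketch below assumes the public REST endpoint at rest.uniprot.org and its query, fields, format, and size parameters; the function names are hypothetical, and the column names passed in fields should be checked against the UniProt API documentation.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the UniProtKB search service.
SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(query, fields, size=5):
    """Build a search URL that asks for tab-separated output."""
    params = {"query": query, "fields": ",".join(fields), "format": "tsv", "size": size}
    return f"{SEARCH_URL}?{urlencode(params)}"


def parse_tsv(text):
    """Turn a TSV response (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def search(query, fields, size=5):
    """Fetch and parse results. Requires network access."""
    with urlopen(build_search_url(query, fields, size), timeout=30) as response:
        return parse_tsv(response.read().decode("utf-8"))
```

Calling search("gene:IL6 AND organism_id:9606 AND reviewed:true", ["accession", "id", "protein_name", "gene_names", "organism_name", "length"]) should return rows carrying the same columns shown on the results page (Entry, Entry Name, Protein Names, Gene Names, Organism, Length).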

Key Information Provided by UniProt:

  • Function – Detailed annotations of protein functions.
  • Names & Taxonomy – Protein and gene names, synonyms, and source organism information.
  • Subcellular Location – Cellular localization details of the mature protein.
  • Disease & Variants/Phenotypes & Variants – Disease associations in human entries and phenotypic data for non-human entries, including the impact of amino acid variants.
  • Expression – Data on mRNA or protein expression levels across tissues or cells.
  • PTM/Processing – Information on post-translational modifications and protein processing.
  • Interaction – Details on protein interactions and quaternary structures.
  • Structure – Tertiary structure data, including experimental findings or AlphaFold predictions.
  • Family & Domains – Insights into sequence similarity and identified domains.
  • Sequence – Canonical protein sequence with related metrics like length and molecular weight.
  • Similar Proteins – Links to UniRef clusters for related sequences.
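The sequence metrics mentioned under Sequence can be approximated directly from the residues. This is a minimal sketch, assuming standard average residue masses and an unmodified chain of the 20 common amino acids; the function name sequence_metrics is hypothetical, and the value UniProt reports for a given entry may differ slightly.

```python
# Average residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain accounts for the free N- and C-termini


def sequence_metrics(sequence):
    """Return the length and approximate average mass (Da) of a protein sequence."""
    seq = "".join(sequence.split()).upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    mass = (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) if seq else 0.0
    return {"length": len(seq), "mass_da": round(mass, 2)}
```

Whitespace and line breaks are ignored, so a sequence copied from a FASTA record (without its header line) can be passed in as-is.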

 

Essential Tools Offered by UniProt:

BLAST: Compares a query sequence against UniProt sequence data to identify similar sequences, which helps predict function, analyze evolutionary relationships, and support structural modeling.

Align: Align multiple sequences to identify conserved regions that suggest shared functions, structures, or evolutionary relationships.
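To make "conserved regions" concrete, the sketch below scans an alignment that has already been computed (for example, exported from the Align tool) and reports the columns where every sequence carries the same residue. The function name conserved_columns is hypothetical, and this strict identity rule is a simplification of the conservation scoring that alignment viewers typically apply.

```python
def conserved_columns(aligned):
    """Return 0-based column indexes where all sequences share the same non-gap residue."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    for index, column in enumerate(zip(*aligned)):
        # A column is conserved if it holds a single distinct symbol that is not a gap.
        if column[0] != "-" and len(set(column)) == 1:
            conserved.append(index)
    return conserved
```

Runs of consecutive indexes in the result correspond to conserved stretches worth inspecting for shared function or structure.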

ID Mapping (Search with List / Map IDs): Submit a list of identifiers to quickly retrieve the corresponding UniProt entries, or map identifiers between UniProt and external databases. Each entry provides direct links to external resources such as GenBank, PubMed, KEGG, and GO.

Search Peptides: Enter short peptide sequences (minimum of three residues) to find all matching UniProtKB sequences. This tool is especially useful for precise peptide mapping and validation.
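The matching rule is exact substring search, which can be sketched locally for a small set of sequences. This is an illustration of the idea only, not how UniProt indexes its data; the function name find_peptide and the entry identifiers in the usage note are hypothetical.

```python
def find_peptide(peptide, sequences):
    """Map each entry id to the 1-based start positions of exact peptide matches."""
    query = peptide.strip().upper()
    if len(query) < 3:
        raise ValueError("peptide must have at least three residues")
    hits = {}
    for entry_id, sequence in sequences.items():
        positions = []
        start = sequence.find(query)
        while start != -1:
            positions.append(start + 1)  # report positions as 1-based
            start = sequence.find(query, start + 1)  # allow overlapping matches
        if positions:
            hits[entry_id] = positions
    return hits
```

For instance, find_peptide("AYIAK", {"seqA": "MKTAYIAKQRAYIAK", "seqB": "MSSSLLK"}) reports two matches in seqA and none in seqB.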

Conclusion: By integrating detailed annotations on protein function, taxonomy, subcellular localization, interactions, structure, and more, UniProt has established itself as an indispensable bioinformatics resource. Whether you are investigating protein evolution, structure, or post-translational modifications, UniProt offers user-friendly tools and comprehensive data to support your research.
