HSA Human Sheddome Atlas

About the Human Sheddome Atlas

The Human Sheddome Atlas (HSA) is a predicted atlas of shed ectodomains — protein fragments released from the cell surface when extracellular regions of membrane proteins are cleaved. Each ectodomain is scored for structural coherence (does it contain full, intact InterPro domains?) and experimental support (do known proteases cleave near the bounding residues?). Cleavage evidence is harmonised from multiple complementary sources and kept alongside the predictions so every call can be inspected in context.

What you'll find here

  • 26,838 predicted shed ectodomains (≥ 30 aa, signal / transit residues excluded), of which 11,371 fully contain at least one structural domain
  • 5,211 human membrane proteins with per-residue cleavage probability profiles at three stringency levels (stringent / moderate / permissive)
  • Topology per residue (signal peptide, propeptide, transit peptide, extracellular, transmembrane, intracellular) and AlphaFold-predicted 3D structure
  • Domain annotations from seven sources: UniProt features, Pfam, SMART, CDD, Gene3D, SUPERFAMILY and TED (CATH)
  • Experimental cleavage sites harmonised across 110 canonical proteases

How to read an entry

  • The topology backdrop shows the per-residue topology colour beneath the cleavage probability curve.
  • Predicted peak = a local maximum on the cleavage probability curve; peak centres that lie closer than 5 aa to one another are clustered into a single peak.
  • IP0.9 window = 90 % inflection-point boundary of the peak — a tight functional interval around the predicted cut.
  • Cut-seq = the P4–P4' octamer around the peak centre, i.e. four residues on each side of the predicted scissile bond, with positions indexed from P1.
  • Experimental diamonds = reported cleavage positions from the integrated databases.
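
The peak clustering and Cut-seq definitions above can be sketched in a few lines. This is an illustration only, not the atlas pipeline: the function names, the rule of keeping the highest-scoring centre within a cluster, and the reading of "P1-indexed" as "the peak centre is the 1-based position of the P1 residue" are all assumptions.

```python
def merge_close_peaks(centres, scores, min_gap=5):
    """Cluster peak centres closer than `min_gap` residues.

    Assumption: within a cluster, the centre with the highest score is kept.
    Returns a list of (centre, score) tuples sorted by position.
    """
    merged = []
    for pos, score in sorted(zip(centres, scores)):
        if merged and pos - merged[-1][0] < min_gap:
            # Same cluster: keep whichever centre scores higher.
            if score > merged[-1][1]:
                merged[-1] = (pos, score)
        else:
            merged.append((pos, score))
    return merged


def cut_seq(sequence, p1, flank=4):
    """Return the P4..P4' octamer around a cut.

    Assumption: `p1` is the 1-based position of the P1 residue, so the
    scissile bond lies between residues p1 and p1 + 1.
    Returns None when the window would run off either end of the sequence.
    """
    start = p1 - flank   # 0-based index of P4
    end = p1 + flank     # exclusive 0-based end, one past P4'
    if start < 0 or end > len(sequence):
        return None
    return sequence[start:end]


# Hypothetical toy inputs, not atlas data.
peaks = merge_close_peaks([10, 12, 30], [0.2, 0.5, 0.4])
octamer = cut_seq("ABCDEFGHIJ", p1=5)
```

Returning None for windows that overrun a terminus is one possible convention; padding the octamer with a gap character would be an equally reasonable choice.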

Data & methods

Predictions are derived from a sequence-based probability model; peaks are called with scipy.signal.find_peaks at three prominence thresholds (0.01 / 0.03 / 0.05). IP0.9 boundaries come from the same peak-detection pipeline.
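
A minimal sketch of the peak-calling step, assuming the profile is a one-dimensional array of per-residue probabilities. Only scipy.signal.find_peaks and the three prominence values come from the text above; the function name call_peaks, the toy profile, and the pairing of each threshold with a stringency label (lowest prominence = permissive) are assumptions made for illustration. The IP0.9 boundary calculation is not reproduced because its exact definition is not given here.

```python
import numpy as np
from scipy.signal import find_peaks

# Prominence values are from the text; the label-to-value mapping is assumed.
PROMINENCE = {"permissive": 0.01, "moderate": 0.03, "stringent": 0.05}


def call_peaks(prob, thresholds=PROMINENCE):
    """Call peaks on a per-residue probability profile at each threshold.

    Returns {level: [(position, prominence), ...]} with 1-based positions.
    """
    prob = np.asarray(prob, dtype=float)
    calls = {}
    for level, prominence in thresholds.items():
        idx, props = find_peaks(prob, prominence=prominence)
        calls[level] = [
            (int(i) + 1, float(p)) for i, p in zip(idx, props["prominences"])
        ]
    return calls


# Hypothetical toy profile with three peaks of increasing prominence.
profile = [0.0, 0.02, 0.0, 0.04, 0.0, 0.2, 0.0]
calls = call_peaks(profile)
positions = {level: [pos for pos, _ in hits] for level, hits in calls.items()}
```

With this mapping, every peak called at a stricter level is also called at the looser ones, so the three levels are nested.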

A shed ectodomain is a segment of a membrane protein bounded on both sides by predicted cleavage peaks (or by a free terminus, for N-/C-terminal windows) whose endpoints pass an ectodomain-topology filter: each endpoint must be outside-facing and must not lie within 10 aa of a signal or transit peptide. Domain-database scoring adds two requirements. First, the segment must fully contain at least one annotated domain from the given source. Second, the 90 % inflection-point window of each bounding peak must not overlap any domain in that protein by more than 15 % of its length. In other words, a peak may nibble a domain edge but must not engulf a meaningful portion of any structural unit.
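
The two domain-scoring requirements can be illustrated as follows. This is a sketch under stated assumptions rather than the atlas implementation: intervals are taken to be inclusive 1-based (start, end) tuples, the 15 % limit is interpreted as a fraction of the domain's length (the text could also be read as a fraction of the window's length), and all function names are hypothetical. The topology filter is not covered.

```python
MAX_OVERLAP = 0.15  # from the text: at most 15 % overlap is tolerated


def overlap(a, b):
    """Number of residues shared by two inclusive 1-based intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def window_ok(window, domains, max_frac=MAX_OVERLAP):
    """True if the peak window overlaps no domain by more than `max_frac`.

    Assumption: the fraction is measured against the domain's own length.
    """
    return all(
        overlap(window, dom) <= max_frac * (dom[1] - dom[0] + 1)
        for dom in domains
    )


def domain_supported(segment, windows, domains):
    """Apply both requirements to one candidate segment.

    `windows` holds the IP0.9 windows of the bounding peaks (one or two);
    `domains` holds all annotated domains of the protein from one source.
    """
    contains_domain = any(
        segment[0] <= dom[0] and dom[1] <= segment[1] for dom in domains
    )
    return contains_domain and all(window_ok(w, domains) for w in windows)


# Hypothetical example: one 100-residue domain at positions 100-199.
domains = [(100, 199)]
clean = domain_supported((90, 250), [(85, 95), (245, 255)], domains)
nibble = window_ok((95, 110), domains)   # 11 of 100 residues overlapped
engulf = window_ok((95, 120), domains)   # 21 of 100 residues overlapped
```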

Experimental cleavage annotations are harmonised across complementary sources:

  • UniProt — signal peptidase, propeptide processing, and curated protease notes
  • MEROPS — peptidase–substrate literature
  • TopFIND — cleavage annotations from proteomic studies
  • Schaeffer 2022 — cell-surface N-terminomics
  • Weeks 2021 — subtiligase TM N-terminomics
  • SheddomeDB — previously curated shedding-substrate catalogue
  • HPRD — legacy human proteome reference data
  • PMAP — CutDB cleavage annotations from the Proteolysis MAP
  • Primary papers — individually curated single-study cleavage reports

Each entry retains its original citations; rows from multiple sources that describe the same (protein, position) event are deduplicated for summary statistics but kept separately in the per-protein detail view.
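
The deduplication rule described above can be sketched as a grouping step: rows are keyed by (protein, position), summary statistics count keys, and the detail view keeps every row. The field names, the placeholder identifiers, and the function name are assumptions for illustration and do not reflect the atlas schema.

```python
from collections import defaultdict


def group_events(rows):
    """Group cleavage rows by (protein, position).

    Each key is one deduplicated event for summary statistics; the list
    under each key preserves every source row for the detail view.
    """
    events = defaultdict(list)
    for row in rows:
        events[(row["protein"], row["position"])].append(row)
    return dict(events)


# Hypothetical rows with placeholder identifiers, not real annotations.
rows = [
    {"protein": "PROT_A", "position": 120, "source": "MEROPS"},
    {"protein": "PROT_A", "position": 120, "source": "TopFIND"},
    {"protein": "PROT_A", "position": 245, "source": "UniProt"},
]

events = group_events(rows)
n_unique_events = len(events)                                # summary count
sources_at_120 = [r["source"] for r in events[("PROT_A", 120)]]  # detail view
```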

Contact & citing

For questions, collaborations, or bulk-data access, email [email protected].

Manuscript in preparation. Please cite the website URL (sheddome.secretomeatlas.org) until a preprint is available.