The grading layer

How letters get assigned and updated.

How every peptide letter is computed, when it gets re-evaluated, and who has to sign off. This is the operational doc — use it when a new paper, claim audit, safety signal, or regulatory event might change a published grade.

The full research-audit process that produces the underlying evidence lives on /methodology/research-protocol.

§ 01

The grading unit

Publicly we present grades as peptide × outcome because it reads cleanly. Internally the grading unit is narrower:

Internal grade key

One public grade, five editorial qualifiers

Behind each public label sits a tighter frame that tells us exactly what evidence belongs in the bucket and what does not.

| Field | Editorial Meaning | Why It Matters |
| --- | --- | --- |
| Peptide | Which molecule is actually under review | Not the entire category. One peptide, one evidentiary object. |
| Outcome | The specific promise being scored | Tendon healing, glucose control, sleep quality, fat loss, and so on. |
| Population | Who the evidence is really about | Healthy adults, IBD patients, older adults, rodent tendon-injury model. |
| Route | How the material was given | Oral, intranasal, subcutaneous, local injection. Route changes the story. |
| Time horizon | The window in which the claim is supposed to hold | Acute response, 12-week treatment, 6-month follow-up, long-term maintenance. |

“BPC-157 for tendon healing” is not one evidentiary object if the underlying studies differ materially by population, route, or time window. Injectable BPC-157 in adults with musculoskeletal injury is not the same grade object as oral BPC-157 in a rodent gut-inflammation model.

Operational rules

  • Always score against an InternalGradeKey, even if the public page collapses to a simpler peptide × outcome label.
  • Only collapse multiple internal keys into one public row when evidence is directionally aligned and the collapse does not hide a weaker population, route, or time horizon behind a stronger one.
  • If route, population, or time horizon changes the evidentiary story, maintain separate internal records and state the qualifier on the public page.
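The collapse rule above can be sketched in code. This is a minimal illustration, not the site's implementation: only the name InternalGradeKey and its five fields come from this document. The record shape, the direction tags, and the function names are hypothetical, and the "same letter" test is a deliberately conservative assumption about what "does not hide a weaker one" means.

```typescript
// Hypothetical shapes: only InternalGradeKey and its five fields come from the doc.
type LetterGrade = "A" | "B" | "C" | "D" | "F";
type EvidenceDirection = "supports" | "null" | "harm";

interface InternalGradeKey {
  peptide: string;
  outcome: string;
  population: string;
  route: string;
  timeHorizon: string;
}

interface InternalGradeRecord {
  key: InternalGradeKey;
  letter: LetterGrade;
  direction: EvidenceDirection;
}

// Stable identifier for one evidentiary object.
function gradeKeyId(k: InternalGradeKey): string {
  return [k.peptide, k.outcome, k.population, k.route, k.timeHorizon].join(" × ");
}

// Conservative assumption: collapsing is allowed only when every record shares
// peptide and outcome, points in the same direction, and carries the same letter,
// so no weaker qualifier can hide behind a stronger one.
function canCollapseToPublicRow(records: InternalGradeRecord[]): boolean {
  if (records.length === 0) return false;
  const first = records[0];
  return records.every(
    (r) =>
      r.key.peptide === first.key.peptide &&
      r.key.outcome === first.key.outcome &&
      r.direction === first.direction &&
      r.letter === first.letter
  );
}
```

Under this sketch, an injectable human record and an oral rodent record with different letters would not collapse, so they would stay as separate internal records with the qualifier stated on the public page.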

Watch

Protocols are rated separately. See §16 — we deliberately do NOT borrow the peptide A–F letters for protocols.

§ 02

The letter grades

| Grade | Label | Meaning |
| --- | --- | --- |
| A | Strong | Multiple independent high-quality controlled human trials converge in the same direction. Mechanism is well characterized. Longer-term safety is described in humans. Effect clearly exceeds placebo or comparator. A single positive Phase 2 trial is not enough. |
| B | Promising | At least one well-powered controlled human trial shows clinically meaningful benefit in the relevant internal grading unit. Mechanism is plausible to moderately established. Replication and/or longer-term safety remain limited. |
| C | Mixed / early signal | Some direct human signal exists, but the evidence is incomplete, conflicting, underpowered, uncontrolled, or still leaning heavily on animal/mechanistic support. |
| D | Weak | Evidence is identifiable but weak: animal-only, mechanistic-only, anecdotal, or sparse uncontrolled human signal without convincing controlled confirmation. |
| F | Disproven / unsafe | Decisive human evidence shows no clinically meaningful effect, harm, or unacceptable risk for the studied use. Safety- or efficacy-driven medical regulatory rejection or withdrawal can also support F. |
| Pending | Below threshold | The internal grading unit does not yet have enough peer-reviewed studies to produce a defensible letter. Editorial backfill is in progress. |
| Insufficient | | Cannot yet be meaningfully graded: claim is underspecified, literature is effectively absent, or decisive sources cannot be verified. Use instead of D when we cannot honestly call the evidence weak because we cannot yet inspect the evidence set. |

These thresholds are bands, not formulas. The letter is the editor’s defensible synthesis of the six sub-scores, the hard caps below, and the sign-off rules.

§ 03

Minimum evidence threshold · Pending status

No peptide × outcome receives a letter grade unless the underlying evidence set clears a minimum count of qualifying studies. Below the threshold, the outcome carries Pending until editorial backfill completes.

Note

A letter grade requires at least 3 peer-reviewed studies for the internal grading unit. Fewer than 3 → Pending. At 3+ → eligible for A–F per the rubric.

Rationale: with fewer than 3 studies you cannot assess consistency, and consistency (replication, directional agreement, or deliberate contradiction) is the core evidence-grading judgment. Two studies can agree by chance; three is the smallest count where a pattern becomes visible. The threshold is in the spirit of GRADE and Cochrane practice, which treat consistency across studies as a core input to certainty assessment.

What counts as a qualifying study

  • Peer-reviewed and published in an indexed journal, or a registered clinical trial with posted results. Preprints, conference abstracts without full publication, vendor white papers, and marketing collateral do not count.
  • About the relevant InternalGradeKey — the right peptide, outcome, population, route, and time horizon. A rodent oral study does not count toward a human injectable count.
  • Independent of other counted studies when assessing replication. Follow-on papers from the same lab on the same cohort count as one unless the design is materially different.
  • Not retracted or flagged with unresolved integrity concerns at the time of grading.

Systematic reviews and meta-analyses count as one study each toward the threshold but strengthen the grade through sub-score 02.

Additional quality gates on top of the count

  • A or B still require at least one well-powered controlled human trial. 50 animal studies and zero human trials cannot clear B.
  • F still requires decisive evidence of null effect, harm, or medical regulatory rejection, and must be supported by at least 2 studies showing null/harmful effect — a single trial cannot falsify.
  • Study-type weighting (RCT > cohort > case series > animal > in vitro) is handled by the sub-scores, not the threshold.

Pending vs Insufficient

| Status | Meaning | Expected outcome |
| --- | --- | --- |
| Pending | Fewer than 3 qualifying studies exist and editorial backfill is in progress. Literature may or may not support a graded claim; we have not completed the review. | Becomes a letter (or Insufficient) once backfill completes. |
| Insufficient | Evidence set has been examined and the claim cannot be meaningfully graded: unit underspecified, literature structurally absent, or decisive sources unverifiable. | Remains Insufficient until the underlying problem is resolved. |

Operational rule: when the 3-study threshold is not met, default to Pending if the gap is expected to be fillable through ordinary literature search, and Insufficient if the claim itself is the problem (vague indication, unverifiable sourcing, absent primary literature by design).
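As a rough sketch of how the count and the Pending / Insufficient default could be applied together: the field names, the cohortId grouping, and the claimIsWorkable flag below are assumptions made for illustration. The document defines the rules, not a data model.

```typescript
type ThresholdStatus = "Eligible" | "Pending" | "Insufficient";

// Hypothetical candidate shape; each flag mirrors one qualifying-study rule.
interface StudyCandidate {
  id: string;
  peerReviewedOrRegisteredWithResults: boolean;
  matchesGradeKey: boolean;
  retractedOrFlagged: boolean;
  // Assumed grouping id: papers from the same lab on the same cohort share one value.
  cohortId: string;
  // True when a follow-on paper's design is materially different.
  materiallyDifferentDesign: boolean;
}

const MIN_QUALIFYING_STUDIES = 3;

// Counts studies that pass every gate; repeat papers on one cohort count once
// unless their design is materially different.
function countQualifyingStudies(candidates: StudyCandidate[]): number {
  const seenCohorts: Record<string, boolean> = {};
  let count = 0;
  for (const s of candidates) {
    if (!s.peerReviewedOrRegisteredWithResults) continue;
    if (!s.matchesGradeKey) continue;
    if (s.retractedOrFlagged) continue;
    if (seenCohorts[s.cohortId] && !s.materiallyDifferentDesign) continue;
    seenCohorts[s.cohortId] = true;
    count += 1;
  }
  return count;
}

// Insufficient when the claim itself is the problem; otherwise Pending below
// the threshold and eligible for an A–F letter at or above it.
function thresholdStatus(candidates: StudyCandidate[], claimIsWorkable: boolean): ThresholdStatus {
  if (!claimIsWorkable) return "Insufficient";
  return countQualifyingStudies(candidates) >= MIN_QUALIFYING_STUDIES ? "Eligible" : "Pending";
}
```

"Eligible" here only means the rubric may be run; the quality gates (for example, a controlled human trial for A or B) still apply afterwards.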

Promotion out of Pending

  • Threshold met + rubric supports a letter → assign the letter, update lastUpdated, log a grade-history entry with fromGrade: “Pending”.
  • Threshold met but rubric caps low (e.g. qualifying studies are all animal-only for a human-outcome claim and sub-score 03 is capped at 1) → assign the letter grade the rubric supports; use Insufficient only if the grading unit itself is still unworkable.
  • Editorial review confirms literature is structurally absent → convert Pending → Insufficient with a short rationale.

Promotions out of Pending follow the same sign-off rules as any other grade assignment (see §12). The initial Pending → letter transition is treated as a one-letter change for sign-off purposes.

Why Pending and not a lower letter

The tempting shortcut is to call a 0–2-study outcome a D (“weak evidence”). The problem: D is a grading judgment, and grading requires the evidence set to be large enough to judge. Calling something D on 1 study claims more confidence in the negative than the data supports — it says “we looked and the evidence is weak,” when the honest statement is “we have not completed the review.” Pending forces that honesty.

§ 04

Bridge from claim audits to grade reassessment

Claim audits and peptide grades are related but not identical. A claim audit evaluates a public assertion using the research protocol. The grading layer only acts on structured outputs from that process, not on free-form prose.

Every claim audit that touches a peptide grade must emit one or more EvidenceConclusion objects:

Claim-audit handoff

What the grading team needs from an audit

Think of this as the handoff sheet between the claim-review process and the grading layer: exact target, exact verdict, exact implication.

| Field | Editorial Meaning | Why It Matters |
| --- | --- | --- |
| Target | Peptide, outcome, and internal grading unit | The handoff has to name the exact grade object being touched. |
| Sub-claim | The precise assertion that was tested | Not a vibe, not a paragraph. One checkable claim. |
| Status | Validated, contested, unvalidated, overstated, falsified, withdrawn, dependent, or speculative | This is the audit verdict on the claim itself. |
| Confidence | High, moderate, or low | How hard the audit is willing to lean on the conclusion. |
| Affected sub-scores | Mechanism, human studies, effect vs placebo, long-term safety, side effects, regulatory | Only the touched parts of the rubric should move first. |
| Grade impact | None, possible upgrade, possible downgrade, or mandatory reassessment | The audit can force reconsideration, but it never edits the letter directly. |
| Decisive evidence | Named study cards and confidence caps | The grade editor should be able to trace exactly what carried the decision. |
| Rationale | A short editorial explanation | Plain-language reasoning that survives outside the workflow. |

Operational rules

  • A claim audit never changes a letter grade directly. It produces EvidenceConclusion records that may trigger reassessment.
  • OVERSTATED / FALSIFIED / WITHDRAWN normally imply gradeImpact = “mandatory_reassessment” when they touch a central sub-claim or decisive cited source.
  • CONTESTED implies mandatory_reassessment when comparable-quality direct support and contradiction exist for the same internal grading unit.
  • VALIDATED may justify possible_upgrade, but never auto-upgrades. The rubric still has to be re-run.
  • DEPENDENT / SPECULATIVE usually imply none, unless the peptide page is currently presenting them as direct support. In that case the grade may stay unchanged while the page wording is corrected.
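One way the handoff record and the default impact rules might look in code. Only EvidenceConclusion, gradeImpact, and the values "mandatory_reassessment", possible_upgrade, and none appear in this document; every other field name, the "possible_downgrade" spelling, and the fallback chosen for non-central findings and for UNVALIDATED are assumptions for illustration.

```typescript
type AuditStatus =
  | "VALIDATED" | "CONTESTED" | "UNVALIDATED" | "OVERSTATED"
  | "FALSIFIED" | "WITHDRAWN" | "DEPENDENT" | "SPECULATIVE";

type GradeImpact = "none" | "possible_upgrade" | "possible_downgrade" | "mandatory_reassessment";

// Hypothetical record shape mirroring the handoff fields listed above.
interface EvidenceConclusion {
  targetKeyId: string;
  subClaim: string;
  status: AuditStatus;
  confidence: "high" | "moderate" | "low";
  affectedSubScores: string[];
  gradeImpact: GradeImpact;
  decisiveEvidence: string[];
  rationale: string;
}

interface ImpactContext {
  // The finding touches a central sub-claim or a decisive cited source.
  touchesCentralClaimOrDecisiveSource: boolean;
  // Comparable-quality direct support and contradiction exist for the same unit.
  comparableSupportAndContradiction: boolean;
}

// Suggests a starting gradeImpact; an editor can still override it.
function defaultGradeImpact(status: AuditStatus, ctx: ImpactContext): GradeImpact {
  if (status === "OVERSTATED" || status === "FALSIFIED" || status === "WITHDRAWN") {
    // Assumed fallback when the finding is not central: flag a possible downgrade.
    return ctx.touchesCentralClaimOrDecisiveSource ? "mandatory_reassessment" : "possible_downgrade";
  }
  if (status === "CONTESTED") {
    return ctx.comparableSupportAndContradiction ? "mandatory_reassessment" : "possible_downgrade";
  }
  // VALIDATED never auto-upgrades: the rubric still has to be re-run.
  if (status === "VALIDATED") return "possible_upgrade";
  // DEPENDENT, SPECULATIVE, and (by assumption) UNVALIDATED default to none.
  return "none";
}
```

Note that the function returns an impact level only. Nothing in it edits a letter, which matches the rule that audits trigger reassessment rather than change grades.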

§ 05

The six sub-scores

Every grade rolls up six weighted sub-scores, each rated 1–5 with a written justification visible on the peptide page.

| # | Sub-score | What it measures | Weight |
| --- | --- | --- | --- |
| 01 | Mechanism understood | Do we know how the molecule produces the claimed effect at a molecular and physiological level for the internal grading unit? "Plausible" is not the same as "demonstrated." | High |
| 02 | Human studies | How many controlled studies in humans exist for the relevant population, route, and time horizon? RCT vs observational vs case report, plus sample size and power. | Highest |
| 03 | Effect vs placebo | The controlled human effect signal. Tracks placebo- or comparator-adjusted human outcomes, not animal sham comparisons standing in for human efficacy. | Highest |
| 04 | Long-term safety | What is the longest published human exposure and follow-up window for the internal grading unit? Is there post-marketing or registry surveillance? | Medium |
| 05 | Side effect profile | Observed adverse events and tolerability, capped by how certain we are. A clean signal from tiny human exposure is not a well-characterized safety profile. | Medium |
| 06 | Regulatory status | Medical-review and safety-regulatory context for the studied use. Low-weight and must not conflate medical approval, safety warnings, legal availability, and sports prohibition. | Low |

The first three carry the most weight. Efficacy and human directness are what the grade is fundamentally about. Safety and regulatory context matter, but a perfectly safe molecule with no demonstrated effect still earns a low grade.

§ 06

Scoring scale per sub-score

| Score | Generic interpretation |
| --- | --- |
| 5 | Best-in-class evidence. Multiple high-quality, replicated, recent, directly relevant. |
| 4 | Strong. One pivotal trial or substantial consistent data. |
| 3 | Moderate. Reasonable evidence with notable gaps or limits. |
| 2 | Weak. Sparse, indirect, low-quality, or poorly replicated evidence. |
| 1 | Effectively absent, decisively negative, or too compromised to support the claim. |

§ 07

Hard caps & edge-case rules

These rules are not optional. They exist to stop the most common grading inflation errors.

Sub-score 03 · Effect vs placebo

  • 03 ≥ 3 requires controlled human outcome data in the relevant InternalGradeKey.
  • 03 = 2 is the ceiling when some direct human outcome signal exists, but it is uncontrolled, retrospective, open-label, or otherwise not comparator-based.
  • 03 = 1 is the default when no direct controlled human efficacy evidence exists for the relevant grading unit.
  • Animal-versus-sham findings may strengthen sub-score 01 and narrative context, but do not raise sub-score 03 above what human evidence allows.
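The sub-score 03 caps reduce to a ceiling lookup. A minimal sketch, assuming a hypothetical three-way tag for the best available human efficacy evidence; animal data is intentionally not an input.

```typescript
// Hypothetical tag for the strongest direct human efficacy evidence
// available in the relevant InternalGradeKey.
type HumanEfficacyEvidence = "controlled" | "uncontrolled" | "none";

// Clamps an editor's proposed 03 score to the ceiling the human evidence allows.
// Animal-versus-sham findings are not a parameter, so they cannot raise the result.
function capEffectVsPlacebo(proposed: number, evidence: HumanEfficacyEvidence): number {
  const ceiling = evidence === "controlled" ? 5 : evidence === "uncontrolled" ? 2 : 1;
  return Math.max(1, Math.min(proposed, ceiling));
}
```

For example, capEffectVsPlacebo(4, "uncontrolled") returns 2, and capEffectVsPlacebo(4, "none") returns 1.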

Sub-score 05 · Side effect profile

  • Cumulative direct human exposure under 50 people, or follow-up under 30 days → 05 cannot exceed 3.
  • No direct human exposure data, safety case mostly animal toxicology or mechanistic inference → 05 cannot exceed 2.
  • “No adverse events reported” in a tiny pilot is an early tolerability signal, not a mature safety profile.
  • Any credible serious adverse event signal, unresolved integrity problem, or major uncertainty about the administered material caps the score at 1–2 pending review.
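The sub-score 05 caps can be sketched the same way. The input shape and field names are assumptions; the thresholds (50 people, 30 days) and ceilings come from the rules above.

```typescript
// Hypothetical summary of the direct human safety evidence for one grading unit.
interface SafetyEvidence {
  cumulativeHumansExposed: number;
  longestFollowUpDays: number;
  // Credible serious adverse event signal, unresolved integrity problem,
  // or major uncertainty about the administered material.
  unresolvedSafetyConcern: boolean;
}

// Clamps a proposed 05 score to the tightest applicable ceiling.
function capSideEffectProfile(proposed: number, e: SafetyEvidence): number {
  let ceiling = 5;
  // Tiny exposure or short follow-up: cannot exceed 3.
  if (e.cumulativeHumansExposed < 50 || e.longestFollowUpDays < 30) ceiling = 3;
  // No direct human exposure at all: cannot exceed 2.
  if (e.cumulativeHumansExposed === 0) ceiling = 2;
  // Serious signal or integrity concern: capped at 1–2 pending review.
  if (e.unresolvedSafetyConcern) ceiling = Math.min(ceiling, 2);
  return Math.max(1, Math.min(proposed, ceiling));
}
```

The function returns the upper bound of the 1–2 band for an unresolved concern; choosing 1 instead remains an editorial call.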

Sub-score 06 · Regulatory status

Track these dimensions separately in page notes and internal data:

  • medicalApprovalStatus: approval, non-approval, or review status for the studied use
  • safetyWarningStatus: warning letters, withdrawals, safety alerts, refusals
  • availabilityStatus: compounding, legal-access, or supply-chain status
  • sportsProhibitedStatus: WADA or other sports-governing-body status

Scoring rules for sub-score 06:

  • Sub-score 06 primarily reflects medical-review and safety-regulatory context.
  • WADA status is compliance information for athletes. It does not imply efficacy or clinical danger by itself.
  • Legal availability or compounding restrictions alone do not push a grade to F.
  • Only safety- or efficacy-driven medical regulatory action can materially support a downgrade toward F.
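Keeping the four dimensions as separate fields makes the "do not conflate" rule hard to break by accident. In the sketch below the four field names come from this section; the allowed values, the extra actionDrivenBySafetyOrEfficacy flag, and the function name are hypothetical.

```typescript
// Field names follow the four tracked dimensions; the value sets are assumed.
interface RegulatoryContext {
  medicalApprovalStatus: "approved" | "not_approved" | "under_review" | "rejected" | "withdrawn";
  safetyWarningStatus: "none" | "warning" | "withdrawal" | "refusal";
  availabilityStatus: "available" | "restricted";
  sportsProhibitedStatus: "prohibited" | "not_prohibited";
  // Hypothetical flag: the medical action was grounded in safety or efficacy.
  actionDrivenBySafetyOrEfficacy: boolean;
}

// True only for safety- or efficacy-driven medical regulatory action.
// availabilityStatus and sportsProhibitedStatus are deliberately never read.
function canSupportDowngradeTowardF(r: RegulatoryContext): boolean {
  const medicalAction =
    r.medicalApprovalStatus === "rejected" ||
    r.medicalApprovalStatus === "withdrawn" ||
    r.safetyWarningStatus !== "none";
  return medicalAction && r.actionDrivenBySafetyOrEfficacy;
}
```

A compound that is WADA-prohibited and supply-restricted but has no medical regulatory action against it would return false here.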

§ 08

From sub-scores to letter

There is no rigid formula. The editorial heuristic:

| Letter | Requirements |
| --- | --- |
| A | At least two independent high-quality controlled human trials, or one pivotal trial plus an independent confirmatory controlled study of comparable directness. Sub-scores 02 and 03 both ≥ 4, 01 ≥ 4, 04 and 05 both ≥ 3. A single positive Phase 2 trial cannot produce A. |
| B | At least one well-powered controlled human trial showing clinically meaningful benefit in the relevant grading unit. Sub-scores 02 and 03 both ≥ 3, 01 ≥ 3. |
| C | Default when some direct human signal exists but meaningful gaps remain, or when strong indirect evidence still carries too much of the case. Common C profile: 01 ≥ 3, 02 = 2–3, 03 = 1–2. |
| D | Evidence is weak but identifiable: animal-only, mechanistic-only, anecdotal, or sparse uncontrolled human evidence. Requires ≥ 3 qualifying studies (not Pending). |
| F | Actively negative human evidence, unacceptable risk, or safety-/efficacy-driven medical regulatory rejection or withdrawal for the studied use. Sports prohibition or legal-access constraints alone are not enough. |
| Insufficient | Internal grading unit cannot be specified cleanly, literature is effectively absent, or decisive sources cannot be verified. |

Note

When in doubt, grade DOWN rather than up. Credibility is built on under-promising.
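Because there is no rigid formula, code can at most check the numeric floors for A and B and leave everything below that to the editor. The sketch below does only that; the type names, field names, and the "C_or_lower" return value are assumptions, and F, Pending, and Insufficient decisions are out of scope.

```typescript
// Hypothetical container for the six sub-scores, each 1–5.
interface SubScoreSet {
  s01: number; s02: number; s03: number; s04: number; s05: number; s06: number;
}

interface ControlledTrialEvidence {
  // Independent high-quality controlled human trials; a pivotal trial plus an
  // independent confirmatory controlled study counts as two.
  independentControlledTrials: number;
  // At least one well-powered controlled trial with clinically meaningful benefit.
  hasWellPoweredPositiveTrial: boolean;
}

// Returns the highest band whose stated floors are met. It is a ceiling check,
// not a grade: the editor may still land lower.
function letterCeiling(s: SubScoreSet, t: ControlledTrialEvidence): "A" | "B" | "C_or_lower" {
  const meetsA =
    t.independentControlledTrials >= 2 &&
    s.s02 >= 4 && s.s03 >= 4 && s.s01 >= 4 && s.s04 >= 3 && s.s05 >= 3;
  if (meetsA) return "A";
  const meetsB = t.hasWellPoweredPositiveTrial && s.s02 >= 3 && s.s03 >= 3 && s.s01 >= 3;
  if (meetsB) return "B";
  return "C_or_lower";
}
```

Applied to the sub-scores 4 / 2 / 1 / 2 / 2 / 1 with no controlled human trial, the check returns "C_or_lower", which is consistent with grading down when in doubt.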

§ 09

Pending vs Insufficient vs D

This boundary must be applied consistently:

  1. D

    We looked, and what exists is weak

    At least 3 qualifying studies and the rubric supports a weak-evidence letter. Animal data, mechanistic papers, uncontrolled human case series, or anecdotal evidence that is inspectable and clearly limited.

  2. Pending

    We have not finished looking

    Evidence set has fewer than 3 qualifying studies and editorial backfill is reasonably expected to close the gap.

  3. Insufficient

    Evidence set is structurally unworkable

    Target cannot yet be meaningfully assessed — claim underspecified, route/population/time horizon unclear, literature effectively absent by design, or decisive sources unverifiable.

§ 10

Reassessment triggers

An internal peptide grade is re-evaluated when ANY of the following occurs:

| Trigger | Detection | SLA |
| --- | --- | --- |
| New peer-reviewed RCT for that internal grade key | PubMed alerts on peptide name + outcome + route/population terms | Within 30 days |
| New systematic review or meta-analysis | PubMed alerts | Within 30 days |
| Claim audit emits an EvidenceConclusion with gradeImpact ≠ "none" | Claim-review workflow handoff | Within 7 days |
| Retraction of any cited paper | Retraction Watch + manual quarterly check | Same day |
| Major medical regulatory action (approval, refusal, withdrawal, warning, safety-grounded compounding action) | Regulatory bulletins + monthly check | Same day |
| Sports-prohibited status change | WADA or governing-body bulletins | Within 30 days (immediate for athlete-safety copy) |
| PubPeer integrity flag on a cited paper | PubPeer monitoring | Within 7 days |
| Need to split or narrow the internal grade key (route, population, or time horizon change) | Editorial review or claim audit handoff | Within 14 days |
| Quarterly housekeeping audit (no specific trigger) | Calendar | At least every 90 days |
| Reader-submitted evidence challenge via corrections@peptigrade.io | Inbox | Acknowledged in 7 days · resolved in 30 |
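The SLA column lends itself to a lookup table plus a due-date helper. The trigger tags and function name below are hypothetical; the day counts are taken from the table, with "same day" modeled as 0 and the reader challenge using its 30-day resolution window. The quarterly housekeeping audit is a cadence rather than a trigger, so it is left out.

```typescript
// Hypothetical tags for the event-driven triggers.
type TriggerKind =
  | "new_rct" | "new_review" | "claim_audit_handoff" | "retraction"
  | "regulatory_action" | "sports_status_change" | "pubpeer_flag"
  | "key_split" | "reader_challenge";

// Days allowed from detection to completed reassessment.
const SLA_DAYS: Record<TriggerKind, number> = {
  new_rct: 30,
  new_review: 30,
  claim_audit_handoff: 7,
  retraction: 0,            // same day
  regulatory_action: 0,     // same day
  sports_status_change: 30, // athlete-safety copy is updated immediately
  pubpeer_flag: 7,
  key_split: 14,
  reader_challenge: 30,     // resolution; acknowledgement is due in 7 days
};

// Takes a detection date as YYYY-MM-DD and returns the due date in the same form.
function reassessmentDueDate(kind: TriggerKind, detectedOn: string): string {
  const d = new Date(detectedOn + "T00:00:00Z");
  d.setUTCDate(d.getUTCDate() + SLA_DAYS[kind]);
  return d.toISOString().slice(0, 10);
}
```

For instance, reassessmentDueDate("claim_audit_handoff", "2025-01-15") returns "2025-01-22". Working in UTC avoids off-by-one dates around daylight-saving changes.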

§ 11

The reassessment workflow

When a trigger fires, walk through this:

  1. Confirm the trigger is in scope

    Read the new evidence, claim-audit handoff, retraction notice, or regulatory action. Confirm it concerns the peptide, outcome, population, route, and time horizon under consideration. A new oral rodent study does not automatically change an injectable human grade.

  2. Import the EvidenceConclusion if the trigger came from a claim audit

    Do not translate prose by hand when a structured claim-audit output exists. Identify the sub-claim, affected internal grade key, touched sub-scores, and whether the outcome is wording-only, possible band move, or mandatory reassessment. If fields are missing, send it back for completion before changing the grade.

  3. Re-score only the affected sub-scores, applying the hard caps

    Identify which sub-score(s) the new evidence actually touches. Re-score only those first. Apply the caps in §07; do not let narrative enthusiasm override them. Write updated justifications in plain language. Examples: a placebo-controlled human RCT typically touches 02 and 03; longer follow-up touches 04 and maybe 05; a safety warning affects 05 and 06.

  4. Re-roll the letter grade

    Apply the heuristic in §08. The letter may change up, down, or stay the same. A single positive Phase 2 trial can plausibly move C → B if well-powered and clinically meaningful; it does not move C → A on its own. If the internal grade key needs to be split by route/population/time horizon first, do that, then re-roll each resulting grade separately.

  5. Determine whether the publication packet changes

    Four common outcomes:

      • Wording correction only: update copy, citations, lastReviewed; no grade-history entry.
      • Grade unchanged with updated sub-scores: update notes and lastReviewed; no editorial note required.
      • Grade up or down by one letter: add Grade history entry; editor + second-editor review required.
      • Grade change of two+ letters or any move to/from A or F: Grade history entry, full editorial-board review, publish a CHANGE NOTE.

  6. Update related artifacts

    Grade changes ripple. Update the peptide's topGrade if this was the top outcome and the grade moved. Update any /protocols/[slug] page that includes the peptide as a component. Update any /claims/[slug] page that depends on the affected sub-claim. Update the home-page carousel and featured-peptides section if featured. Regenerate sitemap.xml on the next build.

  7. Editorial sign-off

      • Wording correction only → author editor.
      • No grade change with sub-score updates → author editor.
      • One-letter change → author editor + one second editor.
      • Two+ letter change or any move to/from A or F → author editor + second editor + clinical advisor.
      • Retraction-, integrity-, or safety-driven change → same chain plus same-day publication once verified.
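The sign-off chain is mechanical enough to sketch as a routing function. Role tags and names are hypothetical. Two readings are assumed here: letter distance is measured on the A, B, C, D, F order, and a Pending → letter promotion counts as a one-letter change unless it lands on A or F, in which case the stricter chain applies.

```typescript
type PublishedGrade = "A" | "B" | "C" | "D" | "F";
type SignOffRole = "author_editor" | "second_editor" | "clinical_advisor";

const GRADE_ORDER: PublishedGrade[] = ["A", "B", "C", "D", "F"];

// Returns the roles that must sign off on a proposed letter.
function requiredSignOff(from: PublishedGrade | "Pending", to: PublishedGrade): SignOffRole[] {
  // Wording-only fixes and sub-score refreshes leave the letter unchanged.
  if (from === to) return ["author_editor"];

  const touchesAorF = from === "A" || from === "F" || to === "A" || to === "F";

  let steps: number;
  if (from === "Pending") {
    steps = 1; // treated as a one-letter change for sign-off purposes
  } else {
    steps = Math.abs(GRADE_ORDER.indexOf(from) - GRADE_ORDER.indexOf(to));
  }

  if (touchesAorF || steps >= 2) {
    return ["author_editor", "second_editor", "clinical_advisor"];
  }
  return ["author_editor", "second_editor"];
}
```

So requiredSignOff("C", "B") yields the two-editor chain, while requiredSignOff("D", "F") yields all three roles even though it is a single step, because it moves into F.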

§ 12

Ownership & authority

The grading layer needs clear authority boundaries:

| Decision | Owner |
| --- | --- |
| Routine sub-score refresh with no letter change | Primary evidence editor for that peptide |
| Any letter-grade movement | Primary evidence editor proposes · second editor approves |
| Any move to/from A or F, or any safety-/integrity-driven downgrade | Primary evidence editor + second editor + clinical advisor |
| Override of a claim-audit arbitrator, or override of gradeImpact = "mandatory_reassessment" | Editorial lead + clinical advisor · rationale logged in Grade history |
| Publication of a medical- or safety-related grade change | Editorial lead owns release and timing |
| Quarterly stale-review sweep | Managing editor or designated evidence-ops owner |

Note

No single person should author the triggering claim audit, arbitrate the claim audit, and approve the resulting grade change alone.
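A separation-of-duties check like this could run before a grade change is published. The record shape and names are hypothetical; the rule encoded is the narrow one stated in the note, that one person must not hold all three roles alone.

```typescript
// Hypothetical record of who did what on one grade change.
interface ChangeActors {
  auditAuthor: string;
  auditArbitrator: string;
  approvers: string[];
}

// True when a single person authored the audit, arbitrated it, and is also
// the only approver of the resulting grade change.
function violatesSeparationOfDuties(a: ChangeActors): boolean {
  const samePersonRanAudit = a.auditAuthor === a.auditArbitrator;
  const soleApproverIsAuthor =
    a.approvers.length > 0 && a.approvers.every((p) => p === a.auditAuthor);
  return samePersonRanAudit && soleApproverIsAuthor;
}
```

Adding any second approver clears the check, which matches the table above where every letter movement already requires a second editor.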

§ 13

Versioning & audit trail

Every grade change must leave a trail. The pattern:

Grade history entry

What a clean change log should capture

Not software for software's sake. Just the minimum record needed so a future reader can understand what changed, why it changed, and who approved it.

| Field | Editorial Meaning | Why It Matters |
| --- | --- | --- |
| When | The date the grade changed | An ISO-stamped moment in the public record. |
| What moved | The exact internal grading unit | Not just the peptide page broadly, but the specific row that changed. |
| From → to | The prior letter and the new letter | Readers should be able to see direction, not just the latest state. |
| Why now | The trigger: paper, regulatory event, or claim-audit run | Every shift needs a traceable cause. |
| What changed underneath | The affected sub-scores and linked audit outputs | The movement should be reconstructible, not just asserted. |
| Editorial rationale | A one- or two-sentence explanation of the move | Short enough to scan, precise enough to defend. |
| Approval trail | Who signed off | Especially important for safety-led downgrades or moves into A or F. |

We do not yet store this in the codebase — currently each peptide outcome has only lastUpdated. Next step: extend the OutcomeGrade type with an optional history: GradeHistoryEntry[] field, render the history on the peptide page as a small Grade history panel under the grade matrix, and surface it in the Drug JSON-LD.
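A possible shape for that extension, offered as a sketch only. OutcomeGrade, GradeHistoryEntry, history, lastUpdated, and fromGrade are named in this document; every other field name, the GradeValue union, and the applyGradeChange helper are assumptions, since the existing OutcomeGrade definition is not shown here.

```typescript
type GradeValue = "A" | "B" | "C" | "D" | "F" | "Pending" | "Insufficient";

// One row of the change log; fields follow the grade-history table above.
interface GradeHistoryEntry {
  changedAt: string;           // ISO 8601 date
  gradeKeyId: string;          // the exact internal grading unit
  fromGrade: GradeValue;
  toGrade: GradeValue;
  trigger: string;             // paper, regulatory event, or claim-audit run
  affectedSubScores: string[];
  rationale: string;           // one or two sentences
  approvedBy: string[];
}

// Assumed minimal view of the existing type plus the optional history field.
interface OutcomeGrade {
  outcome: string;
  grade: GradeValue;
  lastUpdated: string;
  history?: GradeHistoryEntry[];
}

// Returns a new OutcomeGrade with the entry appended; the input is not mutated.
function applyGradeChange(current: OutcomeGrade, entry: GradeHistoryEntry): OutcomeGrade {
  if (entry.fromGrade !== current.grade) {
    throw new Error("fromGrade does not match the current grade");
  }
  const previous = current.history ? current.history : [];
  return {
    outcome: current.outcome,
    grade: entry.toGrade,
    lastUpdated: entry.changedAt,
    history: previous.concat([entry]),
  };
}
```

Keeping history optional means existing outcomes that carry only lastUpdated remain valid, and the fromGrade guard catches entries written against a stale letter.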

§ 14

Errata vs reassessment vs claim audit

Three distinct operations, three different workflows:

| Operation | When | Workflow |
| --- | --- | --- |
| Errata | A factual error is found on a published page: wrong PMID, dose number, author name, label text. | Fix the error, append a dated correction note, mention in the next weekly dispatch. No grade change. Owner: any editor. |
| Reassessment | New evidence or a structured claim-audit handoff triggers re-evaluation of a published internal peptide grade. | This document, Steps 1–7. Owner: editor responsible for the peptide. |
| Claim audit | A popular framing or specific assertion needs adjudication. | New /claims/[slug] page following the research protocol. If grade-relevant, it emits EvidenceConclusion objects. Owner: editor + clinical advisor for the relevant peptide. |

§ 15

What never raises a grade

These never count as supporting evidence regardless of how persuasive they sound:

  • Influencer testimonials
  • Vendor marketing copy
  • Forum or social-media anecdotes
  • Vendor-funded white papers without peer review
  • “It worked for me” reports
  • Mechanistic plausibility absent direct evidence
  • Review articles cited as if they were decisive primary efficacy evidence

These can be cited as context but they cannot move sub-scores 02, 03, 04, or 05 in a positive direction.
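If cited sources carry a kind tag, the exclusion list can be enforced before any positive re-score of 02 through 05. The tags and function names below are hypothetical; the categories mirror the list above.

```typescript
// Hypothetical source-kind tags, one per excluded category.
const CONTEXT_ONLY_KINDS: string[] = [
  "influencer_testimonial",
  "vendor_marketing",
  "forum_anecdote",
  "vendor_white_paper",
  "personal_report",
  "mechanistic_plausibility_only",
  "review_cited_as_primary",
];

interface CitedSource {
  label: string;
  kind: string;
}

// Context-only sources may be cited but cannot move a sub-score upward.
function isContextOnly(source: CitedSource): boolean {
  return CONTEXT_ONLY_KINDS.indexOf(source.kind) !== -1;
}

// A positive move on 02–05 needs at least one source outside the excluded list.
// Passing this check is necessary, not sufficient: the rubric still decides.
function hasGradeEligibleSource(sources: CitedSource[]): boolean {
  return sources.some((s) => !isContextOnly(s));
}
```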

§ 16

Protocol-level evidence labels

A multi-compound protocol is not a peptide × outcome pair. Borrowing the A–F peptide letters for a protocol would imply direct regimen-level human-outcome evidence that almost never exists.

Protocols use a distinct label set (see /protocols):

| Label | Meaning |
| --- | --- |
| Exploratory synthesis | Combination of research-backed hypotheses; the regimen as assembled has not been tested as a unit in a controlled human trial. |
| Mechanistically plausible | Components act on characterized pathways; protocol-level human outcome evidence is absent. |
| Emerging clinical evidence | Some protocol-level human outcome data exists, but it is under-powered or preliminary. |
| Established protocol | Validated protocol-level human RCT evidence. |

Watch

Compound-level peptide grades and the protocol evidence label are scored separately. A protocol may contain one C-grade peptide, two D-grade peptides, and still carry an “Exploratory synthesis” label because the full stack has not been tested together. Do not aggregate letter grades across compounds into a protocol “top grade.”
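To make the separation concrete, a protocol label can be derived from protocol-level evidence only, with no component letter grades in the signature at all. The input shape and names are hypothetical, and the precedence between the two lowest labels is an assumption, since both describe protocols without regimen-level human data.

```typescript
type ProtocolEvidenceLabel =
  | "Exploratory synthesis"
  | "Mechanistically plausible"
  | "Emerging clinical evidence"
  | "Established protocol";

// Hypothetical summary of evidence about the regimen as a whole.
// Component peptide grades are intentionally absent from this shape.
interface ProtocolEvidence {
  validatedProtocolLevelRct: boolean;
  anyProtocolLevelHumanOutcomeData: boolean;
  componentsActOnCharacterizedPathways: boolean;
}

function protocolLabel(e: ProtocolEvidence): ProtocolEvidenceLabel {
  if (e.validatedProtocolLevelRct) return "Established protocol";
  if (e.anyProtocolLevelHumanOutcomeData) return "Emerging clinical evidence";
  // Assumed precedence: characterized pathways earn the second label,
  // otherwise the protocol stays an exploratory synthesis.
  if (e.componentsActOnCharacterizedPathways) return "Mechanistically plausible";
  return "Exploratory synthesis";
}
```

Because the function cannot see component letters, there is no path by which a strong peptide grade inflates the label of an untested stack.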

§ 17

Worked example: BPC-157 × tendon healing

Public label: BPC-157 × tendon healing

Internal grade key, simplified: BPC-157 × tendon healing × adults with musculoskeletal injury × injectable/systemic exposure × acute/subacute healing window

Current state: Grade C, sub-scores 4 / 2 / 1 / 2 / 2 / 1, top outcome.

| Sub-score | Value | Justification |
| --- | --- | --- |
| 01 Mechanism | 4 | FAK-paxillin, VEGFR2-Akt-eNOS, and GH-receptor-related pathways are characterized in animal and in-vitro systems, but human PK and a confirmed receptor story remain absent. |
| 02 Human studies | 2 | Sparse direct human data exists, but no well-powered controlled human trial for this grading unit. |
| 03 Effect vs placebo | 1 | No placebo- or comparator-controlled human efficacy trial exists for the relevant grading unit. Animal-versus-sham consistency helps context, not this score. |
| 04 Long-term safety | 2 | Human exposure windows are short, with no robust longer-term follow-up. |
| 05 Side effect profile | 2 | Small human exposure and short follow-up mean the observed tolerability signal is still low-certainty, even if animal toxicology looks favorable. |
| 06 Regulatory status | 1 | Medical approval is absent. Safety/access/sports notes are tracked separately, and WADA status does not by itself imply inefficacy or danger. |

Why C and not B

The B rubric requires at least one well-powered controlled human trial plus sub-scores 02 and 03 both ≥ 3. BPC-157 × tendon healing does not meet that bar. Strong and replicated animal evidence plus sparse uncontrolled human signal is a canonical C, not a B.

What would move it to B

A positive, well-powered controlled Phase 2 or Phase 3 human trial showing clinically meaningful benefit in the relevant grading unit could plausibly move 02 to 3–4 and 03 to 3, producing a likely C → B reassessment if safety remains acceptable.

What would move it to A

At least one additional independent high-quality controlled human trial pointing the same way, plus stronger longer-term human safety data. A single positive Phase 2 trial would not be enough.

What would move it down

A decisive high-quality null RCT, a credible serious safety signal, or a safety-/efficacy-driven medical regulatory action could force reassessment toward D or F depending on how conclusive the new evidence is.