The grading layer

How letters get assigned and updated.

How every peptide letter is computed, when it gets re-evaluated, and who has to sign off. This is the operational doc — use it when a new paper, claim audit, safety signal, or regulatory event might change a published grade.

The full research-audit process that produces the underlying evidence lives on /methodology/research-protocol.

§ 01

The grading unit

Publicly we present grades as peptide × outcome because it reads cleanly. Internally the grading unit is narrower:

Internal grade key

One public grade, five editorial qualifiers

Behind each public label sits a tighter frame that tells us exactly what evidence belongs in the bucket and what does not.

Field	Editorial Meaning	Why It Matters
Peptide	Which molecule is actually under review	Not the entire category. One peptide, one evidentiary object.
Outcome	The specific promise being scored	Tendon healing, glucose control, sleep quality, fat loss, and so on.
Population	Who the evidence is really about	Healthy adults, IBD patients, older adults, rodent tendon-injury model.
Route	How the material was given	Oral, intranasal, subcutaneous, local injection. Route changes the story.
Time horizon	The window in which the claim is supposed to hold	Acute response, 12-week treatment, 6-month follow-up, long-term maintenance.

“BPC-157 for tendon healing” is not one evidentiary object if the underlying studies differ materially by population, route, or time window. Injectable BPC-157 in adults with musculoskeletal injury is not the same grade object as oral BPC-157 in a rodent gut-inflammation model.

Operational rules

Always score against an InternalGradeKey, even if the public page collapses to a simpler peptide × outcome label.
Only collapse multiple internal keys into one public row when evidence is directionally aligned and the collapse does not hide a weaker population, route, or time horizon behind a stronger one.
If route, population, or time horizon changes the evidentiary story, maintain separate internal records and state the qualifier on the public page.
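
As a rough illustration of the rules above, here is a minimal TypeScript sketch. The field names, the `direction` and `grade` fields, and the `publicLabel` and `canCollapse` helpers are assumptions made for this example, not the production schema. Reading "does not hide a weaker unit" as "every collapsed record carries the same grade" is a deliberately strict simplification.

```typescript
// Hypothetical shape for the internal grading unit described in the table above.
interface InternalGradeKey {
  peptide: string;
  outcome: string;
  population: string;
  route: string;
  timeHorizon: string;
}

// Assumed record: one scored internal key plus the direction of its evidence.
interface InternalGradeRecord {
  key: InternalGradeKey;
  grade: "A" | "B" | "C" | "D" | "F" | "Pending" | "Insufficient";
  direction: "supports" | "null" | "harm";
}

// The public label keeps only peptide and outcome.
function publicLabel(key: InternalGradeKey): string {
  return `${key.peptide} × ${key.outcome}`;
}

// Collapse only when every record shares the public label, points the same way,
// and carries the same grade, so no weaker unit hides behind a stronger one.
function canCollapse(records: InternalGradeRecord[]): boolean {
  if (records.length < 2) return true;
  const first = records[0];
  return records.every(
    (r) =>
      publicLabel(r.key) === publicLabel(first.key) &&
      r.direction === first.direction &&
      r.grade === first.grade
  );
}
```

When `canCollapse` returns false, the records stay separate internally and the public page states the qualifier.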

Watch

Protocols are rated separately. See §16 — we deliberately do NOT borrow the peptide A–F letters for protocols.

§ 02

The letter grades

Grade	Label	Meaning
A	Strong	Multiple independent high-quality controlled human trials converge in the same direction. Mechanism is well characterized. Longer-term safety is described in humans. Effect clearly exceeds placebo or comparator. A single positive Phase 2 trial is not enough.
B	Promising	At least one well-powered controlled human trial shows clinically meaningful benefit in the relevant internal grading unit. Mechanism is plausible to moderately established. Replication and/or longer-term safety remain limited.
C	Mixed / early signal	Some direct human signal exists, but the evidence is incomplete, conflicting, underpowered, uncontrolled, or still leaning heavily on animal/mechanistic support.
D	Weak	Evidence is identifiable but weak: animal-only, mechanistic-only, anecdotal, or sparse uncontrolled human signal without convincing controlled confirmation.
F	Disproven / unsafe	Decisive human evidence shows no clinically meaningful effect, harm, or unacceptable risk for the studied use. Safety- or efficacy-driven medical regulatory rejection or withdrawal can also support F.
Pending	Below threshold	The internal grading unit does not yet have enough peer-reviewed studies to produce a defensible letter. Editorial backfill is in progress.
Insufficient	—	Cannot yet be meaningfully graded — claim is underspecified, literature is effectively absent, or decisive sources cannot be verified. Use instead of D when we cannot honestly call the evidence weak because we cannot yet inspect the evidence set.

These thresholds are bands, not formulas. The letter is the editor’s defensible synthesis of the six sub-scores, the hard caps below, and the sign-off rules.

§ 03

Minimum evidence threshold · Pending status

No peptide × outcome receives a letter grade unless the underlying evidence set clears a minimum count of qualifying studies. Below the threshold, the outcome carries Pending until editorial backfill completes.

Note

A letter grade requires at least 3 peer-reviewed studies for the internal grading unit. Fewer than 3 → Pending. At 3+ → eligible for A–F per the rubric.

Rationale: with fewer than 3 studies you cannot assess consistency, and consistency (replication, directional agreement, or deliberate contradiction) is the core evidence-grading judgment. Two studies can agree by chance; three is the smallest count where a pattern becomes visible. This is in keeping with GRADE and Cochrane practice, which treat consistency across studies as part of certainty assessment.

What counts as a qualifying study

Peer-reviewed and published in an indexed journal, or a registered clinical trial with posted results. Preprints, conference abstracts without full publication, vendor white papers, and marketing collateral do not count.
About the relevant InternalGradeKey — the right peptide, outcome, population, route, and time horizon. A rodent oral study does not count toward a human injectable count.
Independent of other counted studies when assessing replication. Follow-on papers from the same lab on the same cohort count as one unless the design is materially different.
Not retracted or flagged with unresolved integrity concerns at the time of grading.

Systematic reviews and meta-analyses count as one study each toward the threshold but strengthen the grade through sub-score 02.
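
The qualifying-study rules can be sketched as a filter plus a count. This is an illustrative sketch only: `StudyRecord` and its boolean fields are hypothetical, and each flag stands in for an editorial judgment that the code does not make. A systematic review or meta-analysis is simply one `StudyRecord` here.

```typescript
// Hypothetical study record; every field name is an assumption for this sketch.
interface StudyRecord {
  id: string;
  peerReviewedOrRegisteredWithResults: boolean; // false for preprints, abstracts, white papers
  matchesGradeKey: boolean; // right peptide, outcome, population, route, time horizon
  cohortId: string; // papers from the same lab on the same cohort share one id
  materiallyDifferentDesign: boolean; // follow-on paper with a materially different design
  retractedOrFlagged: boolean; // retracted, or unresolved integrity concern
}

const MIN_QUALIFYING_STUDIES = 3;

function countQualifyingStudies(studies: StudyRecord[]): number {
  const seenCohorts: string[] = [];
  let count = 0;
  for (const s of studies) {
    if (!s.peerReviewedOrRegisteredWithResults) continue;
    if (!s.matchesGradeKey) continue;
    if (s.retractedOrFlagged) continue;
    // A follow-on paper on an already-counted cohort counts once,
    // unless its design is materially different.
    const cohortAlreadyCounted = seenCohorts.indexOf(s.cohortId) !== -1;
    if (cohortAlreadyCounted && !s.materiallyDifferentDesign) continue;
    seenCohorts.push(s.cohortId);
    count += 1;
  }
  return count;
}

function meetsThreshold(studies: StudyRecord[]): boolean {
  return countQualifyingStudies(studies) >= MIN_QUALIFYING_STUDIES;
}
```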

Additional quality gates on top of the count

A or B still require at least one well-powered controlled human trial. 50 animal studies and zero human trials cannot clear B.
F still requires decisive evidence of null effect, harm, or medical regulatory rejection, and must be supported by at least 2 studies showing null/harmful effect — a single trial cannot falsify.
Study-type weighting (RCT > cohort > case series > animal > in vitro) is handled by the sub-scores, not the threshold.

Pending vs Insufficient

Status	Meaning	Expected outcome
Pending	Fewer than 3 qualifying studies exist and editorial backfill is in progress. Literature may or may not support a graded claim — we have not completed the review.	Becomes a letter (or Insufficient) once backfill completes.
Insufficient	Evidence set has been examined and the claim cannot be meaningfully graded — unit underspecified, literature structurally absent, or decisive sources unverifiable.	Remains Insufficient until the underlying problem is resolved.

Operational rule: when the 3-study threshold is not met, default to Pending if the gap is expected to be fillable through ordinary literature search, and Insufficient if the claim itself is the problem (vague indication, unverifiable sourcing, absent primary literature by design).
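
The operational rule above reduces to a small decision function. The sketch below assumes a hypothetical `ThresholdGap` record whose three boolean flags are set by the reviewing editor. It only covers the below-threshold default and does not model the case where a unit with 3+ studies is still unworkable.

```typescript
// Hypothetical inputs; the flags encode the editor's judgment about why the gap exists.
interface ThresholdGap {
  qualifyingStudyCount: number;
  claimUnderspecified: boolean; // vague indication
  sourcesUnverifiable: boolean; // unverifiable sourcing
  literatureAbsentByDesign: boolean; // absent primary literature by design
}

type BelowThresholdDefault = "Pending" | "Insufficient" | "eligible_for_letter";

function defaultStatus(gap: ThresholdGap): BelowThresholdDefault {
  // At 3+ qualifying studies the rubric applies instead of this default.
  if (gap.qualifyingStudyCount >= 3) return "eligible_for_letter";
  const claimIsTheProblem =
    gap.claimUnderspecified || gap.sourcesUnverifiable || gap.literatureAbsentByDesign;
  // A fillable gap is Pending; a structural problem is Insufficient.
  return claimIsTheProblem ? "Insufficient" : "Pending";
}
```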

Promotion out of Pending

Threshold met + rubric supports a letter → assign the letter, update lastUpdated, log a grade-history entry with fromGrade: "Pending".
Threshold met but rubric caps low (e.g. qualifying studies are all animal-only for a human-outcome claim and sub-score 03 is capped at 1) → assign the letter grade the rubric supports; use Insufficient only if the grading unit itself is still unworkable.
Editorial review confirms literature is structurally absent → convert Pending → Insufficient with a short rationale.

Promotions out of Pending follow the same sign-off rules as any other grade assignment (see Editorial sign-off in §11 and the ownership table in §12). The initial Pending → letter transition is treated as a one-letter change for sign-off purposes.

Why Pending and not a lower letter

The tempting shortcut is to call a 0–2-study outcome a D (“weak evidence”). The problem: D is a grading judgment, and grading requires the evidence set to be large enough to judge. Calling something D on 1 study claims more confidence in the negative than the data supports — it says “we looked and the evidence is weak,” when the honest statement is “we have not completed the review.” Pending forces that honesty.

§ 04

Bridge from claim audits to grade reassessment

Claim audits and peptide grades are related but not identical. A claim audit evaluates a public assertion using the research protocol. The grading layer only acts on structured outputs from that process, not on free-form prose.

Every claim audit that touches a peptide grade must emit one or more EvidenceConclusion objects:

Claim-audit handoff

What the grading team needs from an audit

Think of this as the handoff sheet between the claim-review process and the grading layer: exact target, exact verdict, exact implication.

Field	Editorial Meaning	Why It Matters
Target	Peptide, outcome, and internal grading unit	The handoff has to name the exact grade object being touched.
Sub-claim	The precise assertion that was tested	Not a vibe, not a paragraph. One checkable claim.
Status	Validated, contested, unvalidated, overstated, falsified, withdrawn, dependent, or speculative	This is the audit verdict on the claim itself.
Confidence	High, moderate, or low	How hard the audit is willing to lean on the conclusion.
Affected sub-scores	Mechanism, human studies, effect vs placebo, long-term safety, side effects, regulatory	Only the touched parts of the rubric should move first.
Grade impact	None, possible upgrade, possible downgrade, or mandatory reassessment	The audit can force reconsideration, but it never edits the letter directly.
Decisive evidence	Named study cards and confidence caps	The grade editor should be able to trace exactly what carried the decision.
Rationale	A short editorial explanation	Plain-language reasoning that survives outside the workflow.

Operational rules

A claim audit never changes a letter grade directly. It produces EvidenceConclusion records that may trigger reassessment.
OVERSTATED / FALSIFIED / WITHDRAWN normally imply gradeImpact = "mandatory_reassessment" when they touch a central sub-claim or decisive cited source.
CONTESTED implies mandatory_reassessment when comparable-quality direct support and contradiction exist for the same internal grading unit.
VALIDATED may justify possible_upgrade, but never auto-upgrades. The rubric still has to be re-run.
DEPENDENT / SPECULATIVE usually imply none, unless the peptide page is currently presenting them as direct support. In that case the grade may stay unchanged while the page wording is corrected.
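
One possible encoding of these defaults is sketched below. The `AuditContext` shape, the lowercase status strings, and the `suggestedGradeImpact` helper are assumptions made for illustration. Where the rules above do not state a default, the function returns `null` and leaves the call to the editor. The result is a suggestion for the handoff only: it never edits a letter.

```typescript
type AuditStatus =
  | "validated" | "contested" | "unvalidated" | "overstated"
  | "falsified" | "withdrawn" | "dependent" | "speculative";

type GradeImpact =
  | "none" | "possible_upgrade" | "possible_downgrade" | "mandatory_reassessment";

// Hypothetical context; the two booleans are editorial judgments, not computed facts.
interface AuditContext {
  status: AuditStatus;
  touchesCentralClaimOrDecisiveSource: boolean;
  comparableSupportAndContradiction: boolean;
}

// Returns the default impact implied by the rules, or null when no default is stated.
function suggestedGradeImpact(ctx: AuditContext): GradeImpact | null {
  switch (ctx.status) {
    case "overstated":
    case "falsified":
    case "withdrawn":
      return ctx.touchesCentralClaimOrDecisiveSource ? "mandatory_reassessment" : null;
    case "contested":
      return ctx.comparableSupportAndContradiction ? "mandatory_reassessment" : null;
    case "validated":
      return "possible_upgrade"; // never an automatic upgrade; the rubric is re-run
    case "dependent":
    case "speculative":
      return "none"; // page wording may still need correcting
    default:
      return null; // "unvalidated": no default stated in the rules
  }
}
```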

§ 05

The six sub-scores

Every grade rolls up six weighted sub-scores, each rated 1–5 with a written justification visible on the peptide page.

#	Sub-score	What it measures	Weight
01	Mechanism understood	Do we know how the molecule produces the claimed effect at a molecular and physiological level for the internal grading unit? "Plausible" is not the same as "demonstrated."	High
02	Human studies	How many controlled studies in humans exist for the relevant population, route, and time horizon? RCT vs observational vs case report, plus sample size and power.	Highest
03	Effect vs placebo	The controlled human effect signal. Tracks placebo- or comparator-adjusted human outcomes, not animal sham comparisons standing in for human efficacy.	Highest
04	Long-term safety	What is the longest published human exposure and follow-up window for the internal grading unit? Is there post-marketing or registry surveillance?	Medium
05	Side effect profile	Observed adverse events and tolerability, capped by how certain we are. A clean signal from tiny human exposure is not a well-characterized safety profile.	Medium
06	Regulatory status	Medical-review and safety-regulatory context for the studied use. Low-weight and must not conflate medical approval, safety warnings, legal availability, and sports prohibition.	Low

The first three carry the most weight. Efficacy and human directness are what the grade is fundamentally about. Safety and regulatory context matter, but a perfectly safe molecule with no demonstrated effect still earns a low grade.

§ 06

Scoring scale per sub-score

Score	Generic interpretation
5	Best-in-class evidence. Multiple high-quality, replicated, recent, directly relevant.
4	Strong. One pivotal trial or substantial consistent data.
3	Moderate. Reasonable evidence with notable gaps or limits.
2	Weak. Sparse, indirect, low-quality, or poorly replicated evidence.
1	Effectively absent, decisively negative, or too compromised to support the claim.

§ 07

Hard caps & edge-case rules

These rules are not optional. They exist to stop the most common grading inflation errors.

Sub-score 03 · Effect vs placebo

03 ≥ 3 requires controlled human outcome data in the relevant InternalGradeKey.
03 = 2 is the ceiling when some direct human outcome signal exists, but it is uncontrolled, retrospective, open-label, or otherwise not comparator-based.
03 = 1 is the default when no direct controlled human efficacy evidence exists for the relevant grading unit.
Animal-versus-sham findings may strengthen sub-score 01 and narrative context, but do not raise sub-score 03 above what human evidence allows.
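
These ceilings can be expressed as a clamp on whatever score the editor proposes. A minimal sketch follows, assuming a hypothetical `HumanEfficacyEvidence` record; animal-versus-sham data is intentionally not an input.

```typescript
// Hypothetical summary of the human efficacy evidence for one InternalGradeKey.
interface HumanEfficacyEvidence {
  hasControlledHumanOutcomeData: boolean; // placebo- or comparator-controlled
  hasUncontrolledHumanSignal: boolean; // uncontrolled, retrospective, or open-label
}

// Highest value sub-score 03 may take under the hard caps.
function effectVsPlaceboCeiling(e: HumanEfficacyEvidence): number {
  if (e.hasControlledHumanOutcomeData) return 5; // scores of 3 or more become possible
  if (e.hasUncontrolledHumanSignal) return 2;
  return 1; // default when no direct human efficacy evidence exists
}

// Clamp an editor's proposed score to the ceiling.
function capEffectVsPlacebo(proposed: number, e: HumanEfficacyEvidence): number {
  return Math.min(proposed, effectVsPlaceboCeiling(e));
}
```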

Sub-score 05 · Side effect profile

Cumulative direct human exposure under 50 people, or follow-up under 30 days → 05 cannot exceed 3.
No direct human exposure data, safety case mostly animal toxicology or mechanistic inference → 05 cannot exceed 2.
“No adverse events reported” in a tiny pilot is an early tolerability signal, not a mature safety profile.
Any credible serious adverse event signal, unresolved integrity problem, or major uncertainty about the administered material caps the score at 1–2 pending review.
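
The same clamp pattern works for sub-score 05. The sketch below assumes a hypothetical `SafetyExposure` record and treats "caps the score at 1–2 pending review" as a ceiling of 2.

```typescript
// Hypothetical exposure summary; field names are assumptions for this sketch.
interface SafetyExposure {
  cumulativeHumansExposed: number;
  longestFollowUpDays: number;
  seriousSignalOrMaterialUncertainty: boolean; // serious AE signal, integrity problem, or unclear material
}

// Highest value sub-score 05 may take under the hard caps.
function sideEffectCeiling(x: SafetyExposure): number {
  let ceiling = 5;
  if (x.cumulativeHumansExposed < 50 || x.longestFollowUpDays < 30) {
    ceiling = Math.min(ceiling, 3); // tiny or short exposure
  }
  if (x.cumulativeHumansExposed === 0) {
    ceiling = Math.min(ceiling, 2); // safety case rests on animal or mechanistic data
  }
  if (x.seriousSignalOrMaterialUncertainty) {
    ceiling = Math.min(ceiling, 2); // held at 1-2 pending review
  }
  return ceiling;
}
```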

Sub-score 06 · Regulatory status

Track these dimensions separately in page notes and internal data:

medicalApprovalStatus — approval, non-approval, or review status for the studied use
safetyWarningStatus — warning letters, withdrawals, safety alerts, refusals
availabilityStatus — compounding, legal-access, or supply-chain status
sportsProhibitedStatus — WADA or other sports-governing-body status

Sub-score 06 primarily reflects medical-review and safety-regulatory context.
WADA status is compliance information for athletes. It does not imply efficacy or clinical danger by itself.
Legal availability or compounding restrictions alone do not push a grade to F.
Only safety- or efficacy-driven medical regulatory action can materially support a downgrade toward F.

§ 08

From sub-scores to letter

There is no rigid formula. The editorial heuristic:

Letter	Requirements
A	At least two independent high-quality controlled human trials, or one pivotal trial plus an independent confirmatory controlled study of comparable directness. Sub-scores 02 and 03 both ≥ 4, 01 ≥ 4, 04 and 05 both ≥ 3. A single positive Phase 2 trial cannot produce A.
B	At least one well-powered controlled human trial showing clinically meaningful benefit in the relevant grading unit. Sub-scores 02 and 03 both ≥ 3, 01 ≥ 3.
C	Default when some direct human signal exists but meaningful gaps remain, or when strong indirect evidence still carries too much of the case. Common C profile: 01 ≥ 3, 02 = 2–3, 03 = 1–2.
D	Evidence is weak but identifiable — animal-only, mechanistic-only, anecdotal, or sparse uncontrolled human evidence. Requires ≥ 3 qualifying studies (not Pending).
F	Actively negative human evidence, unacceptable risk, or safety-/efficacy-driven medical regulatory rejection or withdrawal for the studied use. Sports prohibition or legal-access constraints alone are not enough.
Insufficient	Internal grading unit cannot be specified cleanly, literature is effectively absent, or decisive sources cannot be verified.

Note

When in doubt, grade DOWN rather than up. Credibility is built on under-promising.
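
Because there is no rigid formula, code can only check the numeric preconditions for A and B; it cannot choose the letter. The sketch below does just that. The `SubScores` and `TrialFacts` shapes are hypothetical. "One pivotal trial plus an independent confirmatory controlled study" is counted here as two independent trials. F and Insufficient are left out entirely because they rest on editorial judgment.

```typescript
// Hypothetical containers for the six sub-scores (each 1-5) and the trial facts.
interface SubScores {
  mechanism: number; // 01
  humanStudies: number; // 02
  effectVsPlacebo: number; // 03
  longTermSafety: number; // 04
  sideEffects: number; // 05
  regulatory: number; // 06
}

interface TrialFacts {
  independentHighQualityControlledTrials: number;
  wellPoweredControlledTrialWithBenefit: boolean;
}

function meetsAPreconditions(s: SubScores, t: TrialFacts): boolean {
  return (
    t.independentHighQualityControlledTrials >= 2 &&
    s.humanStudies >= 4 && s.effectVsPlacebo >= 4 && s.mechanism >= 4 &&
    s.longTermSafety >= 3 && s.sideEffects >= 3
  );
}

function meetsBPreconditions(s: SubScores, t: TrialFacts): boolean {
  return (
    t.wellPoweredControlledTrialWithBenefit &&
    s.humanStudies >= 3 && s.effectVsPlacebo >= 3 && s.mechanism >= 3
  );
}

// Upper bound only: the editor may still grade lower than this ceiling.
function numericCeiling(s: SubScores, t: TrialFacts): "A" | "B" | "C_or_below" {
  if (meetsAPreconditions(s, t)) return "A";
  if (meetsBPreconditions(s, t)) return "B";
  return "C_or_below";
}
```

Returning a ceiling rather than a letter keeps the sketch consistent with the "grade DOWN when in doubt" note: the function can rule a letter out, but it never awards one.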

§ 09

Pending vs Insufficient vs D

This boundary must be applied consistently:

Status	In short	When it applies
D	We looked, and what exists is weak	At least 3 qualifying studies and the rubric supports a weak-evidence letter. Animal data, mechanistic papers, uncontrolled human case series, or anecdotal evidence that is inspectable and clearly limited.
Pending	We have not finished looking	Evidence set has fewer than 3 qualifying studies and editorial backfill is reasonably expected to close the gap.
Insufficient	Evidence set is structurally unworkable	Target cannot yet be meaningfully assessed: claim underspecified, route/population/time horizon unclear, literature effectively absent by design, or decisive sources unverifiable.

§ 10

Reassessment triggers

An internal peptide grade is re-evaluated when ANY of the following occurs:

Trigger	Detection	SLA
New peer-reviewed RCT for that internal grade key	PubMed alerts on peptide name + outcome + route/population terms	Within 30 days
New systematic review or meta-analysis	PubMed alerts	Within 30 days
Claim audit emits an EvidenceConclusion with gradeImpact ≠ "none"	Claim-review workflow handoff	Within 7 days
Retraction of any cited paper	Retraction Watch + manual quarterly check	Same day
Major medical regulatory action (approval, refusal, withdrawal, warning, safety-grounded compounding action)	Regulatory bulletins + monthly check	Same day
Sports-prohibited status change	WADA or governing-body bulletins	Within 30 days (immediate for athlete-safety copy)
PubPeer integrity flag on a cited paper	PubPeer monitoring	Within 7 days
Need to split or narrow the internal grade key (route, population, or time horizon change)	Editorial review or claim audit handoff	Within 14 days
Quarterly housekeeping audit (no specific trigger)	Calendar	At least every 90 days
Reader-submitted evidence challenge via corrections@peptigrade.io	Inbox	Acknowledged in 7 days · resolved in 30
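
The SLA column maps naturally onto a lookup table. The sketch below is illustrative: the trigger identifiers and the `reassessmentDueDate` helper are assumptions, "same day" is encoded as 0 days, and the quarterly housekeeping audit is omitted because it runs on a calendar rather than from a detection date.

```typescript
// Hypothetical trigger identifiers; one per event-driven row of the table above.
type Trigger =
  | "new_rct" | "new_review" | "claim_audit_handoff" | "retraction"
  | "regulatory_action" | "sports_status_change" | "pubpeer_flag"
  | "grade_key_split" | "reader_challenge";

// Days allowed between detection and completed reassessment.
const SLA_DAYS: { [T in Trigger]: number } = {
  new_rct: 30,
  new_review: 30,
  claim_audit_handoff: 7,
  retraction: 0, // same day
  regulatory_action: 0, // same day
  sports_status_change: 30, // athlete-safety copy is updated immediately
  pubpeer_flag: 7,
  grade_key_split: 14,
  reader_challenge: 30, // resolution deadline; acknowledgement is due in 7 days
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function reassessmentDueDate(trigger: Trigger, detectedOn: Date): Date {
  return new Date(detectedOn.getTime() + SLA_DAYS[trigger] * MS_PER_DAY);
}
```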

§ 11

The reassessment workflow

When a trigger fires, walk through this:

Step 01
Confirm the trigger is in scope
Read the new evidence, claim-audit handoff, retraction notice, or regulatory action. Confirm it concerns the peptide, outcome, population, route, and time horizon under consideration. A new oral rodent study does not automatically change an injectable human grade.
Step 02
Import the EvidenceConclusion if the trigger came from a claim audit
Do not translate prose by hand when a structured claim-audit output exists. Identify the sub-claim, affected internal grade key, touched sub-scores, and whether the outcome is wording-only, possible band move, or mandatory reassessment. If fields are missing, send it back for completion before changing the grade.
Step 03
Re-score only the affected sub-scores, applying the hard caps
Identify which sub-score(s) the new evidence actually touches. Re-score only those first. Apply the caps in §07, and do not let narrative enthusiasm override them. Write updated justifications in plain language. Examples: a placebo-controlled human RCT typically touches 02 and 03; longer follow-up touches 04 and maybe 05; a safety warning affects 05 and 06.
Step 04
Re-roll the letter grade
Apply the heuristic in §08. The letter may change up, down, or stay the same. A single positive Phase 2 trial can plausibly move C → B if well-powered and clinically meaningful; it does not move C → A on its own. If the internal grade key needs to be split by route/population/time horizon first, do that, then re-roll each resulting grade separately.
Step 05
Determine whether the publication packet changes
Four common outcomes:
Wording correction only → update copy, citations, lastReviewed; no grade-history entry.
Grade unchanged with updated sub-scores → update notes and lastReviewed; no editorial note required.
Grade up or down by one letter → add a Grade history entry; editor + second-editor review required.
Grade change of two+ letters, or any move to/from A or F → add a Grade history entry, run full editorial-board review, and publish a CHANGE NOTE.
Step 06
Update related artifacts
Grade changes ripple. Update the peptide's topGrade if this was the top outcome and the grade moved. Update any /protocols/[slug] page that includes the peptide as a component. Update any /claims/[slug] page that depends on the affected sub-claim. Update the home-page carousel and featured-peptides section if featured. Regenerate sitemap.xml on the next build.
Step 07
Editorial sign-off
Wording correction only → author editor.
No grade change with sub-score updates → author editor.
One-letter change → author editor + one second editor.
Two+ letter change or any move to/from A or F → author editor + second editor + clinical advisor.
Retraction-, integrity-, or safety-driven change → same chain plus same-day publication once verified.
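
The sign-off chain can be derived from the size and endpoints of the move. The sketch below is a hedged illustration: the role names and the `requiredSignOff` helper are assumptions, Insufficient is not modelled, and a move out of Pending is counted as a one-letter change as stated in §03. Same-day publication for retraction-, integrity-, or safety-driven changes is a timing rule and is not represented.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F" | "Pending";
type Role = "author_editor" | "second_editor" | "clinical_advisor";

// Letters in rank order, used to measure how far a grade moved.
const LETTER_ORDER: string[] = ["A", "B", "C", "D", "F"];

function letterDistance(from: Grade, to: Grade): number {
  if (from === to) return 0;
  // Pending → letter is treated as a one-letter change for sign-off purposes.
  if (from === "Pending" || to === "Pending") return 1;
  return Math.abs(LETTER_ORDER.indexOf(from) - LETTER_ORDER.indexOf(to));
}

function requiredSignOff(from: Grade, to: Grade): Role[] {
  const distance = letterDistance(from, to);
  // Wording-only corrections and sub-score refreshes leave the letter unchanged.
  if (distance === 0) return ["author_editor"];
  const touchesAorF = from === "A" || from === "F" || to === "A" || to === "F";
  if (distance >= 2 || touchesAorF) {
    return ["author_editor", "second_editor", "clinical_advisor"];
  }
  return ["author_editor", "second_editor"];
}
```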

§ 12

Ownership & authority

The grading layer needs clear authority boundaries:

Decision	Owner
Routine sub-score refresh with no letter change	Primary evidence editor for that peptide
Any letter-grade movement	Primary evidence editor proposes · second editor approves
Any move to/from A or F, or any safety-/integrity-driven downgrade	Primary evidence editor + second editor + clinical advisor
Override of a claim-audit arbitrator, or override of gradeImpact = "mandatory_reassessment"	Editorial lead + clinical advisor · rationale logged in Grade history
Publication of a medical- or safety-related grade change	Editorial lead owns release and timing
Quarterly stale-review sweep	Managing editor or designated evidence-ops owner

Note

No single person should author the triggering claim audit, arbitrate the claim audit, and approve the resulting grade change alone.

§ 13

Versioning & audit trail

Every grade change must leave a trail. The pattern:

Grade history entry

What a clean change log should capture

Not software for software's sake. Just the minimum record needed so a future reader can understand what changed, why it changed, and who approved it.

Field	Editorial Meaning	Why It Matters
When	The date the grade changed	An ISO-stamped moment in the public record.
What moved	The exact internal grading unit	Not just the peptide page broadly, but the specific row that changed.
From → to	The prior letter and the new letter	Readers should be able to see direction, not just the latest state.
Why now	The trigger: paper, regulatory event, or claim-audit run	Every shift needs a traceable cause.
What changed underneath	The affected sub-scores and linked audit outputs	The movement should be reconstructible, not just asserted.
Editorial rationale	A one- or two-sentence explanation of the move	Short enough to scan, precise enough to defend.
Approval trail	Who signed off	Especially important for safety-led downgrades or moves into A or F.

We do not yet store this in the codebase — currently each peptide outcome has only lastUpdated. Next step: extend the OutcomeGrade type with an optional history: GradeHistoryEntry[] field, render the history on the peptide page as a small Grade history panel under the grade matrix, and surface it in the Drug JSON-LD.
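
One way the proposed extension could look is sketched below. Only `OutcomeGrade`, `GradeHistoryEntry`, `history`, `lastUpdated`, and `fromGrade` come from the text above; every other field name, and the `appendHistory` helper, is a hypothetical choice that mirrors the change-log table. The real `OutcomeGrade` type presumably carries more fields than shown.

```typescript
// Hypothetical entry shape mirroring the change-log table above.
interface GradeHistoryEntry {
  changedOn: string; // ISO 8601 date
  gradeKey: string; // the exact internal grading unit that moved
  fromGrade: string;
  toGrade: string;
  trigger: string; // paper, regulatory event, or claim-audit run
  affectedSubScores: string[];
  rationale: string; // one or two sentences
  approvedBy: string[];
}

// Reduced stand-in for the existing type, with the optional history field added.
interface OutcomeGrade {
  grade: string;
  lastUpdated: string;
  history?: GradeHistoryEntry[];
}

// Returns a new record; the input is left untouched so earlier states stay auditable.
function appendHistory(outcome: OutcomeGrade, entry: GradeHistoryEntry): OutcomeGrade {
  const history = (outcome.history || []).concat([entry]);
  return { grade: entry.toGrade, lastUpdated: entry.changedOn, history: history };
}
```

Keeping `history` optional means existing outcome records, which only have `lastUpdated`, stay valid without a migration.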

§ 14

Errata vs reassessment vs claim audit

Three distinct operations, three different workflows:

Operation	When	Workflow
Errata	A factual error is found on a published page: wrong PMID, dose number, author name, label text.	Fix the error, append a dated correction note, mention in the next weekly dispatch. No grade change. Owner: any editor.
Reassessment	New evidence or a structured claim-audit handoff triggers re-evaluation of a published internal peptide grade.	The reassessment workflow in §11 of this document, Steps 1–7. Owner: editor responsible for the peptide.
Claim audit	A popular framing or specific assertion needs adjudication.	New /claims/[slug] page following the research protocol. If grade-relevant, it emits EvidenceConclusion objects. Owner: editor + clinical advisor for the relevant peptide.

§ 15

What never raises a grade

These never count as supporting evidence regardless of how persuasive they sound:

Influencer testimonials
Vendor marketing copy
Forum or social-media anecdotes
Vendor-funded white papers without peer review
“It worked for me” reports
Mechanistic plausibility absent direct evidence
Review articles cited as if they were decisive primary efficacy evidence

These can be cited as context but they cannot move sub-scores 02, 03, 04, or 05 in a positive direction.

§ 16

Protocol-level evidence labels

A multi-compound protocol is not a peptide × outcome pair. Borrowing the A–F peptide letters for a protocol would imply direct regimen-level human-outcome evidence that almost never exists.

Protocols use a distinct label set (see /protocols):

Label	Meaning
Exploratory synthesis	Combination of research-backed hypotheses; the regimen as assembled has not been tested as a unit in a controlled human trial.
Mechanistically plausible	Components act on characterized pathways; protocol-level human outcome evidence is absent.
Emerging clinical evidence	Some protocol-level human outcome data exists, but it is under-powered or preliminary.
Established protocol	Validated protocol-level human RCT evidence.

Watch

Compound-level peptide grades and the protocol evidence label are scored separately. A protocol may contain one C-grade peptide, two D-grade peptides, and still carry an “Exploratory synthesis” label because the full stack has not been tested together. Do not aggregate letter grades across compounds into a protocol “top grade.”

§ 17

Worked example: BPC-157 × tendon healing

Public label: BPC-157 × tendon healing

Internal grade key, simplified: BPC-157 × tendon healing × adults with musculoskeletal injury × injectable/systemic exposure × acute/subacute healing window

Current state: Grade C, sub-scores 01 through 06 of 4 / 2 / 1 / 2 / 2 / 1, top outcome.

Sub-score	Value	Justification
01 Mechanism	4	FAK-paxillin, VEGFR2-Akt-eNOS, and GH-receptor-related pathways are characterized in animal and in-vitro systems, but human PK and a confirmed receptor story remain absent.
02 Human studies	2	Sparse direct human data exists, but no well-powered controlled human trial for this grading unit.
03 Effect vs placebo	1	No placebo- or comparator-controlled human efficacy trial exists for the relevant grading unit. Animal-versus-sham consistency helps context, not this score.
04 Long-term safety	2	Human exposure windows are short, with no robust longer-term follow-up.
05 Side effect profile	2	Small human exposure and short follow-up mean the observed tolerability signal is still low-certainty, even if animal toxicology looks favorable.
06 Regulatory status	1	Medical approval is absent. Safety/access/sports notes are tracked separately, and WADA status does not by itself imply inefficacy or danger.

Why C and not B

The B rubric requires at least one well-powered controlled human trial plus sub-scores 02 and 03 both ≥ 3. BPC-157 × tendon healing does not meet that bar. Strong and replicated animal evidence plus sparse uncontrolled human signal is a canonical C, not a B.

What would move it to B

A positive, well-powered controlled Phase 2 or Phase 3 human trial showing clinically meaningful benefit in the relevant grading unit could plausibly move 02 to 3–4 and 03 to 3, producing a likely C → B reassessment if safety remains acceptable.

What would move it to A

At least one additional independent high-quality controlled human trial pointing the same way, plus stronger longer-term human safety data. A single positive Phase 2 trial would not be enough.

What would move it down

A decisive high-quality null RCT, a credible serious safety signal, or a safety-/efficacy-driven medical regulatory action could force reassessment toward D or F depending on how conclusive the new evidence is.
