How We Research and Grade Supplements

Every claim on this site gets an evidence grade from A to D and an Impact Score from Strong to None. Here is how they work and why they matter.

The short version

We read the studies so you don't have to. Every claim on this site gets two scores: an evidence grade from A (strong research support) to D (not enough data), and an Impact Score from Strong to None (how much the supplement actually moves the needle). We show our work, link every source, and don't sell supplements. That's the deal.

Why we built our own grading system

Most supplement content falls into two camps: academic papers nobody reads, or marketing dressed up as health advice. We wanted something in between — rigorous enough to be honest, clear enough to be useful.

We built SB-EGS by taking the best elements from established frameworks: the domain-based scoring from GRADE, the nutrition-aware approach from NutriGrade, the plain-language communication from NESR, and the study-design hierarchy from Oxford CEBM.

The result is a five-dimension, 20-point system that produces letter grades any reader can interpret while keeping the rigor of established evidence evaluation. The current rubric version is sb-egs-v1.1. Starting with v1.1, we also surface an Impact Score alongside every grade.

How we grade evidence

Every supplement-outcome claim gets scored across five dimensions. The total score maps to a letter grade.

The five dimensions

Study Design Quality (0-5 points)

What kind of research exists? Meta-analyses of RCTs score highest. Single pilot studies score lowest.

Sample Size (0-4 points)

How many people were studied? More participants means more confidence.

Consistency (0-4 points)

Do the studies agree? If most studies find the same thing, that's consistent.

Effect Size (0-4 points)

Does it actually make a meaningful difference? We care about real-world impact, not just p-values.

Directness (0-3 points)

Was the research done on people like you, taking the supplement the way you'd take it?

From score to grade

Score	Grade	What it means
17-20	A	Strong evidence
15-16	A-	Strong-to-solid
13-14	B+	Solid evidence
11-12	B	Moderate evidence
9-10	B-	Moderate-to-emerging
6-8	C	Preliminary
0-5	D	Insufficient
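
For readers who think in code, the scoring above can be sketched roughly as follows. This is a minimal illustration, not our production code: the point ranges and grade bands come from this page, while the names (MAX_POINTS, GRADE_BANDS, total_score, letter_grade) are made up for the example. It returns the grade before any hard-rule overrides are applied.

```python
# Maximum points per dimension, as listed under "The five dimensions".
MAX_POINTS = {
    "study_design": 5,
    "sample_size": 4,
    "consistency": 4,
    "effect_size": 4,
    "directness": 3,
}

# (minimum total, grade) pairs from the score-to-grade table, best band first.
GRADE_BANDS = [(17, "A"), (15, "A-"), (13, "B+"), (11, "B"), (9, "B-"), (6, "C"), (0, "D")]


def total_score(scores: dict[str, int]) -> int:
    """Sum the five dimension scores after checking each one is in range."""
    if set(scores) != set(MAX_POINTS):
        raise ValueError("expected exactly the five dimensions")
    for name, value in scores.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_POINTS[name]}")
    return sum(scores.values())


def letter_grade(total: int) -> str:
    """Map a 0-20 total to a letter grade (hard rules not applied here)."""
    if not 0 <= total <= 20:
        raise ValueError("total must be between 0 and 20")
    for minimum, grade in GRADE_BANDS:
        if total >= minimum:
            return grade
    raise AssertionError("unreachable: the last band starts at 0")


# Hypothetical claim: 4 + 3 + 3 + 2 + 2 = 14 points, which falls in the 13-14 band.
example = {"study_design": 4, "sample_size": 3, "consistency": 3, "effect_size": 2, "directness": 2}
print(total_score(example), letter_grade(total_score(example)))
```

The example claim is invented to show the arithmetic; it does not describe any supplement graded on this site.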

Impact Score: does it actually work?

Evidence Grade tells you how sure we are. Impact Score tells you how much it matters. A supplement can have well-replicated evidence for a small effect (our hard rules cap those claims at B), or shaky evidence (C grade) for a large one. You need both scores to make a good decision.

Impact is derived from the Effect Size dimension we already score (0-4), mapped to the Cohen's d ranges the research agent records for every study. We surface it as a plain-language label so you don't need to interpret the statistics.

The Impact Scale

Impact	Cohen's d	What it means
Strong	≥ 0.8	Noticeable improvement most people will feel
Meaningful	0.5 to < 0.8	Clear benefit, worth trying for most people
Modest	0.35 to < 0.5	Real but gentle: works for some, subtle for others
Slight	0.2 to < 0.35	Barely detectable, may not be noticeable day-to-day
None	< 0.2	Negligible; too small to count as a real-world benefit
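
As a rough sketch, the scale can be expressed as a lookup on Cohen's d. The cut-offs are the ones in the table. Two things here are assumptions rather than documented behavior: a value sitting exactly on a boundary goes to the higher band (matching the "≥ 0.8" convention for Strong), and the 0-4 Effect Size score lines up one-to-one with the five labels. The names are hypothetical.

```python
# (lower bound on Cohen's d, label) from the Impact Scale, strongest band first.
IMPACT_BANDS = [(0.8, "Strong"), (0.5, "Meaningful"), (0.35, "Modest"), (0.2, "Slight")]

# Assumption: each Effect Size score (0-4) corresponds to exactly one Impact label.
IMPACT_BY_EFFECT_SCORE = {4: "Strong", 3: "Meaningful", 2: "Modest", 1: "Slight", 0: "None"}


def impact_label(cohens_d: float) -> str:
    """Return the Impact label for a Cohen's d in the beneficial direction.

    Boundary values fall into the higher band (assumption). Anything below
    0.2, including zero or negative values, is labelled "None".
    """
    for lower_bound, label in IMPACT_BANDS:
        if cohens_d >= lower_bound:
            return label
    return "None"


for d in (0.9, 0.6, 0.4, 0.25, 0.1):
    print(d, impact_label(d))
```

Under this mapping, an Effect Size score of 2 or lower corresponds to a Cohen's d below 0.5, which is the same threshold the hard rules use for capping a grade at B.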

How we show both scores

Every supplement-goal pair on the site displays Impact first, Evidence second — because "does it work?" is the question most people care about first. Example byline:

IMPACT · MEANINGFUL — EVIDENCE B+

In this example, the supplement delivers a clear real-world benefit, backed by solid research.

When space is tight (search results, small cards), we use a compact dot rating for Impact alongside the letter grade for Evidence. The full label always appears on the article page itself.

Reading the two scores together

High Impact + Strong Evidence

The best case. The supplement works meaningfully and we are confident it works. Worth prioritizing in your stack.

High Impact + Weak Evidence

Promising but unproven. The studies that exist show a large effect, but there aren't enough of them yet. Watch this space.

Low Impact + Strong Evidence

Well-studied but modest benefit. The supplement does something, just not much. Don't expect dramatic results.

Low Impact + Weak Evidence

Skip it. Either the research is too thin to trust, or the effect is too small to matter, or both.

Hard rules that override scores

•Only one study exists? Cap at B, no matter how good it was.
•Only animal or lab studies? Cap at C.
•No human studies at all? That's a D.
•Every study funded by the manufacturer with no independent replication? Downgrade by one letter.
•Effect Size score of 2 out of 4 or lower (a small effect or less: Cohen's d below 0.5, or under 15% improvement)? Cap at B. An A tells readers that research strongly supports the supplement, and strong research behind a tiny effect doesn't deliver what readers expect from an A. We added this rule on 2026-04-19 (sb-egs-v1.0.1) after our Vitamin D/cardiovascular claim shipped at A- while its own summary line (buddy_line) said 'no meaningful effect'.
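
The overrides above can be sketched as a post-processing step on the letter grade. Treat this as an illustration under stated assumptions, not our actual pipeline: the Evidence fields and function names are hypothetical, "downgrade by one letter" is read as one step down the seven-grade ladder, and the caps are applied before the downgrade. The page does not specify either of those last two points.

```python
from dataclasses import dataclass

# Grades from best to worst, taken from the score-to-grade table.
LADDER = ["A", "A-", "B+", "B", "B-", "C", "D"]


@dataclass
class Evidence:
    """Facts about the evidence base. All field names are hypothetical."""
    study_count: int
    has_human_studies: bool
    only_animal_or_lab: bool
    all_manufacturer_funded: bool  # and no independent replication
    effect_size_score: int         # the 0-4 Effect Size dimension


def cap(grade: str, ceiling: str) -> str:
    """Return the worse of grade and ceiling, so a cap never raises a grade."""
    return LADDER[max(LADDER.index(grade), LADDER.index(ceiling))]


def apply_hard_rules(grade: str, ev: Evidence) -> str:
    """Apply the caps first, then the funding downgrade (ordering is an assumption)."""
    if ev.study_count == 1:
        grade = cap(grade, "B")
    if ev.only_animal_or_lab:
        grade = cap(grade, "C")
    if not ev.has_human_studies:
        grade = cap(grade, "D")
    if ev.effect_size_score <= 2:
        grade = cap(grade, "B")
    if ev.all_manufacturer_funded:
        # Assumption: "one letter" means one step down the ladder (A -> A-).
        grade = LADDER[min(LADDER.index(grade) + 1, len(LADDER) - 1)]
    return grade


# Hypothetical claim: well studied, but an Effect Size score of only 1.
small_effect = Evidence(study_count=6, has_human_studies=True, only_animal_or_lab=False,
                        all_manufacturer_funded=False, effect_size_score=1)
print(apply_hard_rules("A-", small_effect))
```

In this sketch a claim that would otherwise score A- with a small effect comes out at B, which is the situation the effect-size cap was written to prevent.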

Safety is separate

A supplement can have an A grade for effectiveness and still have safety concerns. We flag safety independently as Generally Safe, Caution, or Warning.

Where we get our data

Primary sources

•PubMed/MEDLINE — the world's largest biomedical research database
•NIH Office of Dietary Supplements — government fact sheets, RDA values, safety data
•Supp'Buddy Evidence Database — our curated database, continuously updated

Cross-reference sources

•Examine.com, SUPP.AI, and CrossRef/DOI verification

What we don't do

•We don't sell supplements. No affiliate links, no sponsored content.
•We don't give medical advice. Evidence grades tell you how strong the research is, not what you personally should take.
•We don't hide uncertainty. If the evidence is mixed, we say it's mixed.
•We don't cherry-pick. If studies disagree, you'll hear about all of them.

Questions?

If you want to understand a specific grade or think we got something wrong, reach out at contact@supp-buddy.com. We take corrections seriously.

See our grading in action

Every article on this site uses the grading system above. Start here:

•Deep Dives
•Protocols

The SB-EGS methodology was developed by the Supp'Buddy Research & Editorial Team, drawing on GRADE, NutriGrade, NESR, and Oxford CEBM Levels of Evidence — adapted for consumer supplement research.