How We Work
Traditional advisory firms publish conclusions. We publish everything — the methodology, the evidence quality, the confidence scores, the limitations, and the cost of production. This page exists because our architecture makes transparency cheaper than opacity.
Outcome-Anchored Vendor Evaluation
We score vendors on one thing: verified outcomes in production. Not feature lists. Not analyst opinions. Not market share. The weighting reflects what actually determines whether an enterprise AI deployment succeeds.
- Verified Outcomes — Named-company production deployments with measurable business results. Anonymized case studies are weighted at 0.5×; vendor self-reported results at 0.25×.
- Time to Value — Time from purchase decision to measurable business value in production. Under six months scores highest; over 18 months scores lowest.
- Scale Durability (20%) — Can the deployment maintain performance at enterprise scale?
- Economic Risk (20%) — Total cost of ownership trajectory.
- Continuity Risk (10%) — Vendor viability and lock-in exposure.
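To make the arithmetic concrete, here is a minimal sketch in Python of how a composite score could be computed under this rubric. It is illustrative rather than our production code: the names are ours for this page, the evidence multipliers mirror the 0.5× and 0.25× figures above, and criterion weights are caller-supplied because not every weight is quoted here.

```python
# Illustrative sketch only; not our production scoring pipeline.
# Evidence multipliers mirror the figures quoted above. Criterion weights
# are caller-supplied because not all of them appear on this page.

EVIDENCE_MULTIPLIER = {
    "named_company": 1.00,   # named-company deployment, measurable results
    "anonymized": 0.50,      # anonymized case study
    "self_reported": 0.25,   # vendor self-reported, uncorroborated
}

def discounted(raw_score: float, evidence_kind: str) -> float:
    """Scale a raw 0-100 criterion score by the quality of its evidence."""
    return raw_score * EVIDENCE_MULTIPLIER[evidence_kind]

def vendor_score(criterion_scores: dict[str, float],
                 weights: dict[str, float]) -> float:
    """Weighted sum of (already discounted) criterion scores.

    `weights` maps criterion name -> fractional weight and must sum to 1.0,
    e.g. {"scale_durability": 0.20, "economic_risk": 0.20, ...}.
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(criterion_scores[name] * w for name, w in weights.items())
```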
Evidence Quality Tiers
Not all evidence is created equal. A vendor press release and an SEC filing both count as “sources” — but they carry very different weight. We apply discount factors to every piece of evidence.
| Tier | Example Sources | Share of Evidence Base |
|---|---|---|
| Primary | SEC filings, patent records, peer-reviewed papers, audited financials, official earnings transcripts | ~45% |
| Secondary | Credible journalism (Reuters, WSJ, FT), independent analyst reports, conference proceedings, academic research | ~35% |
| Tertiary | Press releases, vendor case studies, marketing materials, self-reported metrics without independent corroboration | ~20% |
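A sketch of how tiering could be applied mechanically. Tier membership follows the table above; the per-tier discount factors are not quoted on this page, so the sketch takes them as an argument instead of inventing them.

```python
from collections import Counter

# Source types grouped by tier, following the table above.
PRIMARY = {"sec_filing", "patent", "peer_reviewed_paper",
           "audited_financials", "earnings_transcript"}
SECONDARY = {"journalism", "independent_analyst_report",
             "conference_proceedings", "academic_research"}

def tier_of(source_type: str) -> str:
    """Classify a source type; anything not primary or secondary is tertiary."""
    if source_type in PRIMARY:
        return "primary"
    if source_type in SECONDARY:
        return "secondary"
    return "tertiary"

def evidence_weight(source_type: str, discount: dict[str, float]) -> float:
    """Weight one piece of evidence by its tier's discount factor.

    `discount` maps tier -> factor; the actual factors live in our full
    methodology and are deliberately not hardcoded in this sketch.
    """
    return discount[tier_of(source_type)]

def tier_shares(source_types: list[str]) -> dict[str, float]:
    """Fraction of the evidence base per tier (cf. ~45/35/20% above)."""
    counts = Counter(tier_of(s) for s in source_types)
    total = sum(counts.values())
    return {tier: n / total for tier, n in counts.items()}
```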
Confidence Calibration
Every claim in our report carries a confidence score. These are not decorative. They are commitments to calibrated honesty about what we know and what we don't.
| Score | Meaning | Required Evidence |
|---|---|---|
| 0.90–1.00 | Near-certain | Multiple independent primary sources, no credible contradictions |
| 0.70–0.89 | High confidence | At least 1 primary + corroborating secondary sources |
| 0.50–0.69 | Moderate | Secondary sources agree, no primary confirmation |
| 0.30–0.49 | Low — publishable with caveat | Limited sourcing, some ambiguity |
| 0.00–0.29 | Unpublishable | Insufficient evidence — never appears in published output |
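These bands translate directly into code. A minimal sketch (function names are ours for illustration):

```python
def confidence_band(score: float) -> str:
    """Map a confidence score to the bands in the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    if score >= 0.90:
        return "near-certain"
    if score >= 0.70:
        return "high confidence"
    if score >= 0.50:
        return "moderate"
    if score >= 0.30:
        return "low (publishable with caveat)"
    return "unpublishable"

def publishable(score: float) -> bool:
    """Claims scoring below 0.30 never appear in published output."""
    return score >= 0.30
```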
AI Systems Used
This report was produced by AI systems. We disclose which models were used, for what purpose, and in what proportion. No human wrote the prose. Humans set the methodology, provide practitioner intelligence, and serve as the quality backstop.
- Primary research synthesis, report writing, editorial review, strategic analysis
- Cross-validation, alternative perspective generation, quality gate deliberation
- Classification, entity extraction, formatting, data transformation
The Pre-Ship Gate
Before any report section ships, it must pass these 10 questions. We derived them from the patterns our Board Chairman uses to evaluate work. If any answer is unsatisfactory, the deliverable goes back for rework.
1. Has this been pressure-tested through multiple self-critique cycles?
2. What blind spots exist? What would a hostile critic point to?
3. Does every claim trace to a real deployment, a real number, a real company?
4. Would a Fortune 500 CIO forward this to their board?
5. Are we being lazy with the methodology?
6. Does the system improve itself, or does it need external intervention to catch everything?
7. Can you trace from high-level claim to source data and back?
8. Are the visuals and design first-class?
9. What does this look like from the reader's specific context?
10. Are we leaning all the way in, or hedging?
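Mechanically, the gate behaves like a checklist in which any failed question blocks shipping. A sketch, with the questions abbreviated from the list above:

```python
# A sketch of the gate as code; question strings abbreviated from the list above.
GATE = [
    "pressure-tested through multiple self-critique cycles",
    "blind spots and hostile-critic objections examined",
    "every claim traces to a real deployment, number, or company",
    "a Fortune 500 CIO would forward this to their board",
    "methodology applied without shortcuts",
    "system self-corrects rather than relying on external intervention",
    "claims trace to source data and back",
    "visuals and design are first-class",
    "reviewed from the reader's specific context",
    "leaning all the way in, not hedging",
]

def pre_ship_gate(answers: dict[str, bool]) -> list[str]:
    """Return failed questions; anything returned sends the section to rework."""
    return [q for q in GATE if not answers.get(q, False)]
```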
Known Limitations
We are transparent about what we cannot do. Publishing limitations is not a weakness — it is the mechanism by which trust is built. A finding presented with false certainty is more dangerous than no finding at all.
- We cannot read intent behind public statements or detect organizational dysfunction from tacit signals.
- Our vendor evaluations rely on publicly available evidence. Vendors with poor public documentation may be scored lower than their actual capabilities warrant.
- Evidence ages. Data points older than 12 months are flagged as stale but may still influence scores if no newer data exists (a sketch of the rule follows this list).
- Our confidence scores are calibrated estimates, not probabilities. A 0.75 confidence does not mean a 75% chance of being correct.
- We have no primary research (surveys, interviews) in v1. All evidence is secondary or tertiary. This is a known limitation we plan to address.
- Our AI systems have intrinsic limitations in generating genuinely novel conceptual frameworks. Our frameworks are synthesized from existing research, not invented from first principles.
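The evidence-aging rule above is simple enough to state as code. A sketch with illustrative names:

```python
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=365)  # the 12-month threshold noted above

def is_stale(observed: date, today: Optional[date] = None) -> bool:
    """Flag evidence older than 12 months. Stale evidence may still
    influence a score when no newer data point exists."""
    today = today or date.today()
    return today - observed > STALE_AFTER
```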
Corrections & Updates
Errors are inevitable. The measure of integrity is not perfection but velocity of correction. Any material error is corrected within 24 hours with a transparent notice.
No corrections issued yet. This section will be updated as needed.
Questions about our methodology? Disagree with a finding? Found an error?
Contact Our Research Team