DATA & METHODOLOGY

How GovBiz.ai builds intelligence.

Every recommendation traces to a specific, citable federal data signal. If the data doesn't support a recommendation, we say so — we don't fill gaps with generic capture playbook advice.

CPARS Data — Not Incorporated

CPARS performance ratings are not publicly accessible. Any platform claiming to incorporate CPARS data is misleading its users. GovBiz.ai uses publicly verifiable proxy signals instead and documents this explicitly.

Data Sources

SAM.gov — System for Award Management

Daily opportunity feed · Monthly entity bulk extract
  • Registered entity profiles: UEI, CAGE code, legal name, address
  • NAICS code assignments (primary + secondary)
  • SBA small business certifications: 8(a), SDVOSB, HUBZone, WOSB, with expiration dates
  • Active solicitations, pre-solicitations, and award notices
  • Registration status (active/expired/excluded)
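
For teams that want to reproduce the daily opportunity ingest, here is a minimal sketch, assuming the public SAM.gov Get Opportunities API (v2 search endpoint). The endpoint path, parameter names (postedFrom, postedTo, ncode), response key, and the SAM_API_KEY environment variable are assumptions to verify against current SAM.gov API documentation; this is not GovBiz.ai's production ingest code.

```python
import os
from datetime import date, timedelta

import requests

# Assumed public endpoint for the Get Opportunities API (v2 search).
SAM_OPPS_URL = "https://api.sam.gov/opportunities/v2/search"

def fetch_daily_opportunities(naics: str, lookback_days: int = 1) -> list[dict]:
    """Pull opportunity notices posted in the last `lookback_days` days."""
    today = date.today()
    params = {
        "api_key": os.environ["SAM_API_KEY"],   # assumed env var holding the API key
        "postedFrom": (today - timedelta(days=lookback_days)).strftime("%m/%d/%Y"),
        "postedTo": today.strftime("%m/%d/%Y"),
        "ncode": naics,                          # NAICS filter (assumed parameter name)
        "limit": 1000,
    }
    resp = requests.get(SAM_OPPS_URL, params=params, timeout=30)
    resp.raise_for_status()
    # Assumed response key; verify against the current API documentation.
    return resp.json().get("opportunitiesData", [])
```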

FPDS — Federal Procurement Data System

Incremental daily · Full history from FY2018
  • Contract award records: PIID, agency code, vendor UEI, award date, award amount
  • Period of performance start and end dates
  • NAICS code, type of set-aside, extent competed
  • Number of offers received at time of award
  • Modification records: modification number, reason code, value change, PoP extension
DoD awards have a 90-day public release delay. GovBiz.ai flags affected contracts when that reporting lag may reduce signal accuracy.
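
A minimal sketch of how that lag flag could be applied, assuming each record exposes a contracting-agency code and award date; the field names and the agency-code prefixes are illustrative, not the production schema.

```python
from datetime import date, timedelta

# DoD awards reach the public FPDS feed roughly 90 days after award, so any
# DoD record awarded inside that window may still be missing or incomplete.
DOD_RELEASE_DELAY = timedelta(days=90)
# Illustrative subset of DoD contracting-agency code prefixes
# (Navy 17xx, Army 21xx, Air Force 57xx, other Defense 97xx).
DOD_AGENCY_PREFIXES = ("17", "21", "57", "97")

def may_be_lagged(agency_code: str, award_date: date, as_of: date | None = None) -> bool:
    """True if DoD's 90-day release delay could still be hiding data for this award."""
    as_of = as_of or date.today()
    is_dod = agency_code.startswith(DOD_AGENCY_PREFIXES)
    return is_dod and (as_of - award_date) < DOD_RELEASE_DELAY
```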

USASpending.gov

Daily incremental · Monthly agency summaries
  • Agency obligation totals by NAICS code and fiscal year
  • Award transaction details and recipient information
  • Sub-award data where available

GSA CALC+ — Contract-Awarded Labor Category

Nightly full refresh
  • Labor rates by contractor, labor category, education level, and years of experience
  • 50,000+ rate records across GSA schedule holders
  • Used as the primary benchmark for price-to-win labor rate analysis

Bureau of Labor Statistics — OEWS Program

Annual release (May data, published following year)
  • Occupational Employment and Wage Statistics by SOC code and geographic area
  • Used to validate GSA CALC+ labor rates against current market wages
  • Identifies when GSA rates may be stale relative to current labor market
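
A minimal sketch of that validation step, assuming a BLS mean annual wage is converted to a loaded hourly equivalent (2,080 work hours per year times an assumed wrap factor) before comparison; the 1.35 wrap factor and 10% staleness tolerance are illustrative assumptions, not GovBiz.ai's calibrated parameters.

```python
# Convert a BLS OEWS mean annual wage to a loaded hourly equivalent and flag
# CALC+ rates that have fallen behind the market. The 1.35 wrap factor and
# 10% tolerance are illustrative assumptions, not calibrated values.
HOURS_PER_YEAR = 2080

def calc_rate_is_stale(calc_hourly_rate: float,
                       bls_annual_wage: float,
                       wrap_factor: float = 1.35,
                       tolerance: float = 0.10) -> bool:
    """Flag a GSA CALC+ labor rate that trails the BLS-implied market rate."""
    bls_loaded_hourly = (bls_annual_wage / HOURS_PER_YEAR) * wrap_factor
    return calc_hourly_rate < bls_loaded_hourly * (1 - tolerance)

# Example: a $142/hr CALC+ rate against a $165,000 BLS mean annual wage.
print(calc_rate_is_stale(142.0, 165_000))  # False: the rate keeps pace under these assumptions
```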

Incumbent Vulnerability Scoring

Included in Max — Vulnerability scoring is available to all Max subscribers.

Vulnerability scores are computed by a rule-based scoring engine — not a machine learning model. Each signal is independently calculated and weighted. The rules are deterministic and auditable.

Modification Frequency

25%

Contracts with more than 3 modifications per year signal instability. Each modification is a data point in the public FPDS record. High-frequency modification patterns correlate with scope creep, performance issues, or budget instability.

Period of Performance Extensions

20%

Bridge contracts and PoP extensions (modification reason P00xxx) indicate the government delayed re-competition. Two or more extensions substantially raise vulnerability probability — the government is continuing a contract it chose not to recompete on schedule.

Time Since Last Competition

15%

Contracts that have not been competitively re-awarded in 5+ years represent structural vulnerability. Cross-referenced with the original award date and subsequent modifications.

Agency Small Business Goal Gap

15%

Agencies with SB goal shortfalls face regulatory pressure to set aside contracts. Contracts held by large businesses in agencies below their SB goals are disproportionately vulnerable to set-aside reclassification at recompete.

Award Value Trend

10%

Declining award values across modifications (after adjusting for scope changes) may indicate the government is reducing scope in anticipation of transition, or is dissatisfied with cost performance.

Prior Recompete Loss History

15%

If the current incumbent lost a prior competition on the same vehicle or for the same agency/NAICS combination, that history is a forward-looking vulnerability indicator.

Score Interpretation

70–100: High vulnerability
40–69: Moderate vulnerability
0–39: Low vulnerability

If a signal cannot be computed due to missing data, its contribution is set to 0.0 and the absence is recorded in the evidence chain. Scores are never imputed or estimated.

Price-to-Win Methodology

Included in Max — Price-to-Win analysis is available to all Max subscribers.

PTW ranges are derived from two primary sources: FPDS historical award values for the same NAICS code and agency combination, and GSA CALC+ labor rate benchmarks cross-referenced with BLS wage data.

GovBiz.ai presents PTW as ranges, not point estimates. False precision in PTW analysis is a common failure mode — a $4.2M award does not mean the next award will be $4.2M. We show the distribution of historical awards and the percentile range, not a single number.

Example: For a NAICS 541512 IT services contract at a civilian agency, PTW range would be expressed as: "P25–P75 of 36-month FPDS actuals: $3.1M–$5.4M per year. GSA CALC+ labor rate range for senior developer / 10yr experience: $142–$198/hr."
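
A minimal sketch of how such a range can be computed from annualized award values using NumPy percentiles; the input figures are illustrative, not real FPDS records.

```python
import numpy as np

# Illustrative annualized award values drawn from 36 months of FPDS history
# for one agency/NAICS combination (not real records).
annualized_awards = [2.8e6, 3.1e6, 3.4e6, 3.9e6, 4.2e6, 4.8e6, 5.4e6, 6.1e6]

# Report the interquartile band, not a point estimate.
p25, p75 = np.percentile(annualized_awards, [25, 75])
print(f"P25-P75 of 36-month FPDS actuals: ${p25 / 1e6:.1f}M-${p75 / 1e6:.1f}M per year")
```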

Opportunity Scoring

Opportunity scores are contractor-specific. The same SAM.gov solicitation receives a different score for different companies. Factors and weights:

  • NAICS alignment with company profile: 40 pts
  • Set-aside alignment with company certifications: 25 pts
  • Geographic preference match: 15 pts
  • Days until response deadline: 20 pts
Score ranges: Scores ≥60 are high priority, 40–59 are watch-list, and <40 are monitored. A score of 0 typically means no NAICS match.
BD MANAGER — Opportunity scoring is included for all subscribers. Scored, matched, urgency-grouped opportunities delivered daily.
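
A minimal sketch of that scoring logic, using the weights and priority bands above; the profile and opportunity fields, and the assumption that fewer days to deadline earns more urgency points, are illustrative rather than the production rules.

```python
from dataclasses import dataclass

@dataclass
class CompanyProfile:
    naics_codes: set[str]
    certifications: set[str]      # e.g. {"8(a)", "SDVOSB"}
    preferred_states: set[str]

@dataclass
class Opportunity:
    naics_code: str
    set_aside: str | None         # e.g. "SDVOSB", or None for full and open
    place_of_performance_state: str
    days_until_deadline: int

def score_opportunity(opp: Opportunity, profile: CompanyProfile) -> int:
    """Contractor-specific score: 40 NAICS + 25 set-aside + 15 geography + 20 deadline."""
    score = 0
    if opp.naics_code in profile.naics_codes:
        score += 40
    if opp.set_aside and opp.set_aside in profile.certifications:
        score += 25
    if opp.place_of_performance_state in profile.preferred_states:
        score += 15
    # Assumed urgency curve: full points inside 30 days, half inside 60.
    if opp.days_until_deadline <= 30:
        score += 20
    elif opp.days_until_deadline <= 60:
        score += 10
    return score

def priority_band(score: int) -> str:
    """Map a score to the bands above: high (>=60), watch-list (40-59), monitor (<40)."""
    return "high" if score >= 60 else "watch-list" if score >= 40 else "monitor"
```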

How Vulnerability Signals Work

Included in Max — Vulnerability scoring is available to all Max subscribers.

Each signal is independently measured against a threshold. Signals that exceed their threshold contribute their full weight; signals below threshold contribute proportionally. Component scores are summed and capped at 100.

Modification Frequency (25%)

  • Threshold: More than 3 modifications per year
  • Source: FPDS modification records
  • Example: A contract with 7 modifications in 2 years triggers a high signal

PoP Extensions (20%)

  • Threshold: 2 or more bridge/extension modifications
  • Source: FPDS modification reason codes
  • Example: Two PoP extensions on a single contract contribute the full 20 points

Time Since Competition (15%)

  • Threshold: 5 or more years since last competitive award
  • Source: FPDS award history
  • Example: A contract last competed 7 years ago receives the maximum signal

Agency SB Goal Gap (15%)

  • Threshold: 5 percentage point shortfall vs. SBA goals
  • Source: USASpending agency data vs. SBA small business goals
  • Example: An agency 4.2pp below their small business goal scores 12.6 of 15 points

Award Value Trend (10%)

  • Threshold: 20% decline in award value across modifications
  • Source: FPDS modification award values
  • Example: An 18% declining trend contributes 9 of 10 possible points

Prior Recompete Loss (15%)

  • Threshold: 1 prior competition loss on same vehicle
  • Source: FPDS award cross-reference
  • Example: An incumbent who lost a prior recompete receives the full 15 points
Missing data: If a signal cannot be computed due to missing records, its score is set to 0.0 and the gap is surfaced explicitly in every report. Scores are never imputed or estimated.
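
A minimal sketch of the full rule set, using the weights and thresholds listed above; the measurement field names are illustrative, and the production engine derives these measurements from FPDS rather than taking them as inputs.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    name: str          # measurement key (illustrative names)
    weight: float      # points contributed at or above the threshold
    threshold: float   # measured value that earns the full weight

# Weights and thresholds from the signal definitions above.
SIGNALS = [
    Signal("modifications_per_year",  25, 3.0),
    Signal("pop_extensions",          20, 2.0),
    Signal("years_since_competition", 15, 5.0),
    Signal("sb_goal_gap_pp",          15, 5.0),
    Signal("award_value_decline",     10, 0.20),
    Signal("prior_recompete_losses",  15, 1.0),
]

def vulnerability_score(measurements: dict[str, float | None]) -> tuple[float, list[str]]:
    """Return (score capped at 100, evidence notes). Missing signals contribute 0.0."""
    total, evidence = 0.0, []
    for sig in SIGNALS:
        value = measurements.get(sig.name)
        if value is None:
            evidence.append(f"{sig.name}: missing data, contributed 0.0")
            continue
        # Full weight at/above the threshold, proportional share below it.
        points = sig.weight * min(value / sig.threshold, 1.0)
        total += points
        evidence.append(f"{sig.name}: {value} vs threshold {sig.threshold} -> {points:.1f} pts")
    return min(total, 100.0), evidence

# Example from the text: a 4.2pp SB goal gap contributes 4.2 / 5 * 15 = 12.6 points.
score, notes = vulnerability_score({"sb_goal_gap_pp": 4.2, "pop_extensions": None})
```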

Recompete Timing Prediction

Included in Max — Recompete timing prediction is available to all Max subscribers.

Recompete timing predictions are derived from FPDS data — not from any non-public government planning documents. The prediction method:

1. Extract PoP end date. The FPDS base award record contains the original period of performance end date.
2. Identify bridge extensions. P00xxx modification reason codes extend the PoP. Each detected extension shifts the predicted recompete date forward by the extension duration.
3. Apply agency lead time. Agencies typically release solicitations 12–18 months before PoP end. Lead time varies by agency (DoD typically 12 months, civilian agencies 18 months).
4. Cross-reference spending forecast. Declining USASpending obligation trends for the agency/NAICS combination reduce confidence — an agency cutting spend may delay recompete or reduce scope.
5. Output. A predicted recompete date, a confidence score (0.0–1.0), and a plain-text explanation citing the specific data points that informed the prediction.
  • 80–100%: High confidence — clean PoP, no extensions, stable spend
  • 50–79%: Moderate — extension history or spend uncertainty
  • < 50%: Low — multiple extensions or declining spend

Prophet Spending Forecasts

Included in Max — Spending forecasts are available to all Max subscribers.

Agency spending forecasts use Meta's Prophet time-series model, one model per (agency_code, NAICS_code) pair. Training data comes from FPDS obligation records.

  • Training data: FPDS obligation records, FY2018–present (52M+ records)
  • Horizon: 18 months forward projection
  • Confidence interval: 80% — presented as ranges, not point estimates
  • Holiday events: FY2020–FY2021 registered as COVID-era anomalies; FY2023 registered for CR/debt ceiling disruption
  • Retrain cadence: Monthly (1st of each month), only if MAPE change > 5%
  • Model fitness: Accuracy tracked via MAPE and RMSE; anomalies flagged automatically
Limitation: Spending forecasts reflect historical obligation patterns. They do not incorporate Congressional appropriations data, agency strategic plans, or classified budget documents. Forecasts are an input to confidence scoring — they are not independently actionable without FPDS signal context.
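
A minimal sketch of one (agency_code, NAICS_code) model using the prophet package, with the horizon, interval width, and disruption events described above; the obligation series and event windows are illustrative placeholders, not real FPDS data.

```python
import pandas as pd
from prophet import Prophet

# Illustrative monthly obligation series for one (agency_code, NAICS_code) pair.
obligations = pd.DataFrame({
    "ds": pd.date_range("2018-10-01", periods=72, freq="MS"),  # FY2019 onward, monthly
    "y": range(72),                                            # placeholder obligation totals
})

# COVID-era and CR/debt-ceiling disruptions registered as holiday events;
# the anchor dates and window lengths here are illustrative.
events = pd.DataFrame({
    "holiday": ["covid_anomaly", "cr_debt_ceiling"],
    "ds": pd.to_datetime(["2020-04-01", "2023-06-01"]),
    "lower_window": [0, 0],
    "upper_window": [365, 120],
})

model = Prophet(interval_width=0.80, holidays=events)        # 80% confidence interval
model.fit(obligations)
future = model.make_future_dataframe(periods=18, freq="MS")  # 18-month horizon
forecast = model.predict(future)[["ds", "yhat_lower", "yhat", "yhat_upper"]]
```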

What We Don't Use — and Why

CPARS Performance Ratings

Not publicly accessible. The Contractor Performance Assessment Reporting System is restricted to government contracting officials. Any platform claiming to incorporate CPARS data is misleading users. We explicitly document this limitation and use the proxy signals above instead.

Our approach: FPDS modification frequency and PoP extensions serve as observable behavioral proxies for performance. Frequent modifications and bridge extensions are FPDS-visible consequences of the same issues that CPARS would rate.

Agency Strategic Plans / Budget Projections

Non-public and inconsistently published. While some agencies publish acquisition forecasts, these are advisory and unreliable for quantitative modeling.

Our approach: A Prophet model fitted to historical FPDS obligations serves as the best available public signal for spending trajectory.

Proposal Win/Loss Data

Proprietary to each contractor. We have no mechanism to collect it and do not try. Win probability is not a number we compute — we compute vulnerability of the incumbent, which is a distinct and more reliable signal.

Our approach: No proxy — we do not produce win probability scores. Vulnerability score is the incumbent-side metric; opportunity priority is the opportunity-side metric.

Teaming/JV Structure of Competitors

Not consistently publicly disclosed before award. Post-award JV data is in FPDS but is unreliably structured.

Our approach: SAM.gov entity search for certification holders in the target NAICS/agency combination, cross-referenced with FPDS award history at that agency.

LLM Synthesis Layer

The Claude API is used as the final synthesis layer — it translates pre-computed analytical outputs into capture narratives. The LLM does not do analysis. It receives scored results and evidence from the layers below and narrates them in capture manager language.

If the underlying data has a gap (missing records, insufficient signal), the LLM is instructed to surface that gap explicitly — not fill it with generic advice. Capture managers who encounter generic advice in the output should report it as a bug.

How it works

Public federal data flows through our rule-based scoring engine, which produces scored signals with cited evidence. The LLM then translates those scores into actionable capture narratives — it never invents analysis or fills gaps with generic advice.
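
A minimal sketch of that hand-off using the Anthropic Python SDK; the model identifier, prompt wording, and payload shape are illustrative assumptions, not the production prompt.

```python
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Pre-computed scored signals and evidence from the layers below (illustrative payload).
scored_output = {
    "vulnerability_score": 72.5,
    "signals": [
        {"name": "pop_extensions", "value": 2, "points": 20.0,
         "evidence": "two PoP extension modifications in the FPDS record"},
        {"name": "prior_recompete_losses", "value": None, "points": 0.0,
         "evidence": "missing data"},
    ],
}

message = client.messages.create(
    model="claude-sonnet-4-20250514",   # assumed model identifier
    max_tokens=1024,
    system=("You are a capture analyst. Narrate ONLY the scored signals and evidence "
            "provided. If a signal is missing, state the gap explicitly. "
            "Do not add generic capture advice."),
    messages=[{"role": "user", "content": json.dumps(scored_output)}],
)
narrative = message.content[0].text
```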

Every recommendation in the output traces back to a specific scored signal and its underlying data source.