---
When you buy food, there's a nutrition label. When you buy a car, there's a window sticker with specifications. When you take medicine, there's a drug facts label. When you use an AI model, there is no equivalent — you don't know what data trained it, who trained it, what biases it has, what it costs to run, or what it will do with your inputs. We propose an "AI Model Bill of Materials" (AI-BOM): a standardized, machine-readable document that accompanies every AI model deployment, disclosing training data provenance, model architecture, known biases, energy costs, privacy properties, and operational parameters. We present BlackRoad OS's implementation (which uses exclusively open-weight models with published training details) as a reference architecture, compare to the opacity of closed models (GPT-4, Claude, Gemini), and argue that the EU AI Act's transparency requirements (Article 13) and the US Executive Order on AI (14110) create regulatory momentum toward mandatory AI-BOMs. We propose a specific schema (JSON-LD, 23 required fields, 15 optional fields) and demonstrate it with the 8 models deployed on the BlackRoad fleet.
| Product | Transparency | Label/Standard |
|---------|-------------|---------------|
| Food | Ingredients, calories, allergens | FDA Nutrition Facts (21 CFR 101) |
| Medicine | Active ingredients, dosage, side effects | FDA Drug Facts |
| Car | MPG, emissions, safety rating, price | EPA/NHTSA window sticker |
| Electronics | Power consumption, materials | EU Energy Label, RoHS |
| Software | Open source: full source code | OSI licenses |
| AI Model | Almost nothing | No standard |
| Question | Answer |
|----------|--------|
| What data was it trained on? | Unknown (OpenAI won't say) |
| How many parameters? | Unknown ("rumored" ~1.8T MoE) |
| What are the known biases? | Partially disclosed in model card |
| How much did training cost? | Unknown ($50-100M estimated) |
| How much energy did training use? | Unknown |
| What happens to my input? | Stored by OpenAI (privacy policy) |
| Can my input be used for retraining? | Depends on settings (opt-out available) |
| What safety measures exist? | RLHF, red-teaming (details limited) |
| Question | Answer |
|----------|--------|
| What data was it trained on? | Publicly documented (Meta's training data disclosure) |
| How many parameters? | 3.21 billion (exact, verifiable) |
| What are the known biases? | Documented in model card |
| How much did training cost? | Estimated ~$2-5M (based on compute disclosure) |
| How much energy did training use? | Documented in Meta's carbon report |
| What happens to my input? | On BlackRoad: stays on your hardware. Never leaves. |
| Can my input be used for retraining? | On BlackRoad: impossible (local inference, no telemetry) |
| What safety measures exist? | Documented: RLHF, safety fine-tuning, eval benchmarks |
The open-weight model is transparent by design. The closed model is opaque by business decision.
``json
{
"@context": "https://schema.blackroad.io/ai-bom/v1",
"@type": "AIModelBillOfMaterials",
"model": {
"name": "llama3.2:3b",
"version": "3.2",
"provider": "Meta AI",
"license": "Llama 3.2 Community License",
"parameters": 3210000000,
"architecture": "Transformer (decoder-only)",
"quantization": "Q4_K_M",
"fileSize": "2.0 GB",
"contextWindow": 131072,
"trainingCutoff": "2024-03"
},
"training": {
"dataDescription": "Publicly available web data, curated datasets",
"dataSize": "15T tokens",
"knownBiases": [
"English-centric (70% English training data)",
"Western cultural bias in value alignment",
"Tends to be verbose"
],
"safetyMeasures": [
"RLHF (Reinforcement Learning from Human Feedback)",
"Red-team evaluation (1000+ adversarial testers)",
"Safety fine-tuning on refusal dataset"
],
"energyUsed": "Estimated 500 MWh",
"carbonFootprint": "Estimated 200 tonnes CO2"
},
"deployment": {
"runtime": "Ollama 0.5.x",
"hardware": "Raspberry Pi 5 + Hailo-8 NPU",
"inferenceSpeed": "15-20 tokens/second",
"powerConsumption": "15W (system total)",
"dataRetention": "None — all inference is local, no telemetry",
"inputPrivacy": "Inputs never leave user's hardware",
"outputStorage": "PS-SHA∞ journal (user-owned, encrypted)"
},
"evaluation": {
"benchmarks": {
"MMLU": 0.63,
"HellaSwag": 0.78,
"ARC-Challenge": 0.55,
"TruthfulQA": 0.49,
"HumanEval": 0.31
},
"limitations": [
"Not suitable for medical, legal, or financial advice",
"May hallucinate facts — verify important claims",
"3B parameters is significantly less capable than 70B+ models",
"Limited multilingual capability outside English"
]
}
}
`
`json
{
"provenance": {
"trainingDataURLs": ["https://..."],
"modelCardURL": "https://huggingface.co/meta-llama/...",
"paperURL": "https://arxiv.org/abs/...",
"auditReports": [],
"thirdPartyEvaluations": []
},
"economics": {
"trainingCost": "$2-5M estimated",
"inferenceCostPerToken": "$0.000001 (electricity only)",
"inferenceCostPer1000Tokens": "$0.001",
"comparisonToCloud": "1000x cheaper than GPT-4 API"
},
"ethical": {
"contentPolicy": "Refuses harmful content generation",
"refusalCategories": ["violence", "CSAM", "weapons", "malware"],
"politicalBias": "Tends center-left on political topics (documented)",
"culturalBias": "Western, English-dominant",
"mitigation": "Users should verify outputs, especially for sensitive topics"
}
}
`
The AI-BOM is JSON-LD, making it:
in HTML headers`html`
| Model | Params | Quantization | AI-BOM? | Key Limitation |
|-------|--------|-------------|---------|---------------|
| llama3.2:3b | 3.2B | Q4_K_M | ✓ Full | Less capable than GPT-4 on complex reasoning |
| llama3.2:1b | 1.2B | Q4_K_M | ✓ Full | Very limited capability, best for simple tasks |
| phi3:mini | 3.8B | Q4_K_M | ✓ Full | Microsoft model, strong on code/reasoning |
| gemma2:2b | 2.6B | Q4_K_M | ✓ Full | Google model, competitive at small size |
| nomic-embed-text | 137M | fp16 | ✓ Full | Embedding only, no text generation |
| llava | 7B | Q4_K_M | ✓ Full | Vision model, slow on Pi hardware |
| codellama | 7B | Q4_K_M | ✓ Full | Code-specialized, limited general knowledge |
| mixtral:8x7b | 46.7B MoE | Q4_K_M | ✓ Full | Large, runs only on Gematria (VPS) |
Every model in the BlackRoad fleet has a complete AI-BOM. No model is used without disclosed provenance.
When a BlackRoad user interacts with the tutor, chat, or any AI-powered feature, they can:
1. See which model is responding (displayed in UI)
2. Click to view the full AI-BOM (JSON-LD document)
3. Verify the model is open-weight (license field)
4. Confirm data stays local (deployment.dataRetention field)
5. Understand limitations (evaluation.limitations array)
This is the digital equivalent of reading the nutrition label before eating.
Article 13 requires "high-risk AI systems" to provide:
An AI-BOM satisfies ALL of these requirements in a single machine-readable document.
The EO requires:
An AI-BOM provides a standardized format for this reporting.
| State | Law | AI-BOM Relevance |
|-------|-----|-----------------|
| California | SB 1047 (vetoed, reintroduced) | Would require safety evaluations — AI-BOM documents these |
| Colorado | AI Act (2024) | Requires disclosure of AI use — AI-BOM is the disclosure mechanism |
| Illinois | AI Video Interview Act | Requires disclosure when AI evaluates candidates — AI-BOM covers this |
| NYC | Local Law 144 | Requires bias audit for AI hiring tools — AI-BOM includes bias documentation |
An AI-BOM doesn't just satisfy regulations — it PREVENTS the need for adversarial compliance. Instead of regulators discovering what's in your AI through investigation, you publish it proactively. This is the transparency equivalent of anti-persuasion architecture: structural compliance rather than reactive compliance.
| AI-BOM Field | Open Model | Closed Model |
|-------------|-----------|-------------|
| Parameters | Exact number published | "Estimated" or "undisclosed" |
| Training data | Documented | Trade secret |
| Known biases | Community-tested, documented | Partially disclosed by provider |
| Training cost | Estimable from compute | Trade secret |
| Energy use | Documented | Usually undisclosed |
| Input privacy | Verifiable (local inference) | Policy promise (unverifiable) |
| Safety measures | Published methodology | "We did red-teaming" (details limited) |
A closed model's AI-BOM is inherently incomplete because the provider has business incentives to withhold information. An open model's AI-BOM can be verified by anyone.
When OpenAI says "GPT-4 is safe" — you trust their word.
When Meta says "Llama 3.2 is safe" — you can verify by reading the code, running the evals, and testing the model yourself.
The difference is not that open models are safer (they may or may not be). The difference is that open models are VERIFIABLE. And verifiability is the precondition for trust.
`bashGenerate AI-BOM from Ollama model info
ollama show llama3.2:3b --json > model-info.json
6.2 For Closed Model API Users
`json
{
"@context": "https://schema.blackroad.io/ai-bom/v1",
"model": {
"name": "gpt-4o",
"provider": "OpenAI",
"parameters": "undisclosed",
"trainingData": "undisclosed",
"license": "proprietary"
},
"deployment": {
"dataRetention": "per OpenAI data usage policy",
"inputPrivacy": "inputs sent to OpenAI servers"
},
"transparency": {
"score": 2,
"maxScore": 10,
"missingFields": ["parameters", "trainingData", "energyUsed", "trainingCost"],
"note": "This model is closed-source. AI-BOM is incomplete by design."
}
}
``The transparency score (2/10 for GPT-4o vs 9/10 for llama3.2:3b) gives users an at-a-glance measure of how much they know about the AI they're using.
The AI industry is 18 months away from mandatory transparency requirements (EU AI Act enforcement begins August 2026). Companies that build transparency infrastructure now will have a compliance advantage. Companies that resist will face the same scramble that GDPR created for data controllers in 2018.
The AI-BOM is not a burden — it is a competitive advantage for transparent providers and an accountability mechanism for opaque ones. It costs nothing to implement for open-weight models (the information is already public) and everything to implement for closed models (the information is intentionally hidden).
BlackRoad OS publishes complete AI-BOMs for every model in its fleet — not because regulation requires it (it doesn't, yet) but because the anti-persuasion architecture demands it: if you're going to use AI on behalf of a user, the user should know exactly what that AI is, what it can do, what it can't do, and what happens to their data.
The nutrition label didn't kill the food industry. It made it more honest. The AI-BOM won't kill the AI industry. It will make it more trustworthy — and trustworthiness, in a world where 90% of people don't trust AI with their data, is the most valuable property a platform can have.
[1] European Commission. "EU AI Act." EUR-Lex, 2024.
[2] The White House. "Executive Order 14110 on Safe, Secure, and Trustworthy AI." 2023.
[3] Mitchell, M. et al. "Model Cards for Model Reporting." FAT* 2019.
[4] Gebru, T. et al. "Datasheets for Datasets." Communications of the ACM, 2021.
[5] Meta AI. "Llama 3.2 Model Card." llama.meta.com, 2024.
[6] OpenAI. "GPT-4 Technical Report." arXiv:2303.08774, 2023.
[7] FDA. "Nutrition Labeling of Food." 21 CFR Part 101, 1990.
[8] NHTSA. "New Car Assessment Program." nhtsa.gov, 2024.
[9] Amundson, A.L. "Anti-Persuasion Architecture." BlackRoad OS Technical Report, 2026.
[10] Amundson, A.L. "Sovereign Edge Infrastructure." BlackRoad OS Technical Report, 2026.
Part of BlackRoad OS — sovereign AI on your hardware.