project schema documentation
overview
This document describes the JSON schema for data center projects used throughout the database. Each project represents a single data center facility or campus, documented with standardized fields for location, scale, timeline, participants, and sources.
schema versions
- current version: 1.0 (October 2025)
- file locations: /support/datacenters/data/{state}/projects.json
- total projects: 604 across 50 states
top-level structure
each state file contains:
{
  "state": "string",
  "lastUpdated": "YYYY-MM-DD",
  "projects": [
    // array of project objects
  ]
}
project object schema
core identification fields
| Field | Type | Required | Description |
| projectName | string | Yes | Unique identifier for the project. Format: “[Company] [Location] [Type]” or official project name if available. |
| status | enum | Yes | Current project status. Values: “operational”, “under-construction”, “planned”, “announced”, “cancelled”, “delayed”, “on-hold” |
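The two core fields can be checked programmatically before a record is added. The following Python sketch is illustrative only: VALID_STATUSES mirrors the enum values in the table above, while the function name check_core_fields and the error wording are assumptions, not part of the schema.

```python
# Status values copied from the table above.
VALID_STATUSES = frozenset({
    "operational", "under-construction", "planned", "announced",
    "cancelled", "delayed", "on-hold",
})

def check_core_fields(project: dict) -> list[str]:
    """Return a list of problems with projectName and status (empty if none)."""
    errors = []
    name = project.get("projectName")
    if not isinstance(name, str) or not name.strip():
        errors.append("projectName must be a non-empty string")
    status = project.get("status")
    if not isinstance(status, str) or status not in VALID_STATUSES:
        errors.append("status must be one of: " + ", ".join(sorted(VALID_STATUSES)))
    return errors
```

An empty list means both fields passed; any other result lists what to fix.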
example:
{
  "projectName": "AWS Louisa County Campus 1",
  "status": "under-construction"
}
location object
| Field | Type | Required | Description |
| location.city | string | Yes | City where facility is located. Use “Unknown” if not disclosed. |
| location.county | string | Yes | County/parish where facility is located. Use “Unknown” if not disclosed. |
| location.region | string | No | Geographic region within state (e.g., “Northern Virginia”, “Central Texas”) |
example:
{
  "location": {
    "city": "Louisa",
    "county": "Louisa County",
    "region": "Central Virginia"
  }
}
participants
| Field | Type | Required | Description |
| sponsors | array | Yes | Companies financing the project. Array of strings with official entity names. |
| operators | array | Yes | Companies operating the facility. Array of strings. May overlap with sponsors. |
| tenants | array | Yes | Companies leasing space in the facility. Empty array [] if owner-occupied. |
example:
{
  "sponsors": ["Amazon Web Services"],
  "operators": ["Amazon Web Services"],
  "tenants": []
}
example with separation:
{
  "sponsors": ["OpenAI", "Oracle", "SoftBank"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"]
}
size object
| Field | Type | Required | Description |
| size.investmentUSD | number | No | Total investment in US dollars. Use conservative estimate if range provided. |
| size.powerCapacityMW | number | No | Power capacity in megawatts, measured as critical IT load rather than total utility capacity. |
| size.totalSquareFeet | number | No | Total building square footage. Use gross area including support spaces. |
| size.notes | string | No | Additional context about sizing, phasing, or caveats |
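Because investmentUSD is stored as a plain number, abbreviated figures quoted in sources (for example "$11B" or "$40bn") need to be expanded before entry. A minimal sketch of such a conversion follows. The helper name parse_usd and the set of supported suffixes (K, M, B/BN, T) are assumptions made for illustration and are not defined by the schema.

```python
import re
from decimal import Decimal

# Assumed suffix multipliers; extend as needed for other abbreviations.
_MULTIPLIERS = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "BN": 10**9, "T": 10**12}
_PATTERN = re.compile(r"\$?\s*([0-9]+(?:\.[0-9]+)?)\s*(BN|K|M|B|T)?", re.IGNORECASE)

def parse_usd(text: str) -> int:
    """Convert an abbreviated dollar figure such as "$11B" to whole US dollars."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized amount: {text!r}")
    number, suffix = match.groups()
    # Decimal avoids floating-point rounding on values like 1.5M.
    return int(Decimal(number) * _MULTIPLIERS[(suffix or "").upper()])
```

Inputs the pattern does not recognize, such as spelled-out amounts, raise ValueError rather than being guessed at.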
example:
{
  "size": {
    "investmentUSD": 11000000000,
    "powerCapacityMW": 1200,
    "totalSquareFeet": 4000000,
    "notes": "Part of AWS's $35 billion Virginia investment by 2040"
  }
}
validation rules:
- if powerCapacityMW > 1000, verify source documentation
- if investmentUSD > 10 billion, verify multiple sources
- use conservative estimates when ranges provided
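The two numeric thresholds in these rules can be turned into review flags. This is a sketch under stated assumptions: the function name size_review_flags and the flag wording are hypothetical, and a flag only signals that a human should re-check the sources, not that the value is wrong.

```python
def size_review_flags(size: dict) -> list[str]:
    """Return manual-review reminders for unusually large size values."""
    flags = []
    power = size.get("powerCapacityMW")
    investment = size.get("investmentUSD")
    # Thresholds taken from the validation rules above.
    if isinstance(power, (int, float)) and power > 1000:
        flags.append("powerCapacityMW > 1000: verify source documentation")
    if isinstance(investment, (int, float)) and investment > 10_000_000_000:
        flags.append("investmentUSD > 10 billion: verify multiple sources")
    return flags
```

Applied to the size example above (1200 MW, $11 billion), both flags are raised.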
timeline object
| Field | Type | Required | Description |
| timeline.announcedDate | string | No | Date project publicly announced. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
| timeline.constructionStartDate | string | No | Date construction began. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
| timeline.expectedCompletionDate | string | No | Expected completion date for planned/construction projects. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY-QN” |
| timeline.actualCompletionDate | string | No | Actual completion date for operational facilities. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
| timeline.phases | array | No | Array of phase objects for multi-phase projects |
| timeline.notes | string | No | Additional timeline context, delays, or schedule changes |
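The date formats listed in the table can be checked with a pair of regular expressions. The sketch below is an assumption-laden illustration: the function name is_valid_timeline_date and the allow_quarter flag are hypothetical, and for simplicity it accepts a bare "YYYY" for every field, which is slightly more permissive than the expectedCompletionDate row.

```python
import re
from datetime import date

_DATE_RE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")
_QUARTER_RE = re.compile(r"[0-9]{4}-Q[1-4]")

def is_valid_timeline_date(value, allow_quarter: bool = False) -> bool:
    """Accept YYYY, YYYY-MM, YYYY-MM-DD and, optionally, YYYY-QN."""
    if not isinstance(value, str):
        return False
    if allow_quarter and _QUARTER_RE.fullmatch(value):
        return True
    match = _DATE_RE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month/day default to 1 so the calendar check still runs.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True
```

Pass allow_quarter=True only for expectedCompletionDate. Values such as "2025-H1", which appear in phase-level expectedCompletion examples, are not covered by the formats in the table and are rejected by this sketch.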
phase object structure:
{
  "phaseNumber": 1,
  "description": "First 300 MW substation providing 150 MW",
  "expectedCompletion": "2025-10"
}
example:
{
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  }
}
purpose array
| Field | Type | Required | Description |
| purpose | array | Yes | Array of purpose strings describing facility use cases |
allowed values:
- "hyperscale" - large-scale facilities (typically >100 MW)
- "colocation" - multi-tenant retail colocation
- "cloud" - public cloud infrastructure
- "ai-ml" - AI/machine learning workloads
- "enterprise" - single-tenant enterprise
- "edge" - edge computing/CDN
- "government" - government/defense
- "wholesale" - wholesale data center space
example:
{
  "purpose": ["hyperscale", "ai-ml", "cloud"]
}
sustainability object
| Field | Type | Required | Description |
| sustainability.renewableEnergy | boolean | No | Whether facility commits to renewable energy (true/false) |
| sustainability.waterCooling | boolean | No | Whether facility uses water-based cooling (true/false) |
| sustainability.certifications | array | No | Array of sustainability certifications (e.g., “LEED Gold”, “Energy Star”) |
| sustainability.notes | string | No | Additional sustainability information |
example:
{
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": false,
    "certifications": ["LEED Gold"],
    "notes": "Runs on 100% renewable energy, enables 700 MW of solar projects across Virginia"
  }
}
sources array
| Field | Type | Required | Description |
| sources | array | Yes | Array of source objects documenting project information |
source object structure:
| Field | Type | Required | Description |
| url | string | Yes | Full URL to source document |
| title | string | Yes | Article/document title |
| date | string | No | Publication date (YYYY-MM-DD format) |
| publisher | string | Yes | Publishing organization |
example:
{
  "sources": [
    {
      "url": "https://www.datacenterfrontier.com/hyperscale/article/33010712/aws-plans-11b-investment",
      "title": "AWS Plans $11B Investment For 2 Data Center Campuses In Louisa County, VA by 2040",
      "date": "2024-03-15",
      "publisher": "Data Center Frontier"
    },
    {
      "url": "https://www.bisnow.com/national/news/data-center/amazon-continues-push-120569",
      "title": "Why Amazon Is Investing $11B In A Small, Rural Virginia County",
      "publisher": "Bisnow"
    }
  ]
}
validation rules:
- minimum 1 source required per project
- prefer 2+ sources for major claims
- use official sources (company press releases, SEC filings) when available
- include date for all news articles
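The required source fields and the one-source minimum can be checked mechanically. The sketch below is hedged accordingly: check_sources is a hypothetical name, and treating a "full URL" as an http or https address with a host is an assumption rather than something the schema states.

```python
from urllib.parse import urlparse

# Fields marked "Yes" in the source object table above.
REQUIRED_SOURCE_FIELDS = ("url", "title", "publisher")

def check_sources(sources: list) -> list[str]:
    """Return problems found in a project's sources array (empty if none)."""
    if not sources:
        return ["at least one source is required"]
    errors = []
    for index, source in enumerate(sources):
        for field in REQUIRED_SOURCE_FIELDS:
            if not source.get(field):
                errors.append(f"sources[{index}]: missing {field}")
        url = source.get("url")
        if url:
            parsed = urlparse(url)
            # Assumption: a "full URL" means http(s) with a host name.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                errors.append(f"sources[{index}]: url is not a full http(s) URL")
    return errors
```

The optional date field is deliberately not enforced here, since the schema marks it as not required.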
notes field
| Field | Type | Required | Description |
| notes | string | No | Free-form notes providing additional context, analysis, or important caveats |
example:
{
  "notes": "AWS is expanding beyond Northern Virginia into rural counties"
}
complete example
{
  "projectName": "Stargate Project - Abilene Campus (Oracle/Crusoe)",
  "location": {
    "city": "Abilene",
    "county": "Taylor County",
    "region": "West Texas"
  },
  "status": "operational",
  "sponsors": ["OpenAI", "Oracle", "SoftBank", "Crusoe Energy", "Lancium"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"],
  "size": {
    "totalSquareFeet": 4000000,
    "powerCapacityMW": 1200,
    "investmentUSD": 40000000000,
    "notes": "Part of $500 billion Stargate program; 8 buildings at full buildout"
  },
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  },
  "purpose": ["hyperscale", "ai-ml", "cloud"],
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": true,
    "certifications": [],
    "notes": "Zero-water evaporation cooling system; natural gas turbines for backup"
  },
  "sources": [
    {
      "url": "https://www.cnbc.com/2025/09/23/openai-first-data-center-stargate-project-texas.html",
      "title": "OpenAI's first data center in $500 billion Stargate project is open in Texas",
      "date": "2025-09-23",
      "publisher": "CNBC"
    },
    {
      "url": "https://www.datacenterdynamics.com/en/news/oracle-40bn-nvidia-chips-openai-texas/",
      "title": "Oracle to spend $40bn on Nvidia GPUs for OpenAI Texas data center",
      "publisher": "Data Center Dynamics"
    }
  ],
  "notes": "First operational Stargate facility; Oracle 15-year lease; designed for 100,000 GPUs on single network fabric"
}
field relationships
status dependencies
| Status | Required Fields | Expected Fields |
| operational | location, sponsors, operators | actualCompletionDate, size metrics |
| under-construction | location, sponsors, constructionStartDate | expectedCompletionDate, size metrics |
| planned | location, sponsors | announcedDate, investmentUSD or powerCapacityMW |
| announced | location, sponsors, announcedDate | investmentUSD or powerCapacityMW |
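The "Required Fields" column of this table can be encoded as a lookup and checked per project. This is a sketch, not a definitive rule set: the names STATUS_REQUIRED, get_path, and missing_for_status are hypothetical, and mapping constructionStartDate and announcedDate to paths under timeline is an assumption based on the timeline object section. Statuses not listed in the table are treated as having no extra requirements.

```python
# Required fields per status, as dotted paths into the project object.
STATUS_REQUIRED = {
    "operational": ["location", "sponsors", "operators"],
    "under-construction": ["location", "sponsors", "timeline.constructionStartDate"],
    "planned": ["location", "sponsors"],
    "announced": ["location", "sponsors", "timeline.announcedDate"],
}

def get_path(obj, dotted: str):
    """Follow a dotted path through nested dicts; return None if any key is absent."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj

def missing_for_status(project: dict) -> list[str]:
    """List required paths that are absent or empty for the project's status."""
    required = STATUS_REQUIRED.get(project.get("status"), [])
    return [path for path in required if get_path(project, path) in (None, "", [], {})]
```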
purpose combinations
common combinations:
- ["hyperscale", "cloud"] - hyperscaler-owned cloud infrastructure
- ["hyperscale", "ai-ml"] - AI-focused hyperscale facilities
- ["colocation", "enterprise"] - multi-tenant enterprise colocation
- ["colocation", "cloud", "interconnection"] - carrier-neutral interconnection hubs
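A purpose array can be compared against the allowed values from the purpose array section before it is committed. In this minimal sketch, ALLOWED_PURPOSES copies that list, and the helper name invalid_purposes is an assumption introduced for illustration.

```python
# Allowed values as listed in the purpose array section.
ALLOWED_PURPOSES = frozenset({
    "hyperscale", "colocation", "cloud", "ai-ml",
    "enterprise", "edge", "government", "wholesale",
})

def invalid_purposes(purpose: list) -> list:
    """Return entries that are not allowed purpose strings, in input order."""
    return [p for p in purpose if not isinstance(p, str) or p not in ALLOWED_PURPOSES]
```

For example, invalid_purposes(["colocation", "cloud", "interconnection"]) returns ["interconnection"], because that value does not appear in the allowed list.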
validation checklist
when adding new projects, verify:
- unique projectName within state
- valid status enum value
- location.city and location.county specified
- at least one sponsor and one operator
- at least one purpose value
- at least one source with url and publisher
- date formats consistent (YYYY-MM-DD preferred)
- investment values in full USD (not abbreviated)
- power capacity represents IT load (not total utility)
- notes field used for important context
- multiple sources for extraordinary claims
common patterns
hyperscaler expansion
{
  "projectName": "Microsoft Leesburg Campus",
  "status": "under-construction",
  "sponsors": ["Microsoft"],
  "operators": ["Microsoft Azure"],
  "tenants": [],
  "purpose": ["hyperscale", "cloud"],
  "size": {
    "notes": "Part of Microsoft's $80B fiscal year investment"
  }
}
colocation facility
{
  "projectName": "Equinix DC12 Ashburn",
  "status": "operational",
  "sponsors": ["Equinix"],
  "operators": ["Equinix"],
  "tenants": [],
  "purpose": ["colocation"],
  "size": {
    "totalSquareFeet": 41000
  }
}
ai/ml specialized
{
  "projectName": "CoreWeave LAS1",
  "status": "operational",
  "sponsors": ["CoreWeave"],
  "operators": ["CoreWeave"],
  "tenants": [],
  "purpose": ["ai-ml", "cloud"],
  "size": {
    "powerCapacityMW": 14
  },
  "notes": "GPU-optimized for AI inference and training workloads"
}
schema evolution
version history
- v1.0 (October 2025): initial schema definition
  - 604 projects documented
  - standardized field naming
  - comprehensive validation rules
planned enhancements
potential future additions:
- contractors array for construction firms
- powerSource object detailing utility/renewable/nuclear
- coolingType enum for cooling technology
- networkConnectivity object for fiber/peering
- gpuCount for AI/ML facilities
- rackCount and rackDensityKW for capacity detail
For questions or schema change proposals, refer to the main data center database documentation.