project schema documentation
overview
This document describes the JSON schema for data center projects used throughout the database. Each project represents a single data center facility or campus, documented with standardized fields for location, scale, timeline, participants, and sources.
schema versions
- current version: 1.0 (October 2025)
- file locations: /support/datacenters/data/{state}/projects.json (one file per state)
- total projects: 604 across 50 states
top-level structure
each state file contains:
{
  "state": "string",
  "lastUpdated": "YYYY-MM-DD",
  "projects": [
    // array of project objects
  ]
}
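As a minimal sketch, the top-level structure above could be checked in Python along these lines. The function name `check_state_file` and the sample values are hypothetical and not part of the schema; only the three keys and the YYYY-MM-DD format for `lastUpdated` come from this document.

```python
import json
import re
from datetime import datetime


def check_state_file(data: dict) -> list[str]:
    """Return a list of problems with the top-level structure (empty if none)."""
    problems = []
    if not isinstance(data.get("state"), str):
        problems.append("state must be a string")

    last_updated = data.get("lastUpdated")
    date_ok = isinstance(last_updated, str) and bool(
        re.fullmatch(r"\d{4}-\d{2}-\d{2}", last_updated)
    )
    if date_ok:
        try:
            # Reject well-shaped but impossible dates such as 2025-13-01.
            datetime.strptime(last_updated, "%Y-%m-%d")
        except ValueError:
            date_ok = False
    if not date_ok:
        problems.append("lastUpdated must be a YYYY-MM-DD date")

    if not isinstance(data.get("projects"), list):
        problems.append("projects must be an array")
    return problems


# Illustrative input only; the real files live at the path given above.
sample = json.loads('{"state": "Virginia", "lastUpdated": "2025-10-01", "projects": []}')
print(check_state_file(sample))  # prints []
```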
project object schema
core identification fields
Field | Type | Required | Description |
projectName | string | Yes | Unique identifier for the project. Format: “[Company] [Location] [Type]” or official project name if available. |
status | enum | Yes | Current project status. Values: “operational”, “under-construction”, “planned”, “announced”, “cancelled”, “delayed”, “on-hold” |
example:
{
  "projectName": "AWS Louisa County Campus 1",
  "status": "under-construction"
}
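A short sketch of how the two required identification fields might be checked, assuming Python. The set of statuses is copied from the table above; the function name `check_identification` is hypothetical.

```python
VALID_STATUSES = {
    "operational",
    "under-construction",
    "planned",
    "announced",
    "cancelled",
    "delayed",
    "on-hold",
}


def check_identification(project: dict) -> list[str]:
    """Check the required projectName and status fields of one project."""
    problems = []
    name = project.get("projectName")
    if not isinstance(name, str) or not name.strip():
        problems.append("projectName is required")
    status = project.get("status")
    if not isinstance(status, str) or status not in VALID_STATUSES:
        problems.append(f"invalid status: {status!r}")
    return problems


print(check_identification(
    {"projectName": "AWS Louisa County Campus 1", "status": "under-construction"}
))  # prints []
```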
location object
Field | Type | Required | Description |
location.city | string | Yes | City where facility is located. Use “Unknown” if not disclosed. |
location.county | string | Yes | County/parish where facility is located. Use “Unknown” if not disclosed. |
location.region | string | No | Geographic region within state (e.g., “Northern Virginia”, “Central Texas”) |
example:
{
  "location": {
    "city": "Louisa",
    "county": "Louisa County",
    "region": "Central Virginia"
  }
}
participants
Field | Type | Required | Description |
sponsors | array | Yes | Companies financing the project. Array of strings with official entity names. |
operators | array | Yes | Companies operating the facility. Array of strings. May overlap with sponsors. |
tenants | array | Yes | Companies leasing space in the facility. Empty array [] if owner-occupied. |
example:
{
  "sponsors": ["Amazon Web Services"],
  "operators": ["Amazon Web Services"],
  "tenants": []
}
example with separation:
{
  "sponsors": ["OpenAI", "Oracle", "SoftBank"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"]
}
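The participant rules above (three required arrays of strings, with an empty `tenants` array meaning owner-occupied) can be sketched as follows. Both function names are hypothetical helpers, not part of the schema.

```python
PARTICIPANT_FIELDS = ("sponsors", "operators", "tenants")


def check_participants(project: dict) -> list[str]:
    """Each participant field must be present and be an array of strings."""
    problems = []
    for field in PARTICIPANT_FIELDS:
        value = project.get(field)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{field} must be an array of strings")
    return problems


def is_owner_occupied(project: dict) -> bool:
    """An empty tenants array marks an owner-occupied facility."""
    return project.get("tenants") == []


aws = {
    "sponsors": ["Amazon Web Services"],
    "operators": ["Amazon Web Services"],
    "tenants": [],
}
print(check_participants(aws), is_owner_occupied(aws))  # prints [] True
```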
size object
Field | Type | Required | Description |
size.investmentUSD | number | No | Total investment in US dollars. Use conservative estimate if range provided. |
size.powerCapacityMW | number | No | Total power capacity in megawatts. Critical IT load, not total utility capacity. |
size.totalSquareFeet | number | No | Total building square footage. Use gross area including support spaces. |
size.notes | string | No | Additional context about sizing, phasing, or caveats |
example:
{
  "size": {
    "investmentUSD": 11000000000,
    "powerCapacityMW": 1200,
    "totalSquareFeet": 4000000,
    "notes": "Part of AWS's $35 billion Virginia investment by 2040"
  }
}
validation rules:
- if powerCapacityMW > 1000, verify source documentation
- if investmentUSD > 10 billion, verify multiple sources
- use conservative estimates when ranges provided
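The size validation rules can be sketched as a function that flags entries for manual review. The thresholds come from the rules above; the function names are hypothetical, and reading "conservative estimate" as the lower bound of a range is an assumption of this sketch.

```python
POWER_REVIEW_MW = 1000
INVESTMENT_REVIEW_USD = 10_000_000_000  # 10 billion, in full USD


def size_review_flags(size: dict) -> list[str]:
    """Return review reminders for unusually large size values."""
    flags = []
    power = size.get("powerCapacityMW")
    investment = size.get("investmentUSD")
    if isinstance(power, (int, float)) and power > POWER_REVIEW_MW:
        flags.append("powerCapacityMW > 1000: verify source documentation")
    if isinstance(investment, (int, float)) and investment > INVESTMENT_REVIEW_USD:
        flags.append("investmentUSD > 10 billion: verify multiple sources")
    return flags


def conservative_estimate(low: float, high: float) -> float:
    """Assumption: the conservative value of a reported range is its lower bound."""
    return min(low, high)


example_size = {"investmentUSD": 11000000000, "powerCapacityMW": 1200}
for flag in size_review_flags(example_size):
    print(flag)
```

With the example values above (1200 MW and $11 billion), both reminders are produced.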
timeline object
Field | Type | Required | Description |
timeline.announcedDate | string | No | Date project publicly announced. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
timeline.constructionStartDate | string | No | Date construction began. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
timeline.expectedCompletionDate | string | No | Expected completion date for planned/construction projects. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY-QN” |
timeline.actualCompletionDate | string | No | Actual completion date for operational facilities. Format: “YYYY-MM-DD” or “YYYY-MM” or “YYYY” |
timeline.phases | array | No | Array of phase objects for multi-phase projects |
timeline.notes | string | No | Additional timeline context, delays, or schedule changes |
phase object structure:
{
  "phaseNumber": 1,
  "description": "First 300 MW substation providing 150 MW",
  "expectedCompletion": "2025-10"
}
example:
{
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  }
}
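The date formats listed in the timeline table can be checked with regular expressions. This is a sketch only: the function name is hypothetical, and it assumes that the N in "YYYY-QN" is a quarter from 1 to 4. The per-field format lists follow the table exactly, so a bare year is not accepted for `expectedCompletionDate`.

```python
import re

FULL_DATE = r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"  # YYYY-MM-DD
YEAR_MONTH = r"\d{4}-(0[1-9]|1[0-2])"                        # YYYY-MM
YEAR = r"\d{4}"                                              # YYYY
QUARTER = r"\d{4}-Q[1-4]"                                    # YYYY-QN (assumed N = 1..4)

ALLOWED_FORMATS = {
    "announcedDate": (FULL_DATE, YEAR_MONTH, YEAR),
    "constructionStartDate": (FULL_DATE, YEAR_MONTH, YEAR),
    "expectedCompletionDate": (FULL_DATE, YEAR_MONTH, QUARTER),
    "actualCompletionDate": (FULL_DATE, YEAR_MONTH, YEAR),
}


def is_valid_timeline_date(field: str, value) -> bool:
    """True if value matches one of the formats documented for this timeline field."""
    patterns = ALLOWED_FORMATS.get(field, ())
    return isinstance(value, str) and any(re.fullmatch(p, value) for p in patterns)


print(is_valid_timeline_date("announcedDate", "2025-01"))           # prints True
print(is_valid_timeline_date("expectedCompletionDate", "2026-Q2"))  # prints True
print(is_valid_timeline_date("announcedDate", "2026-Q2"))           # prints False
```

The phase field `expectedCompletion` has no documented format, and the phase examples use values such as "2025-H1" that match none of the patterns above, so this sketch leaves phase dates unchecked.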
purpose array
Field | Type | Required | Description |
purpose | array | Yes | Array of purpose strings describing facility use cases |
allowed values:
- "hyperscale" - large-scale facilities (typically >100 MW)
- "colocation" - multi-tenant retail colocation
- "cloud" - public cloud infrastructure
- "ai-ml" - AI/machine learning workloads
- "enterprise" - single-tenant enterprise
- "edge" - edge computing/CDN
- "government" - government/defense
- "wholesale" - wholesale data center space
example:
{
  "purpose": ["hyperscale", "ai-ml", "cloud"]
}
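A small sketch of checking a `purpose` array against the allowed values listed above. The function name `invalid_purposes` is hypothetical.

```python
ALLOWED_PURPOSES = {
    "hyperscale",
    "colocation",
    "cloud",
    "ai-ml",
    "enterprise",
    "edge",
    "government",
    "wholesale",
}


def invalid_purposes(purpose: list) -> list:
    """Return the entries that are not allowed purpose values, in input order."""
    return [p for p in purpose if not isinstance(p, str) or p not in ALLOWED_PURPOSES]


print(invalid_purposes(["hyperscale", "ai-ml", "cloud"]))  # prints []
print(invalid_purposes(["colocation", "storage"]))         # prints ['storage']
```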
sustainability object
Field | Type | Required | Description |
sustainability.renewableEnergy | boolean | No | Whether facility commits to renewable energy (true/false) |
sustainability.waterCooling | boolean | No | Whether facility uses water-based cooling (true/false) |
sustainability.certifications | array | No | Array of sustainability certifications (e.g., “LEED Gold”, “Energy Star”) |
sustainability.notes | string | No | Additional sustainability information |
example:
{
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": false,
    "certifications": ["LEED Gold"],
    "notes": "Runs on 100% renewable energy, enables 700 MW of solar projects across Virginia"
  }
}
sources array
Field | Type | Required | Description |
sources | array | Yes | Array of source objects documenting project information |
source object structure:
Field | Type | Required | Description |
url | string | Yes | Full URL to source document |
title | string | Yes | Article/document title |
date | string | No | Publication date (YYYY-MM-DD format) |
publisher | string | Yes | Publishing organization |
example:
{
  "sources": [
    {
      "url": "https://www.datacenterfrontier.com/hyperscale/article/33010712/aws-plans-11b-investment",
      "title": "AWS Plans $11B Investment For 2 Data Center Campuses In Louisa County, VA by 2040",
      "date": "2024-03-15",
      "publisher": "Data Center Frontier"
    },
    {
      "url": "https://www.bisnow.com/national/news/data-center/amazon-continues-push-120569",
      "title": "Why Amazon Is Investing $11B In A Small, Rural Virginia County",
      "publisher": "Bisnow"
    }
  ]
}
validation rules:
- minimum 1 source required per project
- prefer 2+ sources for major claims
- use official sources (company press releases, SEC filings) when available
- include date for all news articles
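The machine-checkable parts of these rules (at least one source, required fields present, YYYY-MM-DD dates) can be sketched as below. The function name is hypothetical. The sketch does not try to enforce "include date for all news articles", since the schema has no field that marks a source as a news article.

```python
import re

REQUIRED_SOURCE_FIELDS = ("url", "title", "publisher")


def check_sources(sources) -> list[str]:
    """Check the sources array of one project and return any problems found."""
    sources = sources or []
    problems = []
    if not sources:
        problems.append("at least 1 source is required")
    for i, source in enumerate(sources):
        for field in REQUIRED_SOURCE_FIELDS:
            if not source.get(field):
                problems.append(f"sources[{i}] is missing {field}")
        date = source.get("date")
        # date is optional, but when present it should be YYYY-MM-DD
        if date is not None and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(date)):
            problems.append(f"sources[{i}].date should use YYYY-MM-DD")
    return problems


print(check_sources([{"url": "https://example.com/a", "title": "Example"}]))
# prints ['sources[0] is missing publisher']
```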
notes field
Field | Type | Required | Description |
notes | string | No | Free-form notes providing additional context, analysis, or important caveats |
example:
{
  "notes": "AWS is expanding beyond Northern Virginia into rural counties"
}
complete example
{
  "projectName": "Stargate Project - Abilene Campus (Oracle/Crusoe)",
  "location": {
    "city": "Abilene",
    "county": "Taylor County",
    "region": "West Texas"
  },
  "status": "operational",
  "sponsors": ["OpenAI", "Oracle", "SoftBank", "Crusoe Energy", "Lancium"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"],
  "size": {
    "totalSquareFeet": 4000000,
    "powerCapacityMW": 1200,
    "investmentUSD": 40000000000,
    "notes": "Part of $500 billion Stargate program; 8 buildings at full buildout"
  },
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  },
  "purpose": ["hyperscale", "ai-ml", "cloud"],
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": true,
    "certifications": [],
    "notes": "Zero-water evaporation cooling system; natural gas turbines for backup"
  },
  "sources": [
    {
      "url": "https://www.cnbc.com/2025/09/23/openai-first-data-center-stargate-project-texas.html",
      "title": "OpenAI's first data center in $500 billion Stargate project is open in Texas",
      "date": "2025-09-23",
      "publisher": "CNBC"
    },
    {
      "url": "https://www.datacenterdynamics.com/en/news/oracle-40bn-nvidia-chips-openai-texas/",
      "title": "Oracle to spend $40bn on Nvidia GPUs for OpenAI Texas data center",
      "publisher": "Data Center Dynamics"
    }
  ],
  "notes": "First operational Stargate facility; Oracle 15-year lease; designed for 100,000 GPUs on single network fabric"
}
field relationships
status dependencies
Status | Required Fields | Expected Fields |
operational | location, sponsors, operators | actualCompletionDate, size metrics |
under-construction | location, sponsors, constructionStartDate | expectedCompletionDate, size metrics |
planned | location, sponsors | announcedDate, investmentUSD or powerCapacityMW |
announced | location, sponsors, announcedDate | investmentUSD or powerCapacityMW |
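The "Required Fields" column of this table can be expressed as a lookup keyed by status. The sketch below assumes that `constructionStartDate` and `announcedDate` live under the `timeline` object, as the timeline table documents them; the helper names are hypothetical, and the "Expected Fields" column is not checked.

```python
REQUIRED_BY_STATUS = {
    "operational": ["location", "sponsors", "operators"],
    "under-construction": ["location", "sponsors", "timeline.constructionStartDate"],
    "planned": ["location", "sponsors"],
    "announced": ["location", "sponsors", "timeline.announcedDate"],
}


def get_path(obj, dotted: str):
    """Follow a dotted path such as 'timeline.announcedDate'; None if any key is absent."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def missing_for_status(project: dict) -> list[str]:
    """List the status-dependent required fields that are absent or empty."""
    status = project.get("status")
    required = REQUIRED_BY_STATUS.get(status, []) if isinstance(status, str) else []
    return [path for path in required if get_path(project, path) in (None, "", [], {})]


project = {
    "status": "under-construction",
    "location": {"city": "Louisa", "county": "Louisa County"},
    "sponsors": ["Amazon Web Services"],
}
print(missing_for_status(project))  # prints ['timeline.constructionStartDate']
```

Statuses without a row in the table ("cancelled", "delayed", "on-hold") fall through with no status-specific requirements in this sketch.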
purpose combinations
common combinations:
- ["hyperscale", "cloud"] - hyperscaler-owned cloud infrastructure
- ["hyperscale", "ai-ml"] - AI-focused hyperscale facilities
- ["colocation", "enterprise"] - multi-tenant enterprise colocation
- ["colocation", "cloud", "interconnection"] - carrier-neutral interconnection hubs (note: "interconnection" does not appear in the allowed values for purpose)
validation checklist
when adding new projects, verify:
- unique projectName within state
- valid status enum value
- location.city and location.county specified
- at least one sponsor and one operator
- at least one purpose value
- at least one source with url and publisher
- date formats consistent (YYYY-MM-DD preferred)
- investment values in full USD (not abbreviated)
- power capacity represents IT load (not total utility)
- notes field used for important context
- multiple sources for extraordinary claims
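Two of the checklist items lend themselves to a quick automated pass: uniqueness of `projectName` within a state file, and the "at least one" requirements. A minimal sketch, with hypothetical function names:

```python
from collections import Counter

AT_LEAST_ONE = ("sponsors", "operators", "purpose", "sources")


def duplicate_project_names(projects: list) -> list[str]:
    """Return project names that occur more than once in one state's projects array."""
    names = [
        p.get("projectName")
        for p in projects
        if isinstance(p.get("projectName"), str)
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)


def missing_minimums(project: dict) -> list[str]:
    """Return the array fields that are absent or empty but need at least one entry."""
    return [field for field in AT_LEAST_ONE if not project.get(field)]


projects = [
    {"projectName": "Equinix DC12 Ashburn"},
    {"projectName": "CoreWeave LAS1"},
    {"projectName": "Equinix DC12 Ashburn"},
]
print(duplicate_project_names(projects))  # prints ['Equinix DC12 Ashburn']
```

The remaining items (IT load versus utility capacity, use of notes, extraordinary claims) depend on reading the sources and are left to manual review.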
common patterns
hyperscaler expansion
{
  "projectName": "Microsoft Leesburg Campus",
  "status": "under-construction",
  "sponsors": ["Microsoft"],
  "operators": ["Microsoft Azure"],
  "tenants": [],
  "purpose": ["hyperscale", "cloud"],
  "size": {
    "notes": "Part of Microsoft's $80B fiscal year investment"
  }
}
colocation facility
{
  "projectName": "Equinix DC12 Ashburn",
  "status": "operational",
  "sponsors": ["Equinix"],
  "operators": ["Equinix"],
  "tenants": [],
  "purpose": ["colocation"],
  "size": {
    "totalSquareFeet": 41000
  }
}
ai/ml specialized
{
  "projectName": "CoreWeave LAS1",
  "status": "operational",
  "sponsors": ["CoreWeave"],
  "operators": ["CoreWeave"],
  "tenants": [],
  "purpose": ["ai-ml", "cloud"],
  "size": {
    "powerCapacityMW": 14
  },
  "notes": "GPU-optimized for AI inference and training workloads"
}
schema evolution
version history
- v1.0 (October 2025): initial schema definition
  - 604 projects documented
  - standardized field naming
  - comprehensive validation rules
planned enhancements
potential future additions:
- contractors array for construction firms
- powerSource object detailing utility/renewable/nuclear
- coolingType enum for cooling technology
- networkConnectivity object for fiber/peering
- gpuCount for AI/ML facilities
- rackCount and rackDensityKW for capacity detail
For questions or schema change proposals, refer to the main data center database documentation.