project schema documentation

published: October 16, 2025

overview

This document describes the JSON schema for data center projects used throughout the database. Each project represents a single data center facility or campus, documented with standardized fields for location, scale, timeline, participants, and sources.

schema versions

  • current version: 1.0 (October 2025)
  • file locations: /support/datacenters/data/{state}/projects.json
  • total projects: 604 across 50 states

top-level structure

each state file contains:

{
  "state": "string",
  "lastUpdated": "YYYY-MM-DD",
  "projects": [
    // array of project objects
  ]
}
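
A minimal Python sketch of reading one state file, assuming the three top-level keys shown above. The helper name `parse_state_file` and the sample values are hypothetical and not part of the database tooling.

```python
import json


def parse_state_file(text):
    """Parse the text of one state file and return (state, lastUpdated, projects)."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("state file must be a JSON object")
    # The three top-level keys described above.
    for key in ("state", "lastUpdated", "projects"):
        if key not in data:
            raise ValueError(f"missing top-level key: {key}")
    if not isinstance(data["projects"], list):
        raise ValueError("'projects' must be an array")
    return data["state"], data["lastUpdated"], data["projects"]


# Made-up sample content, not taken from the database.
sample = '{"state": "Virginia", "lastUpdated": "2025-10-16", "projects": []}'
state, last_updated, projects = parse_state_file(sample)
print(state, last_updated, len(projects))
```

In practice the text would come from a file under the path pattern listed in the schema versions section.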

project object schema

core identification fields

Field | Type | Required | Description
projectName | string | Yes | Unique identifier for the project. Format: "[Company] [Location] [Type]", or the official project name if available.
status | enum | Yes | Current project status. Values: "operational", "under-construction", "planned", "announced", "cancelled", "delayed", "on-hold"

example:

{
  "projectName": "AWS Louisa County Campus 1",
  "status": "under-construction"
}
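
One way to check the two core fields in Python is sketched below. The set of status values is copied from the table above; the function name `check_core_fields` is an illustrative assumption.

```python
VALID_STATUSES = {
    "operational", "under-construction", "planned", "announced",
    "cancelled", "delayed", "on-hold",
}


def check_core_fields(project):
    """Return a list of problems found in projectName and status."""
    problems = []
    name = project.get("projectName")
    if not isinstance(name, str) or not name.strip():
        problems.append("projectName must be a non-empty string")
    status = project.get("status")
    if not isinstance(status, str) or status not in VALID_STATUSES:
        problems.append(f"invalid status: {status!r}")
    return problems


print(check_core_fields({
    "projectName": "AWS Louisa County Campus 1",
    "status": "under-construction",
}))
```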

location object

Field | Type | Required | Description
location.city | string | Yes | City where facility is located. Use "Unknown" if not disclosed.
location.county | string | Yes | County/parish where facility is located. Use "Unknown" if not disclosed.
location.region | string | No | Geographic region within state (e.g., "Northern Virginia", "Central Texas")

example:

{
  "location": {
    "city": "Louisa",
    "county": "Louisa County",
    "region": "Central Virginia"
  }
}

participants

Field | Type | Required | Description
sponsors | array | Yes | Companies financing the project. Array of strings with official entity names.
operators | array | Yes | Companies operating the facility. Array of strings. May overlap with sponsors.
tenants | array | Yes | Companies leasing space in the facility. Empty array [] if owner-occupied.

example:

{
  "sponsors": ["Amazon Web Services"],
  "operators": ["Amazon Web Services"],
  "tenants": []
}

example with separation:

{
  "sponsors": ["OpenAI", "Oracle", "SoftBank"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"]
}

size object

Field | Type | Required | Description
size.investmentUSD | number | No | Total investment in US dollars. Use conservative estimate if range provided.
size.powerCapacityMW | number | No | Total power capacity in megawatts. Critical IT load, not total utility capacity.
size.totalSquareFeet | number | No | Total building square footage. Use gross area including support spaces.
size.notes | string | No | Additional context about sizing, phasing, or caveats

example:

{
  "size": {
    "investmentUSD": 11000000000,
    "powerCapacityMW": 1200,
    "totalSquareFeet": 4000000,
    "notes": "Part of AWS's $35 billion Virginia investment by 2040"
  }
}

validation rules:

  • if powerCapacityMW > 1000, verify source documentation
  • if investmentUSD > 10 billion, verify multiple sources
  • use conservative estimates when ranges provided
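
The rules above could be turned into review flags along these lines. This is a sketch only: the function names are hypothetical, and treating the lower bound of a range as the "conservative" estimate is an assumption, not something the schema states.

```python
def size_review_flags(size):
    """Return reminders for size values that the rules say need extra verification."""
    flags = []
    power = size.get("powerCapacityMW")
    investment = size.get("investmentUSD")
    if power is not None and power > 1000:
        flags.append("powerCapacityMW > 1000: verify source documentation")
    if investment is not None and investment > 10_000_000_000:
        flags.append("investmentUSD > 10 billion: verify multiple sources")
    return flags


def conservative_estimate(low, high):
    """Pick the value to record for a reported range (assumed: the lower bound)."""
    return min(low, high)


example_size = {
    "investmentUSD": 11000000000,
    "powerCapacityMW": 1200,
    "totalSquareFeet": 4000000,
}
for flag in size_review_flags(example_size):
    print(flag)
```

Both thresholds are strict comparisons, so a value of exactly 1000 MW or exactly 10 billion USD is not flagged.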

timeline object

Field | Type | Required | Description
timeline.announcedDate | string | No | Date project publicly announced. Format: "YYYY-MM-DD" or "YYYY-MM" or "YYYY"
timeline.constructionStartDate | string | No | Date construction began. Format: "YYYY-MM-DD" or "YYYY-MM" or "YYYY"
timeline.expectedCompletionDate | string | No | Expected completion date for planned/construction projects. Format: "YYYY-MM-DD" or "YYYY-MM" or "YYYY-QN"
timeline.actualCompletionDate | string | No | Actual completion date for operational facilities. Format: "YYYY-MM-DD" or "YYYY-MM" or "YYYY"
timeline.phases | array | No | Array of phase objects for multi-phase projects
timeline.notes | string | No | Additional timeline context, delays, or schedule changes

phase object structure:

{
  "phaseNumber": 1,
  "description": "First 300 MW substation providing 150 MW",
  "expectedCompletion": "2025-10"
}

example:

{
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  }
}
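
The date formats listed in the timeline table can be checked with regular expressions, as in the sketch below. The names are hypothetical. Only the four top-level date fields are checked; the format of a phase's `expectedCompletion` is not specified in the table, so this sketch leaves phases alone.

```python
import re

_DAY = r"[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"  # YYYY-MM-DD
_MONTH = r"[0-9]{4}-(0[1-9]|1[0-2])"                          # YYYY-MM
_YEAR = r"[0-9]{4}"                                           # YYYY
_QUARTER = r"[0-9]{4}-Q[1-4]"                                 # YYYY-QN

# Allowed formats per field, as listed in the timeline table.
ALLOWED_FORMATS = {
    "announcedDate": (_DAY, _MONTH, _YEAR),
    "constructionStartDate": (_DAY, _MONTH, _YEAR),
    "expectedCompletionDate": (_DAY, _MONTH, _QUARTER),
    "actualCompletionDate": (_DAY, _MONTH, _YEAR),
}


def bad_timeline_dates(timeline):
    """Return the names of date fields whose value matches none of the listed formats."""
    bad = []
    for field, patterns in ALLOWED_FORMATS.items():
        value = timeline.get(field)
        if value is None:
            continue  # all timeline fields are optional
        if not isinstance(value, str) or not any(re.fullmatch(p, value) for p in patterns):
            bad.append(field)
    return bad


print(bad_timeline_dates({
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
}))
```

The patterns check shape only. They do not reject impossible calendar dates such as "2025-02-31".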

purpose array

Field | Type | Required | Description
purpose | array | Yes | Array of purpose strings describing facility use cases

allowed values:

  • "hyperscale" - large-scale facilities (typically >100 MW)
  • "colocation" - multi-tenant retail colocation
  • "cloud" - public cloud infrastructure
  • "ai-ml" - AI/machine learning workloads
  • "enterprise" - single-tenant enterprise
  • "edge" - edge computing/CDN
  • "government" - government/defense
  • "wholesale" - wholesale data center space

example:

{
  "purpose": ["hyperscale", "ai-ml", "cloud"]
}
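
A short sketch of checking a purpose array against the allowed values listed above. The function name `invalid_purposes` is hypothetical, and "hpc" is used only as an example of a value that is not on the list.

```python
ALLOWED_PURPOSES = {
    "hyperscale", "colocation", "cloud", "ai-ml",
    "enterprise", "edge", "government", "wholesale",
}


def invalid_purposes(purpose):
    """Return the entries of a purpose array that are not allowed values."""
    return [p for p in purpose if not isinstance(p, str) or p not in ALLOWED_PURPOSES]


print(invalid_purposes(["hyperscale", "ai-ml", "cloud"]))
print(invalid_purposes(["hyperscale", "hpc"]))
```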

sustainability object

Field | Type | Required | Description
sustainability.renewableEnergy | boolean | No | Whether facility commits to renewable energy (true/false)
sustainability.waterCooling | boolean | No | Whether facility uses water-based cooling (true/false)
sustainability.certifications | array | No | Array of sustainability certifications (e.g., "LEED Gold", "Energy Star")
sustainability.notes | string | No | Additional sustainability information

example:

{
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": false,
    "certifications": ["LEED Gold"],
    "notes": "Runs on 100% renewable energy, enables 700 MW of solar projects across Virginia"
  }
}

sources array

Field | Type | Required | Description
sources | array | Yes | Array of source objects documenting project information

source object structure:

Field | Type | Required | Description
url | string | Yes | Full URL to source document
title | string | Yes | Article/document title
date | string | No | Publication date (YYYY-MM-DD format)
publisher | string | Yes | Publishing organization

example:

{
  "sources": [
    {
      "url": "https://www.datacenterfrontier.com/hyperscale/article/33010712/aws-plans-11b-investment",
      "title": "AWS Plans $11B Investment For 2 Data Center Campuses In Louisa County, VA by 2040",
      "date": "2024-03-15",
      "publisher": "Data Center Frontier"
    },
    {
      "url": "https://www.bisnow.com/national/news/data-center/amazon-continues-push-120569",
      "title": "Why Amazon Is Investing $11B In A Small, Rural Virginia County",
      "publisher": "Bisnow"
    }
  ]
}

validation rules:

  • minimum 1 source required per project
  • prefer 2+ sources for major claims
  • use official sources (company press releases, SEC filings) when available
  • include date for all news articles
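
The mechanical parts of these rules (minimum one source, required fields, date format) could be checked as sketched below. The function name is hypothetical, and interpreting "full URL" as an http or https URL is an assumption.

```python
import re
from urllib.parse import urlparse


def check_sources(sources):
    """Return a list of problems found in a project's sources array."""
    problems = []
    if len(sources) < 1:
        problems.append("at least one source is required")
    for i, src in enumerate(sources):
        # url, title, and publisher are marked required in the source object table.
        for key in ("url", "title", "publisher"):
            if not src.get(key):
                problems.append(f"sources[{i}]: missing {key}")
        url = src.get("url")
        # Assumption: a "full URL" means an http(s) URL.
        if isinstance(url, str) and url and urlparse(url).scheme not in ("http", "https"):
            problems.append(f"sources[{i}]: url should start with http:// or https://")
        date = src.get("date")
        if date is not None and not (
            isinstance(date, str) and re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", date)
        ):
            problems.append(f"sources[{i}]: date must be YYYY-MM-DD")
    return problems


print(check_sources([
    {
        "url": "https://www.bisnow.com/national/news/data-center/amazon-continues-push-120569",
        "title": "Why Amazon Is Investing $11B In A Small, Rural Virginia County",
        "publisher": "Bisnow",
    }
]))
```

Whether a source is a news article, and therefore should carry a date, still needs a human judgment, so the sketch does not enforce that rule.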

notes field

Field | Type | Required | Description
notes | string | No | Free-form notes providing additional context, analysis, or important caveats

example:

{
  "notes": "AWS is expanding beyond Northern Virginia into rural counties"
}

complete example

{
  "projectName": "Stargate Project - Abilene Campus (Oracle/Crusoe)",
  "location": {
    "city": "Abilene",
    "county": "Taylor County",
    "region": "West Texas"
  },
  "status": "operational",
  "sponsors": ["OpenAI", "Oracle", "SoftBank", "Crusoe Energy", "Lancium"],
  "operators": ["Oracle", "Crusoe Energy"],
  "tenants": ["OpenAI"],
  "size": {
    "totalSquareFeet": 4000000,
    "powerCapacityMW": 1200,
    "investmentUSD": 40000000000,
    "notes": "Part of $500 billion Stargate program; 8 buildings at full buildout"
  },
  "timeline": {
    "announcedDate": "2025-01",
    "constructionStartDate": "2024-06",
    "expectedCompletionDate": "2026-06",
    "phases": [
      {
        "phaseNumber": 1,
        "description": "2 buildings, 200+ MW",
        "expectedCompletion": "2025-H1"
      },
      {
        "phaseNumber": 2,
        "description": "6 buildings, 1000 MW",
        "expectedCompletion": "2026-06"
      }
    ]
  },
  "purpose": ["hyperscale", "ai-ml", "cloud"],
  "sustainability": {
    "renewableEnergy": true,
    "waterCooling": true,
    "certifications": [],
    "notes": "Zero-water evaporation cooling system; natural gas turbines for backup"
  },
  "sources": [
    {
      "url": "https://www.cnbc.com/2025/09/23/openai-first-data-center-stargate-project-texas.html",
      "title": "OpenAI's first data center in $500 billion Stargate project is open in Texas",
      "date": "2025-09-23",
      "publisher": "CNBC"
    },
    {
      "url": "https://www.datacenterdynamics.com/en/news/oracle-40bn-nvidia-chips-openai-texas/",
      "title": "Oracle to spend $40bn on Nvidia GPUs for OpenAI Texas data center",
      "publisher": "Data Center Dynamics"
    }
  ],
  "notes": "First operational Stargate facility; Oracle 15-year lease; designed for 100,000 GPUs on single network fabric"
}

field relationships

status dependencies

Status | Required Fields | Expected Fields
operational | location, sponsors, operators | actualCompletionDate, size metrics
under-construction | location, sponsors, constructionStartDate | expectedCompletionDate, size metrics
planned | location, sponsors | announcedDate, investmentUSD or powerCapacityMW
announced | location, sponsors, announcedDate | investmentUSD or powerCapacityMW
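
The required-fields column of the status table can be expressed as a lookup, as in this sketch. The names are hypothetical. The date fields are looked up under `timeline`, following the timeline object section, and statuses without a row in the table get no extra checks.

```python
# Required fields per status; dotted paths point into nested objects.
REQUIRED_BY_STATUS = {
    "operational": ("location", "sponsors", "operators"),
    "under-construction": ("location", "sponsors", "timeline.constructionStartDate"),
    "planned": ("location", "sponsors"),
    "announced": ("location", "sponsors", "timeline.announcedDate"),
}


def lookup(project, dotted_path):
    """Follow a dotted path such as 'timeline.announcedDate'; return None if absent."""
    node = project
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_for_status(project):
    """Return the status-dependent required fields that are absent or empty."""
    status = project.get("status")
    required = REQUIRED_BY_STATUS.get(status, ()) if isinstance(status, str) else ()
    return [path for path in required if lookup(project, path) in (None, "", [], {})]


# Made-up project used only to exercise the check.
print(missing_for_status({
    "status": "under-construction",
    "location": {"city": "Unknown", "county": "Unknown"},
    "sponsors": ["Example Sponsor"],
}))
```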

purpose combinations

common combinations:

  • ["hyperscale", "cloud"] - hyperscaler-owned cloud infrastructure
  • ["hyperscale", "ai-ml"] - AI-focused hyperscale facilities
  • ["colocation", "enterprise"] - multi-tenant enterprise colocation
  • ["colocation", "cloud"] - carrier-neutral interconnection hubs

validation checklist

when adding new projects, verify:

  • unique projectName within state
  • valid status enum value
  • location.city and location.county specified
  • at least one sponsor and one operator
  • at least one purpose value
  • at least one source with url, title, and publisher
  • date formats consistent (YYYY-MM-DD preferred)
  • investment values in full USD (not abbreviated)
  • power capacity represents IT load (not total utility)
  • notes field used for important context
  • multiple sources for extraordinary claims
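
A few checklist items are mechanical enough to automate. The sketch below covers name uniqueness within a state and the non-empty location and array fields. The function names are hypothetical, and the sample projects are made up.

```python
from collections import Counter


def duplicate_names(projects):
    """Return projectName values that appear more than once in one state's projects."""
    counts = Counter(p.get("projectName") for p in projects)
    return sorted(name for name, n in counts.items() if name is not None and n > 1)


def checklist_gaps(project):
    """Return checklist fields that are missing or empty in a single project."""
    gaps = []
    location = project.get("location") or {}
    for key in ("city", "county"):
        if not location.get(key):
            gaps.append(f"location.{key}")
    # Each of these arrays needs at least one entry according to the checklist.
    for key in ("sponsors", "operators", "purpose", "sources"):
        if not project.get(key):
            gaps.append(key)
    return gaps


projects = [
    {"projectName": "Example Campus A"},
    {"projectName": "Example Campus A"},
    {"projectName": "Example Campus B"},
]
print(duplicate_names(projects))
print(checklist_gaps(projects[0]))
```

Items such as "multiple sources for extraordinary claims" still call for editorial review and are left out of the sketch.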

common patterns

hyperscaler expansion

{
  "projectName": "Microsoft Leesburg Campus",
  "status": "under-construction",
  "sponsors": ["Microsoft"],
  "operators": ["Microsoft Azure"],
  "tenants": [],
  "purpose": ["hyperscale", "cloud"],
  "size": {
    "notes": "Part of Microsoft's $80B fiscal year investment"
  }
}

colocation facility

{
  "projectName": "Equinix DC12 Ashburn",
  "status": "operational",
  "sponsors": ["Equinix"],
  "operators": ["Equinix"],
  "tenants": [],
  "purpose": ["colocation"],
  "size": {
    "totalSquareFeet": 41000
  }
}

AI/ML specialized

{
  "projectName": "CoreWeave LAS1",
  "status": "operational",
  "sponsors": ["CoreWeave"],
  "operators": ["CoreWeave"],
  "tenants": [],
  "purpose": ["ai-ml", "cloud"],
  "size": {
    "powerCapacityMW": 14
  },
  "notes": "GPU-optimized for AI inference and training workloads"
}

schema evolution

version history

  • v1.0 (October 2025): initial schema definition
    • 604 projects documented
    • standardized field naming
    • comprehensive validation rules

planned enhancements

potential future additions:

  • contractors array for construction firms
  • powerSource object detailing utility/renewable/nuclear
  • coolingType enum for cooling technology
  • networkConnectivity object for fiber/peering
  • gpuCount for AI/ML facilities
  • rackCount and rackDensityKW for capacity detail

For questions or schema change proposals, see the main data center database documentation.
