entity dossier schema documentation
overview
This document describes the entity dossier JSON schema used to profile companies and organizations in the data center industry. Entity dossiers provide in-depth profiles of hyperscalers, operators, financial sponsors, utilities, construction firms, and technology vendors.
schema versions
- current version: 1.0 (October 2025)
- file locations: /support/datacenters/entities/{category}/{entity_name}.json
- total entities: 37
- schema definition: /support/datacenters/entities/ENTITY_DOSSIER_SCHEMA.json
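The file location pattern above implies that a dossier's directory should agree with its entityType. A minimal sketch of that check follows, assuming the {category} directory uses the same string as the entityType value; the function name and the microsoft.json file name are hypothetical.

```python
from pathlib import PurePosixPath


def directory_matches_entity_type(path: str, dossier: dict) -> bool:
    """Return True when the dossier's parent directory equals its entityType.

    Assumption: {category} in the path is spelled exactly like entityType.
    """
    category = PurePosixPath(path).parent.name
    return category == dossier.get("entityType")


# Hypothetical file name, used for illustration only.
example_path = "/support/datacenters/entities/hyperscaler/microsoft.json"
print(directory_matches_entity_type(example_path, {"entityType": "hyperscaler"}))
```

If the repository uses a different directory naming scheme (for example plural names), the comparison would need a lookup table instead of a direct equality test.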
entity types
Entity Type | Description | Count |
hyperscaler | Large cloud providers operating own infrastructure (AWS, Microsoft, Google, Meta, Oracle, OpenAI, xAI) | 7 |
operator | Data center operators providing colocation, wholesale, or managed services | 15 |
financial-sponsor | Private equity, sovereign wealth, infrastructure funds investing in data centers | 10 |
utility | Electric utilities serving data center power demands | 3 |
construction | General contractors specializing in data center construction | 1 |
technology-vendor | Technology providers (networking, servers, cooling, power) | 1 |
top-level structure
required fields
Field | Type | Description |
entityType | enum | Primary category: "hyperscaler", "operator", "financial-sponsor", "utility", "construction", "technology-vendor" |
entityName | string | Official company/entity name |
overview | object | Basic company information |
sources | array | Array of source objects documenting dossier information |
lastUpdated | string | Date of last dossier update (YYYY-MM-DD format) |
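The required fields listed above can be checked before a dossier is committed. The sketch below is one possible approach, not the project's validator: the function name is hypothetical, and it only tests for key presence and a known entityType.

```python
REQUIRED_FIELDS = ("entityType", "entityName", "overview", "sources", "lastUpdated")
ENTITY_TYPES = {
    "hyperscaler", "operator", "financial-sponsor",
    "utility", "construction", "technology-vendor",
}


def check_required_fields(dossier: dict) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in dossier
    ]
    entity_type = dossier.get("entityType")
    if entity_type is not None and entity_type not in ENTITY_TYPES:
        problems.append(f"unknown entityType: {entity_type!r}")
    return problems


# A deliberately incomplete dossier for illustration.
print(check_required_fields({"entityType": "hyperscaler", "entityName": "Microsoft"}))
```

For full validation, running the dossier against ENTITY_DOSSIER_SCHEMA.json with a JSON Schema validator would be more thorough than this presence check.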
optional fields
Field | Type | Description |
aliases | array | Alternative names, abbreviations, former names |
dataCenterProfile | object | Data center-specific operations and portfolio information |
strategy | object | Corporate strategy, growth plans, power strategy |
financials | object | Financial metrics and performance |
ownership | object | Ownership structure and acquisition history |
partnerships | object | Key partnerships across power, construction, technology, finance |
keyExecutives | array | Executive leadership profiles |
competitiveAnalysis | object | Market position, SWOT analysis, competitive differentiation |
timeline | array | Chronological history of major events |
projectReferences | array | References to projects in main database |
mediaPresence | object | Recent news and social media links |
notes | string | Additional context, analysis, or observations |
overview object
Field | Type | Description |
overview.founded | string | Founding year or date (YYYY or YYYY-MM-DD) |
overview.headquarters | object | HQ location with city, state, country |
overview.publicCompany | boolean | Whether company is publicly traded |
overview.ticker | string | Stock ticker symbol |
overview.exchange | string | Stock exchange (e.g., "NASDAQ", "NYSE") |
overview.marketCapUSD | number | Market capitalization in USD |
overview.employeesTotal | number | Total employee count |
overview.website | string | Official company website URL |
overview.description | string | 2-3 paragraph company description |
overview.businessModel | string | Detailed business model explanation |
example:
{
  "overview": {
    "founded": "1975",
    "headquarters": {
      "city": "Redmond",
      "state": "Washington",
      "country": "United States"
    },
    "publicCompany": true,
    "ticker": "MSFT",
    "exchange": "NASDAQ",
    "marketCapUSD": 3800000000000,
    "employeesTotal": 228000,
    "website": "https://www.microsoft.com",
    "description": "Microsoft Corporation is a multinational technology company...",
    "businessModel": "Microsoft operates a diversified business model..."
  }
}
data center profile object
global footprint
Field | Type | Description |
dataCenterProfile.globalFootprint.totalDataCenters | number | Total number of data centers globally |
dataCenterProfile.globalFootprint.totalCapacityMW | number | Total global power capacity in MW |
dataCenterProfile.globalFootprint.totalSquareFeet | number | Total global square footage |
dataCenterProfile.globalFootprint.countries | number | Number of countries with presence |
dataCenterProfile.globalFootprint.regions | array | Array of geographic regions |
US footprint
Field | Type | Description |
dataCenterProfile.usFootprint.projectsInDatabase | number | Number of projects in this database |
dataCenterProfile.usFootprint.states | array | Array of US states with presence |
dataCenterProfile.usFootprint.totalInvestmentUSD | number | Total US investment disclosed |
dataCenterProfile.usFootprint.totalCapacityMW | number | Total US power capacity |
dataCenterProfile.usFootprint.majorLocations | array | Array of major location objects |
major location object:
{
  "location": "Northern Virginia",
  "capacityMW": 150,
  "status": "Operational",
  "significance": "Microsoft's largest US data center region"
}
specialization
Field | Type | Description |
dataCenterProfile.specialization.primaryFocus | array | Array of focus areas: "hyperscale", "colocation", "enterprise", "edge", "ai-ml", "cloud", "wholesale", "retail" |
dataCenterProfile.specialization.targetCustomers | array | Array of target customer types/segments |
dataCenterProfile.specialization.differentiators | array | Array of competitive differentiators |
dataCenterProfile.specialization.technologyFocus | array | Array of technology focus areas |
strategy object
Field | Type | Description |
strategy.corporateStrategy | string | Overall corporate strategy narrative |
strategy.growthStrategy | string | Growth and expansion strategy |
strategy.powerStrategy | object | Power sourcing and sustainability strategy |
strategy.geographicStrategy | string | Geographic expansion priorities |
strategy.mAndAStrategy | string | Mergers and acquisitions approach |
strategy.publicCommitments | array | Array of public commitment objects |
power strategy object
Field | Type | Description |
strategy.powerStrategy.approach | string | Overall power sourcing approach |
strategy.powerStrategy.renewableCommitment | string | Renewable energy commitments and targets |
strategy.powerStrategy.nuclearPartnerships | array | Array of nuclear partnership descriptions |
strategy.powerStrategy.gridStrategy | string | Grid partnership and utility strategy |
public commitment object
{
  "announcement": "Fiscal Year 2025 AI Data Center Investment",
  "date": "2025-01-03",
  "valueUSD": 80000000000,
  "scope": "AI-enabled data center construction globally",
  "timeline": "Through June 30, 2025"
}
financials object
Field | Type | Description |
financials.fiscalYear | number | Fiscal year for reported metrics |
financials.revenueUSD | number | Annual revenue in USD |
financials.ebitdaUSD | number | EBITDA in USD |
financials.netIncomeUSD | number | Net income in USD |
financials.totalAssetsUSD | number | Total assets in USD |
financials.totalDebtUSD | number | Total debt in USD |
financials.capitalExpenditureUSD | number | Annual capex in USD |
financials.dataCenterSpecific | object | Data center-specific financial metrics |
financials.growthMetrics | object | Year-over-year growth rates |
partnerships object
power providers
Field | Type | Description |
partnerships.powerProviders | array | Array of power partnership objects |
power partnership object:
{
  "partner": "Constellation Energy",
  "type": "nuclear",
  "capacityMW": 835,
  "details": "20-year power purchase agreement for restart of Three Mile Island Unit 1",
  "announcementDate": "2024-09-20"
}
type enum: "nuclear", "renewable", "utility", "microgrid"
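Because type is restricted to four values, power partnerships can be grouped by type to see how contracted capacity is distributed. The following is a rough sketch; the function name is hypothetical, and treating a missing capacityMW as zero is an assumption rather than something the schema states.

```python
from collections import defaultdict

POWER_TYPES = {"nuclear", "renewable", "utility", "microgrid"}


def capacity_by_power_type(power_providers: list[dict]) -> dict[str, float]:
    """Sum capacityMW per partnership type, rejecting types outside the enum."""
    totals: dict[str, float] = defaultdict(float)
    for entry in power_providers:
        kind = entry.get("type")
        if kind not in POWER_TYPES:
            raise ValueError(f"unknown power partnership type: {kind!r}")
        # Assumption: entries without capacityMW contribute nothing.
        totals[kind] += entry.get("capacityMW", 0)
    return dict(totals)


providers = [
    {"partner": "Constellation Energy", "type": "nuclear", "capacityMW": 835},
]
print(capacity_by_power_type(providers))
```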
construction partners
{
  "contractor": "DPR Construction",
  "relationship": "Preferred contractor for major U.S. projects",
  "projects": ["Equinix DA11 Dallas", "Equinix Ashburn North Campus"]
}
technology vendors
{
  "vendor": "NVIDIA",
  "category": "AI Hardware / GPUs",
  "details": "Primary GPU supplier for Azure AI infrastructure"
}
financial partnerships
{
  "partner": "Brookfield Asset Management",
  "type": "Renewable Energy Investment Partnership",
  "valueUSD": 10000000000,
  "details": "Joint investment in 10.5 GW renewable energy capacity"
}
key executives array
Field | Type | Description |
keyExecutives[].name | string | Executive full name |
keyExecutives[].title | string | Current title |
keyExecutives[].role | string | Role description (CEO, CFO, CTO, etc.) |
keyExecutives[].startDate | string | Date started in role (YYYY-MM-DD) |
keyExecutives[].background | string | Professional background narrative |
keyExecutives[].linkedin | string | LinkedIn profile URL |
keyExecutives[].previousRoles | array | Array of previous position descriptions |
keyExecutives[].education | string | Educational background |
keyExecutives[].significance | string | Why this executive is significant |
competitive analysis object
Field | Type | Description |
competitiveAnalysis.marketPosition | string | Overall market position narrative |
competitiveAnalysis.marketSharePercent | number | Market share percentage |
competitiveAnalysis.ranking | object | Rankings by different metrics |
competitiveAnalysis.strengths | array | Array of competitive strengths |
competitiveAnalysis.weaknesses | array | Array of competitive weaknesses |
competitiveAnalysis.opportunities | array | Array of market opportunities |
competitiveAnalysis.threats | array | Array of competitive threats |
competitiveAnalysis.competitiveDifferentiators | array | Array of key differentiators |
competitiveAnalysis.directCompetitors | array | Array of direct competitor names |
competitiveAnalysis.competitiveAdvantages | array | Array of sustainable advantages |
timeline array
timeline event object:
Field | Type | Description |
timeline[].date | string | Event date (YYYY-MM-DD or YYYY-MM or YYYY) |
timeline[].event | string | Event description |
timeline[].category | enum | "founding", "acquisition", "expansion", "partnership", "financing", "milestone", "leadership-change", "strategic-shift" |
timeline[].significance | string | Why event is significant |
timeline[].impactUSD | number | Financial impact in USD (optional) |
example:
{
  "date": "2024-10",
  "event": "$15B xScale Joint Venture with GIC and CPP Investments",
  "category": "partnership",
  "significance": "Nearly triples xScale program investment capital",
  "impactUSD": 15000000000
}
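Since timeline dates may be given at day, month, or year precision, sorting events chronologically needs a small normalization step. The sketch below shows one way to do it; the function name is hypothetical, and filling a missing month or day with 1 (so that "2024" sorts before "2024-10") is an assumption, not a rule from the schema.

```python
import re
from datetime import date

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD and nothing else.
_DATE_PATTERN = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def timeline_sort_key(value: str) -> tuple[int, int, int]:
    """Turn a timeline date string into a sortable (year, month, day) tuple."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported timeline date: {value!r}")
    year = int(match.group(1))
    month = int(match.group(2) or 1)  # assumption: missing month sorts as January
    day = int(match.group(3) or 1)    # assumption: missing day sorts as the 1st
    date(year, month, day)  # raises ValueError for impossible dates such as 2024-13
    return (year, month, day)


events = [
    {"date": "2024-10", "event": "later event"},
    {"date": "2024", "event": "year-only event"},
    {"date": "2023-06-15", "event": "earlier event"},
]
events.sort(key=lambda item: timeline_sort_key(item["date"]))
print([item["date"] for item in events])
```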
project references array
Links the entity to projects in the main database:
{
  "projectName": "Equinix DC12 Ashburn",
  "state": "Virginia",
  "role": "operator",
  "capacityMW": 4
}
role enum: "sponsor", "operator", "tenant", "investor", "contractor", "customer"
sources array
Uses the same structure as the sources array in the project schema:
{
  "url": "https://www.example.com/article",
  "title": "Article Title",
  "date": "2024-01-15",
  "publisher": "Publisher Name",
  "type": "sec-filing"
}
type enum: "company-website", "sec-filing", "press-release", "news", "industry-publication", "analyst-report", "conference-presentation", "linkedin", "wikipedia"
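The source type enum, together with the validation checklist's request for at least 5 sources with diverse types, lends itself to a quick automated review. This is a sketch only: the function name is hypothetical, and reading "diverse" as "more than one distinct type" is an assumption, since the document does not define a threshold.

```python
from collections import Counter

SOURCE_TYPES = {
    "company-website", "sec-filing", "press-release", "news",
    "industry-publication", "analyst-report", "conference-presentation",
    "linkedin", "wikipedia",
}
MIN_SOURCES = 5  # taken from the validation checklist


def review_sources(sources: list[dict]) -> list[str]:
    """Return a list of problems found in a dossier's sources array."""
    problems = []
    if len(sources) < MIN_SOURCES:
        problems.append(f"only {len(sources)} sources; at least {MIN_SOURCES} expected")
    counts = Counter(source.get("type") for source in sources)
    for kind in counts:
        if kind not in SOURCE_TYPES:
            problems.append(f"unknown source type: {kind!r}")
    # Assumption: "diverse" means the sources do not all share a single type.
    if len(sources) > 1 and len(counts) == 1:
        problems.append("all sources share one type")
    return problems


print(review_sources([{"type": "sec-filing"}, {"type": "sec-filing"}]))
```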
entity type specific patterns
hyperscaler dossier
emphasizes:
- global data center footprint
- cloud infrastructure strategy
- AI/ML investments
- power partnerships (especially nuclear)
- capex commitments
operator dossier
emphasizes:
- portfolio composition (retail vs wholesale)
- customer mix and enterprise relationships
- geographic footprint
- xScale/hyperscale capabilities
- interconnection ecosystem
financial sponsor dossier
emphasizes:
- portfolio companies
- total AUM in data center sector
- recent deals and valuations
- exit strategies
- investment thesis
validation checklist
When creating entity dossiers:
- entityType matches directory location
- entityName is official company name
- overview.description provides comprehensive context
- dataCenterProfile populated for relevant entity types
- powerStrategy documented for major operators/hyperscalers
- minimum 5 sources with diverse types
- keyExecutives includes C-level leadership
- timeline captures major milestones
- competitiveAnalysis provides balanced SWOT
- lastUpdated reflects most recent information
- all financial figures in full USD (not abbreviated)
- all dates in ISO 8601 format
- projectReferences links to actual database projects
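The checklist item on full USD figures can be partly automated, because every monetary field in this schema ends in "USD" (marketCapUSD, valueUSD, impactUSD, and so on). The sketch below walks a dossier and reports any such field whose value is not a plain number, which would catch abbreviations like "15B". The function names are hypothetical, and flagging null values is a choice made for this example.

```python
from typing import Any, Iterator


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def find_non_numeric_usd(node: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of *USD fields whose values are not numbers."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key.endswith("USD"):
                if not _is_number(value):
                    yield child
            else:
                yield from find_non_numeric_usd(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_non_numeric_usd(item, f"{path}[{index}]")


dossier = {
    "overview": {"marketCapUSD": "3.8T"},
    "timeline": [{"impactUSD": 15000000000}, {"impactUSD": "15B"}],
}
print(list(find_non_numeric_usd(dossier)))
```

This check cannot tell whether a numeric value was entered in millions rather than full dollars; that still needs manual review against the cited sources.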
For questions or schema change proposals, refer to the main data center database documentation.