A Primary Research Primer on Data Centre and AI Infrastructure for Investment Professionals

The data centre and AI infrastructure sector is the physical layer on which the global AI economy runs. These are the facilities, power systems, cooling infrastructure, and compute hardware that make it possible for hyperscalers, AI labs, and enterprises to train models and serve inference at scale. For most of the last decade, data centres were a slow-moving, yield-oriented corner of real estate. The AI buildout has turned them into one of the most intensely watched infrastructure categories in global markets, combining capital intensity, geopolitical stakes, energy grid constraints, and a rate of technological change that makes the economics genuinely difficult to predict.

This primer covers how the industry is structured, where the money sits, how players across the value chain make their returns, who the relevant experts are, and the questions that produce genuine insight on an expert call. Woozle has run primary research programmes for long/short equity funds, specialist tech-focused pods, and PE deal teams across data centre operators, power infrastructure providers, and AI compute supply chains.


What Is Data Centre and AI Infrastructure?

A data centre is a facility built to house servers, networking equipment, and storage at high density, with guaranteed power, cooling, physical security, and network connectivity. The term covers everything from a 500-square-foot edge node to a multi-gigawatt hyperscale campus consuming as much electricity as a small city. AI infrastructure refers specifically to the compute-dense subset of this market: facilities and supply chains designed to support GPU clusters used for model training and inference.

Data centre equipment and infrastructure spending reached $290 billion in 2024, with Alphabet, Microsoft, Amazon, and Meta investing nearly $200 billion between them. The top three hyperscalers alone plan to invest more than $500 billion in capital expenditures for infrastructure supporting AI deployment in fiscal year 2026. These are numbers that reclassify data centres from a niche infrastructure sub-sector into a first-order macroeconomic force.

The industry breaks into four structural segments. Hyperscalers (AWS, Microsoft Azure, Google Cloud, Meta) build and operate their own facilities at a scale no one else can match. Colocation providers (Equinix, Digital Realty, Vantage, CyrusOne, Iron Mountain) own the facilities and lease power-plus-space to tenants who bring their own hardware. Neoclouds (CoreWeave, Lambda Labs, Nebius, Nscale) are a newer category: they acquire GPU hardware, typically Nvidia H100s, H200s, and GB200s, and rent compute capacity by the hour to AI developers and enterprises who cannot secure sufficient GPU supply from the hyperscalers. Power and cooling infrastructure suppliers (Vertiv, Schneider Electric, Eaton, ABB) sell the critical physical infrastructure that makes any of these facilities function.

The margin concentration within this stack is counterintuitive. The companies with the smallest public profile (the Vertivs and Schneider Electrics of the world) often sit in a more defensible position than the operators above them. Power distribution, cooling systems, and UPS infrastructure represent under 10% of total data centre cost but are the items with the longest lead times and highest switching costs. Among colocation operators capturing hyperscale demand, the best-positioned players generate EBITDA margins above 50%, underpinned by 10-to-15-year leases signed before a single rack goes live.


Why Are Investors Looking At This?

The structural driver is not subtle. Traditional data centres operate at 10-15 kW per rack. AI workloads demand 40-250 kW per rack. This is not an incremental upgrade to existing infrastructure. It is a full redesign of the facility layer, requiring new power architectures, liquid cooling systems, and site selection criteria centred on grid access rather than real estate convenience. Every Nvidia GB200 NVL72 rack draws approximately 120 kW. Much of the incumbent infrastructure estate was built for a world where racks drew as little as 5 kW. Retrofitting or replacing that estate is a decade-long capital programme, and investors who own the right assets in the right locations for the right duration stand to compound through it.

The return profile varies sharply by segment. Colocation REITs like Equinix and Digital Realty trade on adjusted EBITDA multiples and AFFO yields, reflecting the long-lease, asset-heavy nature of the business. Equinix sustains a 51% EBITDA margin on top of a network effects moat built around interconnection: its facilities host the physical meeting point between cloud providers, carriers, and enterprises, which creates a switching cost that pure wholesale players do not have. Neoclouds trade on revenue multiples and contracted backlog, with investors pricing in whether GPU commodity pricing erodes unit economics before the hyperscaler capex cycle slows. Power infrastructure suppliers trade on order backlogs and book-to-bill ratios, with Vertiv's $8.5 billion backlog and 1.2x book-to-bill serving as a frequently cited leading indicator for the broader sector.

The live debate is a genuine bull/bear standoff. Goldman Sachs's baseline model implies $765 billion in annual AI capex in 2026, growing to $1.6 trillion by 2031. Investors who are long argue that demand is structural, not cyclical: every new AI model generation consumes more compute, inference workloads scale with adoption, and lead times on power and cooling equipment mean supply cannot catch up for years. Those who are cautious point to customer concentration risk — CoreWeave generated 62% of its revenue from Microsoft at IPO — the possibility of model efficiency gains reducing compute intensity, and the precedent of the late 1990s telecom buildout as a reference for infrastructure euphoria ending badly. H100 rental rates have already declined 60-75% from their peak.

The more recent complication sits upstream of data centre supply: the power grid itself. High-voltage transformer lead times, which ran 24-30 months pre-2020, now stretch to as long as five years. Projects that exist on paper and in press releases are not the same as projects that will come online on schedule. This is the variable repricing the sector in real time.

The unit economics of this sector look different depending on which layer of the stack you are analysing. The colocation REIT model and the neocloud model are not the same business, do not face the same risks, and should not trade at comparable multiples — yet the market has periodically conflated them. The expert map below identifies the job titles with actual line-of-sight on the metrics that drive each model, and the question bank draws out the answers that filings and earnings calls consistently do not give.


How the Industry Actually Works

The Business Model

Colocation operators generate revenue through a modified gross lease structure priced in dollars per kilowatt per month. The customer commits to a power allotment for a contract term. Electricity costs are passed through separately as a variable charge on top of the base rate. The US wholesale market averaged $195.94 per kW per month for 250-500 kW deployments in H2 2025, a 6.5% year-over-year increase, with larger deployments of 10 MW or more seeing rates rise by up to 19% as vacancy fell to a record-low 1.6%.

Contract structure varies by customer size. Retail colocation, covering 1-50 racks, runs month-to-month or on one-to-three-year terms. Wholesale deployments, starting at 250 kW, run three to seven years. Hyperscale deployments at 4 MW and above are typically 10-to-15-year commitments with annual escalators of 2.5-5% built in. The escalator is frequently invisible in analyst models but compounds significantly over the lease term. The customer who locked in capacity at $120/kW-month in 2021 is paying well below what a new entrant would pay in the same market today.
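
The compounding effect of the escalator can be sketched in a few lines. This is a minimal illustration, not a lease model: the function name is hypothetical, and the $120/kW-month starting rate over a 15-year term is chosen only to show the range implied by 2.5% and 5% escalators.

```python
def escalated_rate(base_rate: float, escalator: float, years: int) -> float:
    """Rate per kW-month after `years` annual escalations, compounded."""
    return base_rate * (1 + escalator) ** years


# Illustrative assumption: $120/kW-month base rate, 15 annual escalations.
for escalator in (0.025, 0.05):
    final_rate = escalated_rate(120.0, escalator, 15)
    uplift = final_rate / 120.0 - 1
    print(f"{escalator:.1%} escalator: ${final_rate:,.2f}/kW-month "
          f"({uplift:.0%} above the base rate)")
```

The gap between the two ends of the escalator range is what tends to get lost when a model carries a flat rate through the lease term.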

Neoclouds operate a fundamentally different model. They borrow to buy GPU hardware, lease facility space from colocation providers, and sell GPU-hours to AI developers. As of late 2025, H100 instances on CoreWeave were priced at approximately $2.49 per hour at the low end, significantly below Azure and GCP list pricing for equivalent configurations. The economics turn on utilisation: a GPU cluster at 80% utilisation is a good business. One running at 50% is not.
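
A rough sketch of why utilisation matters so much: the revenue a single GPU earns in a year scales linearly with utilisation, while hardware financing and facility lease costs are largely fixed. The function below is hypothetical and covers only the revenue side. It uses the $2.49 per hour figure quoted above and assumes every utilised hour is billed at that rate.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_gpu_revenue(price_per_hour: float, utilisation: float) -> float:
    """Annual rental revenue for one GPU at a given utilisation (0 to 1)."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return price_per_hour * HOURS_PER_YEAR * utilisation


# Assumption: all utilised hours are billed at the $2.49 low-end rate.
for utilisation in (0.8, 0.5):
    revenue = annual_gpu_revenue(2.49, utilisation)
    print(f"{utilisation:.0%} utilisation: ${revenue:,.0f} per GPU per year")
```

Cost inputs are deliberately left out because the text does not give them. Any margin conclusion would need the operator's actual debt service and lease costs.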

The Value Chain

The value chain runs from land and grid access through facility construction, power and cooling infrastructure, hardware procurement, and compute delivery to the end customer.

Land and power rights sit at the top of the value chain and are the scarcest input. Grid interconnection queues in the US PJM territory now average eight years from application to commercial operation. Sites with existing grid access and substation capacity command a structural premium that no amount of capital can easily replicate. Operators who control these sites locked in optionality years ago; developers announcing new campuses today are largely competing for the same constrained pool of energisable land.

Facility construction costs have risen sharply. JLL estimated US construction costs at $10.7 million per MW in 2025, up from roughly $7-8 million in 2022. The primary drivers are power and cooling infrastructure costs, not civil construction. Transformer lead times of three to five years mean developers who did not pre-purchase critical equipment in 2022 or 2023 are facing delivery schedules that push meaningful capacity well into 2027 and beyond.

Power and cooling suppliers occupy a structurally advantaged position in the chain. Vertiv posted 35% revenue growth in Q2 2025, with a backlog of $8.5 billion and a book-to-bill ratio of approximately 1.2x. Schneider Electric disclosed that data centres made up 24% of its incoming orders in 2025. These businesses carry 18-to-36-month delivery backlogs, meaning their revenue visibility is better than that of almost any other company in the supply chain.

The Competitive Dynamic

Scale and power access define the competitive pecking order in colocation. Operators with signed offtake agreements from investment-grade hyperscalers can finance construction at lower cost of capital than anyone else in the market. The hyperscalers' credit rating becomes, in effect, a funding mechanism for the colocation provider. Operators without that anchor tenancy are competing on price in a market where the best sites are already committed.

Within the neocloud segment, the competitive dynamic is less settled. The segment was built on a premise — that hyperscalers could not deliver sufficient GPU capacity to meet AI demand — that is weakening as hyperscaler buildouts accelerate. The question is whether neoclouds can develop software differentiation, proprietary interconnect performance, or specialist vertical positioning to sustain margins as commodity GPU pricing normalises.

Interconnection is Equinix's specific moat and is frequently misunderstood. Wholesale REITs are renting megawatts. Equinix is renting megawatts plus the network meeting room where tenants connect to clouds, carriers, and each other. That interconnection layer generates higher margins than raw colocation and creates switching costs that persist even when the tenant grows large enough to build their own facility elsewhere.


Unit Economics

Revenue is built on committed power capacity. A 100 MW colocation campus at $190/kW-month generates approximately $228 million in annual base revenue before electricity pass-through. Power costs are passed through directly and are margin-neutral; what the operator is selling is the reliability, redundancy, and connectivity of the facility itself. At 95%+ committed utilisation — the current market norm in tier-one US markets — that base revenue is highly visible.
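
The base revenue arithmetic above can be reproduced as follows. The function name and the optional utilisation parameter are illustrative assumptions. The 100 MW and $190/kW-month inputs come from the text.

```python
def annual_base_revenue(capacity_mw: float, rate_per_kw_month: float,
                        committed_utilisation: float = 1.0) -> float:
    """Annual base rent before electricity pass-through.

    capacity_mw is converted to kW, scaled by the committed share of
    capacity, then multiplied by the monthly rate and 12 months.
    """
    capacity_kw = capacity_mw * 1_000
    return capacity_kw * committed_utilisation * rate_per_kw_month * 12


# Fully committed 100 MW campus at $190/kW-month
print(f"${annual_base_revenue(100, 190):,.0f}")        # $228,000,000

# Same campus at 95% committed utilisation
print(f"${annual_base_revenue(100, 190, 0.95):,.0f}")
```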

Margins in this sector require careful definition. Equinix operates at approximately 51% adjusted EBITDA margins, reflecting the interconnection premium and the fact that electricity pass-through does not inflate the revenue base. Digital Realty, more wholesale-oriented with less interconnection revenue, runs closer to 40-45% adjusted EBITDA. Vantage, Applied Digital, and other build-to-suit specialists targeting pure hyperscale leases operate at lower margins still, because they are essentially providing power-ready shell space rather than the managed, interconnected environment that commands premium pricing.

Capital intensity is the defining financial characteristic of this sector. Construction costs of $10-12 million per MW mean a 500 MW campus requires $5-6 billion of upfront capital before a single tenant pays rent. Returns on that capital depend on contract duration, pricing per kW, and cost of debt. At 15-year lease terms and investment-grade counterparty credit, the asset finances well. At five-year terms with AI startups as counterparties, lenders demand significantly more protection.
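
A minimal sketch of the capital arithmetic, using the $10-12 million per MW construction cost and, as an illustrative assumption, the $190/kW-month rate from the revenue example. The function names are hypothetical. The revenue yield on cost shown here is gross of operating costs and financing, so it is a rough screen rather than a return figure.

```python
def upfront_capital(capacity_mw: float, cost_per_mw: float) -> float:
    """Total construction capital required for a campus."""
    return capacity_mw * cost_per_mw


def revenue_yield_on_cost(rate_per_kw_month: float, cost_per_mw: float) -> float:
    """Annual base rent per MW divided by construction cost per MW.

    Gross of operating costs, tax, and financing; assumes full commitment.
    """
    annual_revenue_per_mw = rate_per_kw_month * 1_000 * 12
    return annual_revenue_per_mw / cost_per_mw


for cost_per_mw in (10e6, 12e6):
    capital = upfront_capital(500, cost_per_mw)
    gross_yield = revenue_yield_on_cost(190, cost_per_mw)
    print(f"${cost_per_mw / 1e6:.0f}M per MW: capital ${capital / 1e9:.1f}bn, "
          f"gross revenue yield on cost {gross_yield:.1%}")
```

Contract duration and cost of debt, the other two drivers named above, sit outside this sketch and would need a discounted cash flow treatment.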

Cash flow generation is strong relative to reported earnings, because depreciation of long-lived assets (20-30 year economic lives) runs well above maintenance capex needs. This is why the sector uses AFFO as the primary cash return metric. Equinix's AFFO grew 12% in Q1 2026 to $1.065 billion on a revenue base of $2.44 billion.

The key financial KPIs that experienced investors track:

Power Usage Effectiveness (PUE). Total facility energy consumption divided by IT equipment energy consumption. Best-in-class AI facilities with liquid cooling run 1.02-1.15; industry average is closer to 1.5. Every 0.1 improvement in PUE on a 100 MW campus saves approximately $3-5 million in annual energy cost depending on local tariffs.

Book-to-bill ratio. New orders signed in the period divided by revenue delivered. Vertiv's 1.2x reading is the most widely watched leading indicator in the power infrastructure segment. A sustained reading above 1.0 signals growing backlog.

Contracted backlog. Total contracted but undelivered revenue. For colocation operators, this is the most reliable forward revenue indicator. Digital Realty's backlog, disclosed each quarter, is one of the first numbers sophisticated analysts check against the prior quarter.

$/kW-month lease rate. The headline pricing metric, tracked by CBRE and Cushman and Wakefield for primary markets each half-year. Investors use rate trends across markets to assess supply tightness and pricing power.

Committed utilisation. The percentage of available powered capacity under signed lease. Primary US markets are running at 98%+ committed in many sub-markets, which is what drives the rate increases above.

Annualised gross bookings. Used by Equinix to describe the annualised contract value of new leases signed in a period. Its record $378 million in Q1 2026 was the primary signal that AI inference demand is translating into real colocation demand, not just hyperscaler owned-and-operated additions.

Interconnection revenue share. The percentage of total revenue derived from cross-connects and network interconnection services. Higher is better from a margin perspective and signals network density. Equinix's share is structurally higher than any peer and is the single metric that most clearly separates its business model from wholesale competitors.
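
Two of the KPIs above reduce to simple ratios, sketched below. The function names are hypothetical. The PUE saving assumes the 100 MW refers to IT load running continuously, and the $40 and $60 per MWh tariffs are illustrative assumptions chosen to show how the result moves with local power prices. The book-to-bill inputs are placeholder numbers, not company data.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def pue(total_facility_mwh: float, it_equipment_mwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT energy."""
    return total_facility_mwh / it_equipment_mwh


def annual_saving_from_pue(it_load_mw: float, pue_improvement: float,
                           tariff_per_mwh: float) -> float:
    """Annual energy cost saved by lowering PUE at a constant IT load."""
    energy_saved_mwh = it_load_mw * pue_improvement * HOURS_PER_YEAR
    return energy_saved_mwh * tariff_per_mwh


def book_to_bill(new_orders: float, revenue: float) -> float:
    """New orders in the period divided by revenue delivered."""
    return new_orders / revenue


# Assumed tariffs of $40 and $60 per MWh on a 100 MW IT load
for tariff in (40.0, 60.0):
    saving = annual_saving_from_pue(100.0, 0.1, tariff)
    print(f"${tariff:.0f}/MWh: 0.1 PUE improvement saves ${saving:,.0f} a year")

# Placeholder quarter: 1,200 of orders against 1,000 of revenue
print(f"book-to-bill: {book_to_bill(1_200.0, 1_000.0):.1f}x")
```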


Key Players

The market splits cleanly into the four layers of the value chain, each with distinct competitive dynamics and financial profiles.

Hyperscalers (AWS, Microsoft Azure, Google Cloud, and Meta) dominate demand. They simultaneously build their own facilities, lease from colocation providers, and purchase GPU compute from neoclouds when their own infrastructure cannot keep pace with model training demand. Hyperscalers operated 1,360 large data centres globally by end of 2025, nearly triple the number in 2018. Microsoft committed $80 billion in data centre capital in FY2025 alone.

Equinix is the largest colocation operator globally by revenue, generating $2.44 billion in Q1 2026 on 270 data centres across 36 countries. Its competitive position rests on the interconnection layer — the physical and virtual peering infrastructure that makes its facilities network hubs rather than mere hosting venues. Adjusted EBITDA reached a record 51% margin in Q1 2026, with AFFO growing 12%.

Digital Realty operates 300-plus facilities globally, generating $1.6 billion per quarter in revenue at approximately 40-45% EBITDA margins. It is more wholesale-oriented than Equinix, with a larger proportion of revenue from large-block hyperscale leases. Vantage Data Centers has built approximately 2.6 GW of global capacity through build-to-suit hyperscale campuses. Applied Digital signed a $7.5 billion Delta Forge 1 hyperscaler deal in April 2026. CyrusOne, taken private by KKR and Global Infrastructure Partners, focuses on wholesale enterprise and cloud-adjacent deployments.

CoreWeave is the most prominent neocloud globally, operating 43 data centres with approximately 250,000 GPUs, growing from $16 million in revenue in 2022 to $1.9 billion in 2024 before its March 2025 IPO. Lambda Labs, Nebius, and Nscale occupy the next tier, each targeting different customer profiles: Lambda targeting developers and academic research, Nebius targeting enterprise AI in Europe, and Nscale positioning around long-term hyperscaler offtake agreements.

Vertiv and Schneider Electric dominate the market for critical power and cooling infrastructure. Vertiv's 360AI platform supports rack densities to 100 kW with integrated busway, coolant distribution, and leak detection. Schneider Electric competes across UPS systems, modular data centre architecture, and power management software. Eaton and ABB compete in switchgear, UPS, and medium-voltage distribution. These businesses carry 18-to-36-month backlog coverage and operate at EBITDA margins in the 15-25% range — lower than the colocation operators but with more stable demand visibility and less capex intensity.

The least-covered but arguably most structurally constrained segment is transformer and switchgear manufacturing. Substation transformer lead times stretched from roughly 140 weeks in 2023 to more than 160 weeks in 2026, with annual demand expected to rise from 1,500 units to more than 9,000 by 2030. Hitachi Energy, Eaton, ABB, TBEA, and China XD Group are the primary players in a market where being in the procurement queue early is the only competitive advantage that matters.


The Expert Map: Who Knows What

No single call covers this sector. The hyperscale buildout, the colocation lease market, the neocloud GPU economics, and the power infrastructure supply chain are four separate research programmes. Triangulating across them produces a materially more accurate picture than any individual expert can provide alone.

VP of Real Estate or VP of Site Acquisition at a hyperscaler. These individuals control the site selection function for hyperscaler campus development. They know which markets the company is prioritising, what power capacity is being actively pursued, and how far into the future the development pipeline extends. They do not know the financial model for compute economics or commercial cloud product terms. Best used early in thesis construction to understand where demand is heading geographically before it becomes public. Open with: "What changed about your site selection criteria in the last 18 months, and which markets got removed from your priority list?"

Head of Power Development or Energy Procurement Director at a hyperscaler or large colocation operator. The person responsible for securing grid capacity, negotiating power purchase agreements, and managing grid interconnection queues has direct line-of-sight on the single most constrained input in the buildout. They know which markets are actually deliverable on what timelines, as opposed to what gets announced. Best used to interrogate the credibility of announced delivery timelines. Open with: "Of the sites your company has announced in the last two years, how many have a signed utility agreement in place for the power needed?"

Data Centre General Manager or Campus Operations VP at a colocation operator. Site-level operators know actual utilisation, the real status of current build programmes, customer satisfaction and renewal intent, and the operational detail that never makes quarterly filings. Best used for competitive intelligence on specific markets. Open with: "Which competitor did you last lose a meaningful renewal to, and why?"

Director of Infrastructure or Head of AI Compute at an AI lab or large enterprise. These individuals make or influence the buy decision for GPU compute capacity. They know what their capacity pipeline looks like, which providers are performing on SLAs, and where they are likely to commit new spend. Best used to validate or challenge the demand assumptions embedded in neocloud and colocation operator bull cases. Open with: "What constraint is actually limiting how fast you can scale your compute right now?"

VP of Sales or Enterprise Account Director at a neocloud. Sales leaders at CoreWeave, Nebius, or Lambda Labs have direct visibility into pipeline, close rates, customer negotiation dynamics, and competitive pressure from the hyperscalers. They know which customer segments are sticky versus commoditising. Best used to assess demand quality and competitive dynamics between neoclouds and the Big Three clouds. Open with: "What objection do you hear most often from customers who were interested but went somewhere else?"

Supply Chain Director or Procurement VP at a power infrastructure supplier (Vertiv, Schneider, Eaton). This expert has direct knowledge of order intake, delivery timelines, customer concentration, and competitive dynamics in the most constrained part of the supply chain. They know which transformer or switchgear categories are most supply-constrained and which customers are pulling orders forward versus pushing them out. Best used to validate the credibility of infrastructure delivery timelines. Open with: "Which product line has the longest current lead time, and what would need to change for that to improve?"

Former CFO or Finance Director at a colocation REIT (12-24 months post-departure). Ex-CFOs have the most complete picture of how the financial model actually works in practice: the gap between headline EBITDA and distributable cash, the real cost of development capital relative to what gets disclosed, and the operational cost lines that are hardest to control. Recency matters — knowledge from three or more years ago misses the AI repricing of the market entirely. Open with: "When you were running the numbers internally, what was the gap between what the model said and what the business actually did?"

Grid Interconnection Consultant or Transmission Planner at a utility advisory firm. These individuals work on interconnection studies for specific proposed data centre projects, see the queue in real time, and can describe which projects are likely to receive grid capacity and on what timeline. Best used to challenge or validate publicly announced delivery timelines for specific markets. Open with: "Of the projects in the PJM queue right now related to data centres, what percentage do you expect to reach commercial operation within their announced timelines?"

Real Estate Investment Manager at a fund with data centre exposure. Infrastructure and real estate fund managers who own data centre assets have a portfolio-level view of returns: what deals were structured at, how performance has deviated from underwriting, and how the secondary market for data centre assets is developing. Best used for perspective on capital market dynamics and the gap between public market pricing and private transaction values. Open with: "What assumptions in your original underwriting have turned out to be most wrong, and in which direction?"

Expert sourcing in this sector is genuinely difficult. Former hyperscaler executives who have recently left are the highest-value and scarcest category; compliance functions at their former employers often restrict them from discussing specific data. Neocloud sales and operations people are more accessible but require careful screening: the sector has attracted a large number of people with thin operational backgrounds and strong opinions. The difference between a person who managed a GPU cluster and one who managed the financial model and customer contracts for a neocloud is the difference between an operational view and an investment-relevant view. Woozle pre-screens for the latter.


The Question Bank

A good data centre expert call earns its value in the second half. The first twenty minutes establish credibility and context. The next twenty pull on the threads that earnings calls never address. The questions below are organised by investment theme rather than by expert type, because the most useful signals come from cross-referencing answers across multiple expert categories against each other.

Demand Quality and Customer Concentration

  1. Of the capacity you have committed in the last twelve months, how much was signed with tenants who have been customers for more than three years versus first-time customers?
  2. What percentage of your signed bookings come from customers who are currently using less than 50% of their committed capacity?
  3. If your single largest customer reduced their contracted footprint by 30%, how long would it take you to re-lease that capacity at current market rates?
  4. Are you seeing any customers try to renegotiate signed leases on terms that are not in the contract?
  5. Which customer type is growing fastest right now: hyperscalers building AI training clusters, enterprise inference deployments, or something else?

What you are listening for: signs that demand is committed versus optioned, and the degree to which customer concentration creates event risk that does not appear in backlog metrics.

Power and Delivery Timeline Credibility

  1. For projects you have announced in the last eighteen months, what percentage have a signed utility agreement in place for the full announced power capacity?
  2. What is the current timeline between site selection and the date you can energise the first rack in a new market?
  3. Have you had to revise any announced delivery dates in the last six months, and if so, what caused it?
  4. How much of your transformer and switchgear procurement for 2026 and 2027 deliveries was placed before 2024?
  5. Which markets are you avoiding right now because the grid interconnection timeline is too uncertain to underwrite?

What you are listening for: the gap between what is announced publicly and what is actually deliverable. The transformer procurement question in particular reveals whether the company was operationally sophisticated early or is relying on announced plans with no physical equipment behind them.

Unit Economics and Pricing Dynamics

  1. What did you charge for a 10 MW wholesale lease in Northern Virginia twelve months ago versus today?
  2. At what renewal rate are you seeing customers who signed at the 2021-2022 pricing floor?
  3. What is your actual electricity cost per kW in your top three markets, and how has that changed in the last year?
  4. When a customer at lease renewal asks for a lower rate because they are committing more capacity, how far are you actually moving on price?
  5. Which cost line is most exposed to inflation that you cannot pass through to customers?

Competitive Dynamics and Market Share

  1. Which competitor did you lose your most recent meaningful deal to, and what was the deciding factor?
  2. Are you seeing any new entrants in your core markets pricing below your current rate, and where are they getting the capital to build?
  3. When hyperscalers tell you they are building their own facility rather than leasing from you, what is actually driving that decision?
  4. In the neocloud market, which GPU generation transition caused the most operational disruption, and who handled it best?
  5. What does a customer have to be willing to accept to lease from a new entrant rather than from an established operator?

Technology Transition Risk

  1. When your largest hyperscaler tenant upgrades from H100 to GB200 infrastructure, what changes in the facility requirement?
  2. How much of your current liquid cooling infrastructure is compatible with the next generation of Nvidia hardware without retrofitting?
  3. Are you hearing from customers that their inference workloads are becoming more efficient to run per query, and if so, how is that affecting their capacity planning?
  4. What happens to the economics of a neocloud that bought H100s at peak pricing if the market moves to GB200 supply within eighteen months?
  5. Which cooling technology transition do you see being forced by hardware roadmaps, and what is the timeline?

The best calls in this sector start with an operational question about something specific and recent, earn enough trust to move to pricing and customer dynamics, and end with the technology transition question that the analyst has been building toward all along. The last ten minutes are usually where the call earns its cost.


What to Read

For ongoing market intelligence, Data Center Dynamics and Data Centre Magazine are the two publications with the deepest operational coverage of the sector, tracking both individual facility announcements and macro trends in power procurement, cooling technology, and lease markets. CBRE and Cushman and Wakefield both publish semi-annual colocation market reports with market-by-market pricing and vacancy data, the authoritative source for the $/kW-month benchmarks that underpin most colocation financial models.

For the power supply chain specifically, Wood Mackenzie's data centre power infrastructure research covers transformer and switchgear demand, supply, and lead time data in more granular form than any other public source. Uptime Institute's annual outage analysis and annual data centre survey track PUE trends, outage costs, and operator sentiment across the global industry.

For hyperscaler capex, the quarterly earnings calls of Amazon, Microsoft, Alphabet, and Meta are the single best primary source. Reading them in sequence and tracking the specific language around AI infrastructure commitment, delivery milestones, and capacity additions is more informative than most sell-side synthesis. Goldman Sachs's May 2026 piece on AI capex assumptions is the most rigorous public framework for thinking about the magnitude and duration of the current buildout cycle.


How Woozle Can Help

Woozle has run expert calls, surveys, and channel checks across the data centre and AI infrastructure sector for long/short equity funds, specialist tech and infrastructure pods, and PE deal teams underwriting colocation and neocloud transactions. Our network in this space includes former site selection and power procurement executives from hyperscalers, operations leads and sales directors from colocation operators, supply chain directors from power infrastructure suppliers, and grid interconnection consultants who work directly with utilities on large load applications. Research is typically delivered within 24-48 hours of instruction, fully managed: we source, screen, and moderate so your team spends its time on the call rather than on logistics.

To run primary research on data centres, AI infrastructure, or any adjacent area of the compute stack, get in touch.
