Building the Public Data Infrastructure to Accelerate Lab-to-Market Commercialization

By Jesse Lou, Rosie Keller, Teasha Feldman-Fitzthum

Every year, thousands of scientists make the leap from research lab to startup. They carry breakthrough technologies and deep technical conviction — but most have never built a financial model. They're expected to lead teams, make go/no-go decisions, raise capital, and build companies. The question they're rarely equipped to answer is the one that matters most: can this technology work commercially?

The tool that should anchor those decisions is techno-economic analysis: a structured way to evaluate commercial viability, understand what drives cost, identify where uncertainty lives, and connect R&D choices to commercial outcomes. When scientists build a TEA, they develop the intuition they need to navigate the hardest decisions in commercialization.

In our surveys of entrepreneurial scientists and proto-company founders (n=102), roughly 88% reported little to no exposure to TEA.

The Problem

The gap isn't just a problem for individual teams. It's a systemic failure at ecosystem scale.

There are roughly 40,000 early-stage climate teams and 500,000 applied researchers globally, backed by over $100B in public R&D and private capital. That capital flows with the expectation that good science will find its way to deployment. But when the vast majority of teams lack the basic analytical framework to evaluate their own economics, commercially viable technologies stall, resources are misallocated, and capital goes to waste. Further downstream, incumbents struggle to evaluate emerging technologies without a common analytical language, slowing the very deployment that climate timelines demand.

Before Google Maps & GPS, every driver navigated independently — buying paper maps or printing out MapQuest directions, asking strangers for directions, making expensive wrong turns. Routes existed but weren't widely accessible. What was missing was a shared, adaptive navigation layer to make them useful.

Climate commercialization is at the same inflection point. Scientists are developing breakthrough technologies that must be economically feasible to scale. Industrial data exists but is opaque. What's missing is the shared analytical infrastructure: the map layers, navigation tools, and local knowledge that helps scientists find their path to market.

Why Teams Get Stuck

Over 1,500 hours of coaching early-stage teams, we've observed a consistent set of patterns that explain why TEA adoption remains so low — even among teams that understand its importance.

TEA is inaccessible

TEA is seen as a complex, high-precision academic exercise requiring specialized software, months of effort, and niche expertise. When a scientist opens a TEA template or pre-built model and sees fifteen tabs of multi-colored tables and unfamiliar acronyms, many close it and go back to the lab. Those who need it most face the steepest cold-start problem, and each analysis is inherently idiosyncratic: every technology requires its own tailored approach, which makes adapting close-but-not-quite-right templates both overwhelming and insufficient.

Trust is hard to build

Even when teams get started, they struggle to build enough confidence in their analysis to actually make decisions from it. The issue runs on two levels: trusting that the model's structure is sound, and trusting the underlying input data. Without confidence in their assumptions, teams hesitate to commit to the analysis — and without that commitment, they never take the steps that matter most: internalizing sensitivities, identifying the R&D bets that actually move the needle, and communicating their economics with conviction.

The data bottleneck eats months

With guidance tailored to their needs, we've seen teams build a solid first pass in 20–25 hours of work. But teams routinely spend months getting there, because every assumption becomes a mini research project. What does a commercial-scale electrolyzer actually cost? How is carbon capture typically structured at this process stage? Is this yield assumption reasonable, or am I off by an order of magnitude? Each question can mean days of searching across scattered reports, academic papers, and vendor quotes, even though returns diminish quickly once the first few data points show whether sources converge or diverge.

Those hours spent hunting for data don't compound. A founder who spends three weeks tracking down electrolyzer cost ranges hasn't learned anything about their own technology — they've just removed a blocker. The real learning happens in model building: understanding how processes connect, stress-testing assumptions, running sensitivities. The data search is pure overhead.

Massive redundancy at ecosystem scale

The lost time and effort compound across the ecosystem. A team in Cambridge and a team in Singapore independently spend weeks hunting down the same data point. An expert could gut-check that figure in five minutes: they know which sources to trust, what ranges are reasonable, and when something is off. When that validation happens once and is made accessible, it serves thousands of teams. The leverage is enormous, but only if it's coordinated.

Individualized help works, but doesn't scale

Through our past services work, we've seen what happens when teams get personalized support. It is the difference between driving school and taking an Uber: teams that get behind the wheel and learn to merge onto the highway themselves build lasting intuition. When founders build the model with their own hands, they develop a deep understanding that also helps them accept hard truths about their economics and empowers them to course-correct as their technology and market evolve.

But one-on-one coaching relies on expert time and is impossible to scale. Together with our TEA practitioner network, we've collectively seen hundreds of teams and have identified common bottlenecks: finding strong assumptions, trusting model outputs, and modeling at the right level of abstraction. The patterns are consistent enough that shared resources and infrastructure could solve a large fraction of the problem, if they existed.

Our Thesis

There is no shortage of people who want to help. Experienced practitioners, accelerator mentors, program managers at national labs — they exist, they're engaged, and many are actively looking for better ways to contribute. But their efforts are diffuse. An expert answers the same assumption question for the fifth time. A mentor coaches a founder through a model they don't have good data for. A program manager watches cohort after cohort hit the same wall. The opportunity isn't finding more people who care — it's concentrating the efforts of the ones who already do.

AI makes tailored TEA guidance scalable for the first time. By first collecting & validating data for assumptions and pre-building TEAs for all major industrial processes and supply chains, we can leverage language models for what they were built for — bridging communication and meeting teams where they are.

But software alone isn't sufficient. We need humans in the loop (practitioners validating the data, mentors providing boots-on-the-ground support) to help with the adoption and usage of TEA by entrepreneurial scientists.

The TEA Commons is built around three mutually reinforcing pillars:

  • Validated Data as the map layers that make every analysis trustworthy.
  • AI-Enabled Tooling as the turn-by-turn navigation that guides teams through the process.
  • And a Human Network of practitioners and partners — the locals who know the roads and keep the maps honest, whose leverage is multiplied by good infrastructure.

What We're Building

Pillar 1: Data — The Map Layers

The industrial data repository covers all major climate-relevant industries, organized by reference class: the established and emerging industrial processes that breakthrough technologies slot into, displace, or recombine. The scope is roughly 20–25 industry verticals (hydrogen, batteries, carbon capture, biofuels, and critical minerals, among others), each containing 30–40 sub-industries or process configurations.

Within each, we collect the structural, process, and economic data a founding team needs: raw material costs, capex benchmarks, performance ranges, product specifications, and carbon intensity.

For each data point, we target rough accuracy: ±30% ranges aggregated from public sources and validated by domain experts. This is the precision early-stage teams actually need for directional decisions — enough to identify which parameters drive outcomes, where to focus R&D, and what questions deserve more scrutiny. It's also fast for experts to provide without running into proprietary constraints, which makes the contribution model viable at scale.
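To illustrate why ±30% ranges can be enough for directional decisions, here is a minimal Python sketch of a one-at-a-time sensitivity sweep. The cost model, the parameter names, and every number in it are hypothetical placeholders chosen for illustration; none of them come from the repository or from a real process.

```python
# Toy cost-per-tonne model; all inputs are hypothetical, for illustration only.
BASE = {
    "capex_usd": 10_000_000.0,         # installed capital cost
    "capital_charge": 0.10,            # fraction of capex recovered per year
    "feedstock_usd_per_t": 200.0,      # raw material price
    "feedstock_t_per_t_product": 1.5,  # tonnes of feedstock per tonne of product
    "output_t_per_yr": 10_000.0,       # annual production
}

def unit_cost(p):
    """Cost per tonne of product = annualized capex share + feedstock cost."""
    capex_per_t = p["capex_usd"] * p["capital_charge"] / p["output_t_per_yr"]
    feedstock_per_t = p["feedstock_usd_per_t"] * p["feedstock_t_per_t_product"]
    return capex_per_t + feedstock_per_t

def sensitivity(base, swing=0.30):
    """Vary each input by +/- swing, one at a time; return the cost spread per input."""
    spreads = {}
    for name in base:
        low = unit_cost({**base, name: base[name] * (1 - swing)})
        high = unit_cost({**base, name: base[name] * (1 + swing)})
        spreads[name] = abs(high - low)
    # Largest spread first: these are the inputs worth tightening.
    return dict(sorted(spreads.items(), key=lambda kv: kv[1], reverse=True))

ranking = sensitivity(BASE)
for name, spread in ranking.items():
    print(f"{name:28s} {spread:8.1f} USD/t")
```

With these placeholder numbers, the two feedstock inputs produce the widest swings in unit cost, so that is where a team would tighten its data first. Rough ranges are enough to produce that ranking.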

In aggregate, the repository will contain millions of data points, continuously improved as users submit feedback and experts refine validation. The 20% of data that unlocks 80% of the value gets built first — the highest-impact reference classes that recur across teams and verticals deliver substantial value long before the full repository is complete.

Pillar 2: Tooling — Turn-by-Turn Navigation

The AI co-pilot guides teams from a blank sheet to a working model — mapping process flows, surfacing relevant data from the repository, walking through a step-by-step framework tailored for early-stage teams, and flagging where assumptions need scrutiny. It adapts to each team's technology and stage rather than forcing them through a generic template.

The goal is a founder who deeply understands their own model, not one who received an answer from a black box. The teams that get the most value from TEA are the ones who built it themselves, who know what's in every cell and why. The tooling creates productive friction — the AI as a thinking partner, working through the analysis alongside the founder rather than ahead of them.

Underneath the co-pilot sits a TEA calculation engine — a core analytical layer that ensures numerical rigor independent of the LLM. This engine is also useful beyond individual team TEAs: it can support IP evaluation, white space identification, and portfolio-level analysis for investors and program managers.
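One way to picture numerical rigor that is independent of the LLM is a strict boundary: the language model only proposes structured inputs, and a plain deterministic function validates them and does the arithmetic. The Python sketch below is an assumption about how such a separation could look, not a description of the actual engine. `Inputs`, `parse_inputs`, and `cost_per_tonne` are hypothetical names, and the values are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Inputs:
    annual_cost_usd: float
    annual_output_t: float

def parse_inputs(proposed: dict) -> Inputs:
    """Validate values proposed by a co-pilot (or typed by a user) before any math runs."""
    cost = float(proposed["annual_cost_usd"])
    output = float(proposed["annual_output_t"])
    if cost < 0 or output <= 0:
        raise ValueError("cost must be non-negative and output must be positive")
    return Inputs(cost, output)

def cost_per_tonne(inputs: Inputs) -> float:
    """Deterministic calculation: the same inputs always give the same number."""
    return inputs.annual_cost_usd / inputs.annual_output_t

# Placeholder values standing in for what a co-pilot might propose.
proposed = {"annual_cost_usd": "4000000", "annual_output_t": 10000}
result = cost_per_tonne(parse_inputs(proposed))
print(result)
```

Because the calculation never passes through the model, the same function could in principle be reused for batch or portfolio-level runs with no chat interface involved.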

Pillar 3: Network — The Locals Who Know the Roads

Even the best navigation system benefits from people on the ground. The Network pillar supports two inter-related roles.

Topical experts — TEA practitioners and industry veterans — are the validation backbone of the data repository. An electrochemist who has designed commercial-scale PEM systems can gut-check an electrolyzer capex assumption in minutes and assign a confidence score that makes that data point useful for thousands of teams. These experts also make themselves available for targeted consultations, providing the kind of judgment no database can fully encode.

The exchange goes both ways. Practitioners gain a platform for thought leadership and insight-sharing that builds visibility with the next wave of climate teams — and a natural sales pipeline for their consulting work. They also plug into a community of peers where learnings, methods, and hard-won judgment travel freely, turning what's usually isolated expertise into a shared professional space.

Boots on the ground (accelerator mentors, research program managers, lab ecosystem teams) are the human-in-the-loop for early-stage founders. They know their teams deeply and are often the first call when something feels off. The TEA Commons gives these partners better infrastructure: validated data to reference, tooling to point founders toward, and a community where questions get answered.

Why It Matters

When a founder can show credible economics early, every downstream interaction improves. TEAs travel — into pitch decks, data rooms, partnership conversations, and funding decisions. When the underlying analysis is sound, capital, talent, and policy move faster and with less friction.

The barrier has never been motivation. Scientists, investors, and program managers all recognize the value of rigorous early-stage economics. The problem is that doing TEA well is hard, data is scattered, and adoption doesn't scale without shared infrastructure. Building that infrastructure as a public good also enables more efficient allocation of time & capital across the ecosystem and a common analytical language for innovators, funders, incumbents, and policymakers.

Our thesis for the flywheel: if the data and tooling deliver real value, adoption compounds. Teams find them organically, programs push cohorts to use them, and experts and incumbents contribute data in exchange for access to a stronger pipeline. The same dynamic that made open infrastructure like PubChem foundational to modern biology can work here — each cycle of contribution and use draws in the next wave of participants, and the commons becomes an enabling layer for the entire ecosystem.

Why Now, and Why This Team

The data collection challenge is operational, not research. It doesn't require new AI breakthroughs — it requires sustained, well-coordinated execution across a coalition of domain partners. We can start delivering value now, with the highest-priority reference classes, while the full repository is built out.

Our team brings direct experience across every dimension this effort requires.

  • On TEA, we've worked across the full lab-to-market lifecycle — from pre-spinout research teams to late-stage project finance — including three years leading fusion economics at Commonwealth Fusion Systems and over 1,500 hours of hands-on TEA coaching.
  • On early-stage support, we've scaled dozens of entrepreneur support programs across Asia and Africa to hundreds of founders through Seedstars, and bring deep experience from Breakthrough Energy Fellows.
  • On software and AI, our backgrounds include MIT CSAIL research commercialized into ML-based wind resource assessment (acquired) and building AI for actuarial modeling at Cyence.
  • And across industrials, we bring pattern recognition from McKinsey (oil & gas, chemicals, mining, infrastructure), Breakthrough Energy Fellows, and ARPA-E (blue-tech and CDR ecosystem mapping).

We already have a coalition of 50+ partners across academia, government, accelerators, and investors — including Breakthrough Energy Fellows, MIT, Undaunted (Imperial College), Oxford, Homeworld, and FedTech — and an active pipeline of organizations ready to join.

Join Us

The TEA Commons is designed to be accessible to every climate scientist, every research institution, every accelerator mentor — regardless of geography, stage, or resources.

Funders: We're seeking philanthropic capital to fund the data build-out and tooling development — an investment in public goods that increases the leverage of every downstream dollar spent on climate innovation. If you see the opportunity in building shared analytical infrastructure, we'd love to talk.

Research institutions and ecosystem enablers: We're looking for pilot partners who will push their teams to adopt TEA and contribute domain expertise for specific verticals. If you support early-stage climate teams through accelerators, incubators, or lab-to-market programs, there are concrete ways to work together.

Practitioners and domain experts: The data repository is only as good as the people who validate it. If you're a TEA practitioner or industry veteran who believes in this field, we're building a community for you — one where your expertise reaches thousands of teams instead of a handful.

Industry and incumbents: Your operational data and process knowledge can dramatically accelerate the quality of the repository. In return, a stronger analytical baseline across the ecosystem means a higher-quality pipeline of technologies and teams reaching your door.

We'd be grateful to have you alongside us.

Exposure and/or experience with TEA

Entrepreneurial scientists and proto-company founders (n=102), Recursion Works surveys 2025–2026

~88% little to no TEA exposure
Haven't engaged: 43.1%
Self-taught / online: 20.6%
1–2 hrs: 13.7%
3–5 hrs: 10.8%
5–25 hrs: 8.8%
25+ hrs: 2.9%

How we view the impact of TEA on applied science R&D

Considering economic feasibility increases the leverage of scientists and capital

Global R&D (Academia, Research Labs): $651.2B · 3.7M researchers
Commercial R&D: $322.9B · 1.8M researchers
Climate-Relevant Fields: $250.8B · 1.4M researchers
Target Ecosystem: $124.4B · 0.7M researchers
Splits shown: Climate-related 39% vs. less climate-related 61%; Fundamental 50% vs. Commercial 50%

What scientists tell us

Patterns we hear over and over

“As a chemical engineer, I'm familiar with AspenPlus. I know I shouldn't wait and should just start, it just feels overwhelming.”

Founder · pre-seed chemicals startup

“This table right here looks simple, but it's actually the most complicated and difficult part for us to change in our current TEA… there's a lot of cells that, like, are super dependent on each other, and it's super hard to see where these numbers are coming from.”

CTO · seed-stage critical minerals startup

Our approach to TEA: layer-by-layer ‘readiness levels’

We break down TEA development into discrete activities and steps to make it more approachable.

Ecosystem time invested

Hours spent on TEA-related analysis across early-stage segments, before teams typically encounter commercialization gaps.

Chart: ecosystem data-collection hours by team tier. Bubble chart showing four tiers of early-stage climate teams. As the number of teams grows from roughly 1,000 to 479,000, hours per team fall from 1,000 to 20, but total hours spent on data collection rise from 1.1M to 9.6M.
Seed: 1.1M hrs
Pre-seed: 1.6M hrs
Proto-cos: 4.1M hrs
Ent. scientists: 9.6M hrs
Total hours: 16.4M hrs
Team counts based on NSF, DOE, and Sightline data; hours invested estimated by the TEA Commons team.
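The chart's total can be checked with quick arithmetic. The Python sketch below uses only the figures quoted in the chart. Team counts and hours per team for the two middle tiers are not given, so only the four tier totals and the entrepreneurial-scientist endpoint are used.

```python
# Tier totals in millions of hours, as labeled in the chart.
tier_hours_m = {
    "Seed": 1.1,
    "Pre-seed": 1.6,
    "Proto-cos": 4.1,
    "Ent. scientists": 9.6,
}
total_m = round(sum(tier_hours_m.values()), 1)

# Endpoint check: teams x hours per team, using the rounded figures
# from the chart description (479,000 teams at 20 hours each).
ent_scientists_m = 479_000 * 20 / 1e6

print(f"Total: {total_m}M hrs")
print(f"Ent. scientists: {ent_scientists_m:.2f}M hrs")
```

The largest tier contributes the most hours even though each team in it spends the least, which is the redundancy argument in numerical form.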

Scaling support for climate teams

[Chart: scientists supported by year, 2021–2025 (axis 0 to 1K), in two series: 1:1 coaching and scaled support. Annotation: 2021 · The seed.]

Self-reinforcing pillars

[Diagram: the three pillars (Data, Network, Tooling) linked to Scientists. Edge labels: Speed, Credibility, Feedback, New data, Insights, Decisions, Feedback, Testing, Mentorship, Resources, Jobs, Services.]

Reference class taxonomy

Example of 25 industry verticals spanning all major processes

5 buckets · 25 categories

Have feedback? Please send us a note at data@tea-commons.org or leave your info at the bottom of this memo!

Process flow diagram

Example reference class: a bioethanol-from-cellulose process, with feedstocks, unit operations, product streams, and energy flows.

The TEA Conductor: our step-by-step guided workflow

From process setup to sensitivity dashboards.

Set Up Your TEA

Human-in-the-loop validation

Practitioners anchor data in realistic ranges, suggest corrections, and guide teams on specific needs.

Mock-up for expert data validation approach

Human-in-the-loop data validation interface

From the field

Stories from accelerator mentors and ecosystem partners

“Teams who don’t know their TEA well will discount the outcome — they tell me that they ‘feel’ like their costs should be lower even if the model suggests otherwise, because there are other things that aren’t captured.”

— University Incubator Program Manager

“We always hear feedback like ‘we want to see more examples, what does a TEA look like’ — especially for teams who don’t know Excel or don’t know modeling, they actually want to see more.”

— Accelerator Manager

Theory of Change

How supporting TEA Commons compounds into ecosystem-scale impact.

Inputs: Open data repository; AI TEA tooling; TEA Network
Outputs: Scientists supported; Teams supported
Short-term Outcomes: Hours saved; Fatal flaws identified
Medium-term Outcomes: Researcher-years returned; Spinouts formed; Capital unlocked
Long-term Outcomes: Jobs created; CO₂ avoided; Climate deployment

Our team

Jesse Lou
Executive Director
Breakthrough Energy, McKinsey, Columbia BS, Harvard MBA

Focuses on hard tech strategy, techno-economic analysis, and early-stage innovation. Co-founded Recursion Works, was a fellow at Breakthrough Energy, and previously worked at McKinsey and as a product lead in AI.

Rosie Keller
Executive Director
ARPA-E, Newlab, Seedstars, MIT Sloan MBA

Global experience in entrepreneurship and ecosystem building. Worked at Newlab and ARPA-E on TEA & LCA playbooks, and led Seedstars' expansion across Asia and Africa, scaling programs to hundreds of founders.

Teasha Feldman-Fitzthum
Founding Director
Commonwealth Fusion Systems, MIT BS

Leads TEA at CFS, applying probabilistic modeling to guide R&D & strategy. Founded & sold a renewable energy analytics company after conducting ML research at MIT CSAIL on wind farm siting.

Our coalition

50+ partners across academia, government, accelerators, and investors.

Breakthrough Energy Fellows
Martin Trust Center
Columbia CDI
Labstart
Newlab
Homeworld Collective
Harvard
TUM
FUEL
Undaunted
Oxford Zero
Nucleate
FedTech
NYCE
Ashwatta
Brinc
Genopole
MIT Climate Project
Work on Climate
Third Derivative
MIT

Get involved

Share your interest and we'll follow up.
