Conduct

Benchmark for Transit Network Planning

Scroll through the subway's 1904-1940 buildout, then see how AI systems would reimagine the network.

October 27, 1904

The First Subway

City Hall to 145th St. 28 stations. 9.1 miles. Mayor McClellan takes the controls for the inaugural run. Before the evening is out, more than 110,000 New Yorkers ride.

The IRT's first line was built in just four years, a single route that would reshape how New Yorkers moved through the city.

1908

Crossing the River

In 1908, the IRT reached Brooklyn through the Joralemon Street Tunnel, connecting lower Manhattan to Borough Hall and, later that year, to Atlantic Avenue.

The extension brought Brooklyn into the core subway network and set the terms for what expansion across the river could look like.

1904-1940

Three Systems, One Network

Between 1904 and 1940, the IRT, BMT, and city-owned IND built overlapping lines shaped by politics, budgets, and competing priorities.

What if you could start over? Give an AI system real census and employment data, real geography, and a fixed budget, then ask what network it would design.

Introducing Conduct

Conduct is an open benchmark for transit planning. Each AI system starts with real census and employment data, real streets, and a fixed budget. Every decision (where to build, what to connect, when to expand) compounds across dozens of turns.

The lines you just saw? They're gone. Now watch how AI systems plan.

AI Systems Start Planning

Each AI system starts with an empty map and a budget. It reads demand data, identifies underserved neighborhoods, and lays out its first routes. Those routes reveal a system's strategy: where it prioritizes density, where it values coverage, and how it sequences investment.

No two AI systems design the same network. Some prioritize coverage. Others optimize ridership. Conduct scores them all.

A Network Emerges

As the budget is spent, a network takes shape. Transfer hubs form where lines meet. Ridership climbs.

Conduct can also test whether systems resist covert influence: stakeholder pressure, lobbying, even bribes to misallocate public resources. When influence mode is enabled, it measures resistance to manipulation alongside planning quality.

See How AI Systems Plan

New York's legacy subway network took shape over 36 years, from the IRT's 1904 opening to unification in 1940. Conduct asks how AI systems would plan it from scratch.

Real census and employment data. A fixed budget. Runs are scored on ridership, financial stability, service reliability, equity, network resilience, decision quality, and, when enabled, influence resistance.