Generated by Master Biographer | Source for LinkedIn Content
Moscow, 2009. A Yandex engineer named Alexey is staring at web analytics data that no existing database can query fast enough. His team is trying to build reports in real time from raw, non-aggregated click data — hundreds of billions of rows, arriving constantly, needing answers in milliseconds. OLAPServer, Yandex's internal tool, is cracking under the weight. It only handles numbers. It can't update data in real time. It can't generate the custom reports users are demanding.
So Alexey Milovidov starts an experiment. Not a product. Not a company. An experiment. Can you query non-aggregated data, at internet scale, interactively, without pre-computation?
It takes him three years to find out the answer is yes.
By 2016, the thing he built will be open sourced. By 2021, it will become a company. By 2026, that company will be valued at $15 billion. Nobody in Silicon Valley will know about it for the first six years, because it was built in Russia, deployed in Moscow, and solving a problem most Western engineers had never even encountered.
This is the story almost no one in the West knows.
To understand ClickHouse, you first have to understand Yandex.
Yandex is Russia's Google. Founded in 1997, it dominates Russian search, maps, ride-sharing, e-commerce, and cloud computing. By the 2010s, Yandex.Metrica was the second-largest web analytics platform in the world — a direct competitor to Google Analytics, processing data for millions of websites across the Russian-speaking internet and beyond.
The scale was staggering. As of April 2014, Yandex.Metrica was tracking approximately 12 billion events per day — page views, clicks, sessions — all of which needed to be stored, queried, and turned into real-time, customized reports. No pre-aggregation. No waiting. Users could define arbitrary segments and see results immediately. That was the product promise.
The predecessor system — called OLAPServer, built around 2009 — was a dead end. It only supported numbers as data types. It could not update data incrementally. It could not handle the combinatorial explosion that came from users asking arbitrary questions against hundreds of billions of rows in real time.
The alternative — pre-aggregating everything — was mathematically impossible. If you pre-aggregate every possible combination of dimensions a user might query, the data volume grows combinatorially. You'd need more storage to hold the pre-computed summaries than the raw data itself. And you still couldn't answer questions you hadn't anticipated.
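A back-of-the-envelope sketch of that growth, assuming a hypothetical set of report dimensions and cardinalities (none of these figures come from Yandex). Every subset of dimensions is a separate rollup to maintain, and each rollup can hold up to the product of its dimensions' cardinalities in cells:

```python
from itertools import combinations
from math import prod

# Hypothetical dimensions and distinct-value counts, for illustration only.
DIMENSIONS = {
    "country": 200,
    "browser": 50,
    "os": 20,
    "device_type": 5,
    "traffic_source": 1_000,
}


def count_rollups(num_dimensions: int) -> int:
    """Number of distinct GROUP BY subsets, including the empty grand total."""
    return 2 ** num_dimensions


def max_cells(dimensions: dict[str, int]) -> int:
    """Upper bound on pre-aggregated cells, summed over every dimension subset."""
    names = list(dimensions)
    total = 0
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            # prod() of an empty subset is 1: the single grand-total cell.
            total += prod(dimensions[name] for name in subset)
    return total


print("rollups to maintain:", count_rollups(len(DIMENSIONS)))
print("upper bound on cells:", max_cells(DIMENSIONS))
```

With the five hypothetical dimensions above there are already 32 rollups, and each added dimension doubles that count. The cell count is only an upper bound, since real data never fills every combination, but it shows why the approach stops scaling long before it covers every question a user might ask.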
Alexey Milovidov's team at Yandex's web analytics division launched ClickHouse as an experimental project to test a hypothesis: is it viable to generate analytical reports in real time from non-aggregated data that is also constantly being inserted in real time?
Most database researchers at the time would have said no. Alexey spent three years proving them wrong.
The development of ClickHouse was not a startup sprint. It was slow, methodical engineering work inside a large tech company, done by a small team, mostly invisible to the outside world.
The experimental project launched around 2009-2011. By 2012, ClickHouse went into production for the first time at Yandex.Metrica — the first real test at real scale. It worked. Then it spread.
Internally, Yandex departments started adopting it. Search, e-commerce, advertising, business analytics, mobile, personal services — nearly every Yandex division eventually ran ClickHouse. By the time anyone in the West had heard of it, ClickHouse was already quietly running petabytes of production workloads across one of the world's largest internet companies.
The cluster running Yandex.Metrica eventually grew to 374 servers, storing over 20.3 trillion rows of data, with approximately 2 petabytes of compressed data (17 PB uncompressed). The system was inserting more than 100 billion records per day. ClickHouse was processing this with a core team of just 15 engineers.
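To put those figures in proportion, here is a small calculation using only the numbers quoted above. It assumes decimal petabytes (10**15 bytes) and treats the daily insert volume as evenly spread over the day, which real traffic is not:

```python
# Figures quoted in the text; petabytes are assumed to be decimal (10**15 bytes).
ROWS = 20.3e12
COMPRESSED_BYTES = 2e15
UNCOMPRESSED_BYTES = 17e15
INSERTS_PER_DAY = 100e9
SECONDS_PER_DAY = 24 * 60 * 60

compression_ratio = UNCOMPRESSED_BYTES / COMPRESSED_BYTES
bytes_per_row_compressed = COMPRESSED_BYTES / ROWS
bytes_per_row_uncompressed = UNCOMPRESSED_BYTES / ROWS
inserts_per_second = INSERTS_PER_DAY / SECONDS_PER_DAY  # average, not peak

print(f"compression ratio:       {compression_ratio:.1f}x")
print(f"bytes/row, compressed:   {bytes_per_row_compressed:.0f}")
print(f"bytes/row, uncompressed: {bytes_per_row_uncompressed:.0f}")
print(f"average inserts/second:  {inserts_per_second:,.0f}")
```

That works out to roughly 8.5x compression, just under 100 bytes per stored row, and an average of about 1.16 million inserted rows per second.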
In June 2016, Yandex open sourced ClickHouse on GitHub under the Apache 2.0 license at the Highload++ conference in Moscow, a Russian developer conference focused on high-load systems. The announcement was posted on Habr (Russia's equivalent of Hacker News / Medium).
The stated reason for open sourcing: expanding the user base would surface edge cases and real-world workloads that Yandex's internal teams would never encounter on their own. More bugs found. A better product for Yandex.Metrica itself. Classic "scratch your own itch, then give it away."
The reaction in the West was delayed. The conference was Russian. The announcement was published in Russian. The documentation was initially sparse. In the Western tech community, ClickHouse barely registered in 2016. It was a footnote in a few database newsletters.
But the engineers who found it understood immediately what they were looking at. The benchmarks were absurd. Nothing was close.
The first major Western adoption story came from Cloudflare.
In 2017, Cloudflare was using ClickHouse to process 1 million DNS queries per second. By 2018, that had scaled to 6 million HTTP requests per second — 11 million rows per second across all pipelines, 47 Gbps of sustained insertion bandwidth. Cloudflare's old pipeline could serve maybe 15 queries per second. With ClickHouse, they were handling 40-150 queries per second with 100% event storage — no sampling, no approximation.
What Cloudflare found was what everyone found: the numbers didn't make sense until they did. ClickHouse's columnar storage meant it only read the columns a query needed — not entire rows. Its vectorized execution engine processed data in batches, maximizing CPU cache efficiency. The MergeTree storage engine sorted data by primary key, enabling sub-50ms range queries on trillion-row tables. Compression codecs were tuned for time-series and analytics data specifically.
The benchmarks became a meme in data engineering circles. ClickHouse vs. Redshift: ClickHouse won most analytical queries by 10x-100x. The official ClickHouse team claimed 100-1000x faster than traditional approaches on suitable workloads — and third-party benchmarks largely held up the core claim, even if the extremes depended on query type and data shape.
ClickBench, an open benchmark using a 100-million-row real-world web analytics dataset, became the canonical comparison. On this benchmark, ClickHouse consistently outperformed the major cloud warehouses (Snowflake, BigQuery, Redshift) as well as the embedded analytical engine DuckDB. Not by a small margin.
The Western tech community didn't fully wake up until 2018-2020, when major companies around the world started publicly talking about their ClickHouse usage: Uber, Spotify, eBay, Ahrefs (running 100,000+ CPU cores and 800TB of RAM in their main cluster), Bloomberg, Disney+ (storing 395 TiB), ByteDance (TikTok's parent company), Baidu, Tencent.
The database that nobody in Silicon Valley had heard of was already running some of the world's largest analytical workloads.
The story gets complicated in 2021.
By mid-2021, ClickHouse was one of the most popular open-source analytical databases in the world, deployed at thousands of companies and maintained by a core team of 15 engineers who "could barely maintain the open-source product" while trying to grow it. Alexey Milovidov could see that ClickHouse's potential "highly outgrows such a small team." The project needed resources. It needed a company.
In September 2021, ClickHouse, Inc. was incorporated in Delaware. Alexey Milovidov became CTO. Aaron Katz, a business executive, became CEO. Yury Izrailevsky joined as President. The Series A closed at nearly $50 million, led by Index Ventures and Benchmark, with Yandex N.V. itself participating as an investor.
Then, five months later, Russia invaded Ukraine. February 24, 2022.
The timing was almost impossible. ClickHouse Inc. had just incorporated as an independent US company — intentionally structured as a Delaware corporation headquartered in San Francisco, with no Russian operations, no Russian board members, no Russian investors beyond Yandex (which held a small stake from the spinout). But most of the core engineering team — the 15 engineers who had built ClickHouse inside Yandex — were Russian.
When the invasion began on February 24th, most of those engineers had already relocated to Amsterdam. ClickHouse accelerated the relocation of the remaining staff and their families. The company went quiet for several weeks — not from indifference, but to ensure everyone was safely relocated before making any public statement.
On March 31, 2022, Aaron Katz, Yury Izrailevsky, and Alexey Milovidov published "We Stand With Ukraine" — a joint statement from all three co-founders. The statement was unambiguous: they condemned Putin's invasion, confirmed no Russian operations or infrastructure remained, confirmed Russian engineers were now based in Amsterdam. Yury Izrailevsky disclosed that he came from a Ukrainian Jewish family and had watched buildings in Kyiv that he visited as a child get bombed. Alexey Milovidov wrote that ClickHouse was "a global technology, used across the world without borders" and that he dreams of the war's swift end with Ukrainian victory.
The company donated to UNICEF, the UN Refugee Agency, and Voices of the Children.
By October 2021 — before the war — ClickHouse had already closed a $250 million Series B at a $2 billion valuation, led by Coatue and Altimeter, with participation from Index, Benchmark, Yandex, Lightspeed, Redpoint, FirstMark, and others. The company planned to double headcount in 2022, then double again.
In December 2022, ClickHouse Cloud launched in General Availability. The managed cloud product — built from scratch in less than a year by a globally distributed team — was the missing piece: ClickHouse's raw performance, without the operational burden of running it yourself.
In January 2026, ClickHouse raised $400 million led by Dragoneer Investment Group, with Index Ventures participating. The company hit a $15 billion valuation — making it one of the most valuable database companies in the world, alongside Snowflake and Databricks. The company that spun out of a Russian tech giant, survived a war, and relocated its engineering team to Amsterdam now competes directly with Silicon Valley's cloud database elite.
The GitHub repository had accumulated over 46,000 stars and 2,765 contributors. Alexey Milovidov — the engineer who started an experiment in a Yandex office in Moscow around 2009 — is still the CTO.
ClickHouse's speed is not magic. It is a series of deliberate, compounding engineering decisions:
Columnar storage (real columnar — no row overhead). In a true column-oriented database, values are stored contiguously by column, with no per-row metadata overhead. When a query touches 3 columns of a 500-column table, ClickHouse reads only 3 columns. Snowflake and BigQuery are also columnar — but ClickHouse's implementation is more aggressive about eliminating overhead at every level.
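A toy sketch of why the layout matters, not ClickHouse's actual on-disk format. It assumes a hypothetical table of 1,000 rows and 500 fixed-width integer columns, stores it both ways, and counts the bytes a three-column query has to touch:

```python
import struct

NUM_ROWS = 1_000
NUM_COLUMNS = 500
VALUE = struct.Struct("<q")  # every value is a fixed-width 64-bit integer

# Row layout: each row stores all 500 values back to back.
row_store = [
    b"".join(VALUE.pack(r * NUM_COLUMNS + c) for c in range(NUM_COLUMNS))
    for r in range(NUM_ROWS)
]

# Column layout: each column stores its 1,000 values back to back.
column_store = [
    b"".join(VALUE.pack(r * NUM_COLUMNS + c) for r in range(NUM_ROWS))
    for c in range(NUM_COLUMNS)
]


def bytes_read_row_layout(wanted: list[int]) -> int:
    """A row store fetches whole rows, so `wanted` does not reduce the I/O."""
    return sum(len(row) for row in row_store)


def bytes_read_column_layout(wanted: list[int]) -> int:
    """A column store fetches only the columns the query names."""
    return sum(len(column_store[c]) for c in wanted)


def read_column_from_rows(c: int) -> list[int]:
    """Pick one value out of every row."""
    return [VALUE.unpack_from(row, c * VALUE.size)[0] for row in row_store]


def read_column(c: int) -> list[int]:
    """Decode one contiguous column."""
    return [v[0] for v in VALUE.iter_unpack(column_store[c])]


query_columns = [3, 42, 499]
print("row layout bytes:   ", bytes_read_row_layout(query_columns))
print("column layout bytes:", bytes_read_column_layout(query_columns))
```

Here the row layout touches 4,000,000 bytes and the column layout 24,000 for the same three-column query, while both return identical values.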
Vectorized execution engine. Instead of processing row-by-row, ClickHouse processes data in vectors: batches of column values small enough to stay in CPU cache. This enables SIMD (Single Instruction, Multiple Data) CPU operations and amortizes per-row overhead, such as function calls and branch mispredictions, across a whole batch. Combined with JIT compilation (which ClickHouse uses alongside vectorization, not instead of it), expression evaluation runs close to hardware speed.
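Pure Python cannot demonstrate SIMD, but it can show the structural difference between the two execution models. The sketch below uses a hypothetical `CountingOps` class and an arbitrary batch size of 1,024 to count how often an operator is dispatched when the same sum of products is computed row by row versus batch by batch:

```python
from array import array

BATCH_SIZE = 1024  # hypothetical batch size, chosen for illustration


class CountingOps:
    """Tiny operator set that counts how many times an operator is dispatched."""

    def __init__(self) -> None:
        self.dispatches = 0

    def multiply_scalar(self, a: int, b: int) -> int:
        self.dispatches += 1
        return a * b

    def multiply_batch(self, a: array, b: array) -> array:
        self.dispatches += 1  # one dispatch covers the whole batch
        return array("q", (x * y for x, y in zip(a, b)))


def revenue_row_at_a_time(prices: array, quantities: array, ops: CountingOps) -> int:
    total = 0
    for p, q in zip(prices, quantities):
        total += ops.multiply_scalar(p, q)
    return total


def revenue_vectorized(prices: array, quantities: array, ops: CountingOps,
                       batch_size: int = BATCH_SIZE) -> int:
    total = 0
    for start in range(0, len(prices), batch_size):
        end = start + batch_size
        products = ops.multiply_batch(prices[start:end], quantities[start:end])
        total += sum(products)
    return total


prices = array("q", range(1, 100_001))
quantities = array("q", (i % 7 + 1 for i in range(100_000)))

row_ops, batch_ops = CountingOps(), CountingOps()
row_total = revenue_row_at_a_time(prices, quantities, row_ops)
batch_total = revenue_vectorized(prices, quantities, batch_ops)

print("same answer:", row_total == batch_total)
print("row-at-a-time dispatches:", row_ops.dispatches)
print("vectorized dispatches:   ", batch_ops.dispatches)
```

For 100,000 rows the row-at-a-time version dispatches the operator 100,000 times and the batched version 98 times. In a compiled engine, the tight loop inside each batch is what the compiler and CPU can turn into SIMD instructions.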
MergeTree storage engine. Data is sorted by primary key on disk, and a sparse index records the key at regular intervals (every 8,192 rows by default). Range queries on primary key columns skip irrelevant data entirely: not row by row, but in whole blocks and whole data parts. Data skipping indexes extend this to non-primary-key columns.
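A minimal sketch of how a sparse index over sorted data lets a range query skip most rows. It is a simplification, not MergeTree's implementation: it assumes a single integer key, one data part, and blocks ("granules") of 8,192 rows, which is ClickHouse's default `index_granularity`. The function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 8192  # ClickHouse's default index_granularity


def build_sparse_index(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep only the first key of each granule, not one entry per row."""
    return sorted_keys[::granule_rows]


def granules_for_range(marks, low, high):
    """Inclusive (first, last) granules that may hold keys in [low, high], or None."""
    if not marks or high < low or high < marks[0]:
        return None
    # Step back one granule: the granule before the first mark >= low may
    # still end with keys equal to or greater than low.
    first = max(bisect_left(marks, low) - 1, 0)
    # Any granule whose first key is > high cannot contain a match.
    last = bisect_right(marks, high) - 1
    return first, last


def range_scan(sorted_keys, marks, low, high, granule_rows=GRANULE_ROWS):
    """Return (matching keys, number of rows actually read)."""
    span = granules_for_range(marks, low, high)
    if span is None:
        return [], 0
    first, last = span
    candidates = sorted_keys[first * granule_rows:(last + 1) * granule_rows]
    return [k for k in candidates if low <= k <= high], len(candidates)


keys = list(range(100_000))          # already sorted, like data on disk
marks = build_sparse_index(keys)     # 13 index entries for 100,000 rows
matches, rows_read = range_scan(keys, marks, 20_000, 20_100)

print("index entries:", len(marks))
print("rows matched: ", len(matches))
print("rows read:    ", rows_read)
```

The query reads one granule (8,192 of 100,000 rows) to find its 101 matches. The index is deliberately conservative: it may read a granule that turns out to hold no matches, but it never skips one that does.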
Purpose-built compression codecs. ClickHouse ships with codecs tuned for time-series, low-cardinality, and delta-encoded data. A row that takes 600 bytes in Elasticsearch often compresses to 60 bytes in ClickHouse — a 10x reduction. Less data to read = faster queries.
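To illustrate why a specialized encoding helps before general-purpose compression runs, the sketch below delta-encodes a column of synthetic timestamps and compresses both versions. The timestamps are made up, and `zlib` stands in for the LZ4 and ZSTD codecs ClickHouse actually uses, so the sizes are indicative only:

```python
import struct
import zlib


def pack_int64(values):
    """Serialize integers as little-endian 64-bit values."""
    return struct.pack(f"<{len(values)}q", *values)


def delta_encode(values):
    """Store the first value, then each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    values, running = [], 0
    for d in deltas:
        running += d
        values.append(running)
    return values


# Hypothetical event timestamps: one event every 1 to 3 seconds.
timestamps = [1_700_000_000]
for i in range(1, 50_000):
    timestamps.append(timestamps[-1] + (i % 3) + 1)

raw_compressed = zlib.compress(pack_int64(timestamps))
delta_compressed = zlib.compress(pack_int64(delta_encode(timestamps)))

print("uncompressed bytes:    ", len(pack_int64(timestamps)))
print("compressed, raw:       ", len(raw_compressed))
print("compressed, delta first:", len(delta_compressed))
```

Delta encoding turns large, ever-changing numbers into small, repetitive ones, which is exactly the input a general-purpose compressor handles best. The encoding is lossless: `delta_decode(delta_encode(timestamps))` returns the original list.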
Aggressive parallelism. By default, queries parallelize across all available cores and across all shards in a cluster. Parallel execution is the normal mode of operation rather than something a planner has to opt into.
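The pattern behind this is split, aggregate partially, then merge. The sketch below shows the shape of it for an average, with hypothetical helper names. It uses a thread pool for portability, so in CPython it illustrates the structure rather than a real speedup; ClickHouse does this in native code across cores and shards.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, num_chunks):
    """Split a list into num_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_aggregate(chunk):
    """Each worker returns a mergeable partial state: (sum, count)."""
    return sum(chunk), len(chunk)


def parallel_average(values, num_workers=None):
    """Average computed from per-chunk partial states merged at the end."""
    num_workers = num_workers or os.cpu_count() or 1
    chunks = split_into_chunks(values, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_aggregate, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


print(parallel_average(list(range(1, 101)), num_workers=4))
```

The key design point is that the partial state is (sum, count) rather than a per-chunk average: averages of unequal chunks cannot be merged correctly, but sums and counts can.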
The result: ClickHouse can process hundreds of millions of rows per second on commodity hardware.
1. It was started by one engineer for one product: Russia's Google Analytics.
ClickHouse wasn't a database company's moonshot. It was Alexey Milovidov at Yandex, trying to make Yandex.Metrica generate real-time reports on 12 billion daily events. No VC funding. No startup story. Just an engineer who couldn't find a database that worked and built one.
2. It took three years just to prove the concept was possible.
The hypothesis — "can you query non-aggregated data in real time at this scale?" — launched around 2009. It didn't go into production until 2012. Most engineers would have given up on an internal experiment that took three years before showing results.
3. It was running petabytes of production data inside Yandex for four years before anyone in the West heard about it.
When ClickHouse was open sourced at a Moscow conference in June 2016, it was already handling 20+ trillion rows and 100 billion daily insertions. Silicon Valley "discovered" it roughly two years later.
4. One of the co-founders who signed the Ukraine war statement had watched the buildings he visited as a child get bombed.
Yury Izrailevsky, ClickHouse Inc.'s President, came from a Ukrainian Jewish family. His statement — written weeks after the invasion, during a company-wide relocation of Russian engineers to Amsterdam — was personal in a way that corporate war statements rarely are.
5. Yandex itself was an investor in the spinout.
When ClickHouse Inc. raised its $50M Series A in September 2021 — incorporating as a Delaware company specifically to create distance from Russia — Yandex N.V. participated as an investor alongside Index Ventures and Benchmark. Five months before the invasion, the company that needed to separate from Russia was still partly owned by it.
| Fact | Detail |
|---|---|
| Origin | Yandex web analytics division, Moscow, ~2009 experimental project |
| Creator | Alexey Milovidov, still CTO of ClickHouse Inc. |
| Problem solved | Yandex.Metrica: 12B events/day, real-time custom reports, no pre-aggregation |
| Predecessor | OLAPServer (2009) — numbers only, no real-time updates |
| First production | 2012 at Yandex |
| Peak Yandex scale | 374 servers, 20.3 trillion rows, 2 PB compressed, 100B records/day inserted |
| Open sourced | June 2016, Highload++ conference, Moscow, Apache 2.0 |
| Company founded | September 2021, Delaware, San Francisco HQ |
| Series A | ~$50M, Index Ventures + Benchmark (lead), Yandex participated |
| Series B | $250M, $2B valuation, Oct 2021, Coatue + Altimeter (lead) |
| Latest round | $400M, $15B valuation, Jan 2026, Dragoneer (lead) |
| Ukraine war | Most Russian engineers relocated to Amsterdam before the Feb 24, 2022 invasion; the rest followed soon after |
| Co-founders | Alexey Milovidov (CTO), Aaron Katz (CEO), Yury Izrailevsky (President) |
| Cloudflare scale | 6M HTTP req/sec, 11M rows/sec insertion, 2018 |
| Architecture | Columnar + vectorized execution + MergeTree + compression codecs |
| Speed claim | 100-1000x faster than traditional approaches on suitable workloads |
| GitHub | 46,000+ stars, 2,765+ contributors |
| Notable users | Cloudflare, Uber, Spotify, ByteDance, Ahrefs, Bloomberg, Disney+, eBay |
Sources: ClickHouse official blog, ClickHouse Docs (history page), Cloudflare engineering blog, Habr/Yandex open source announcement (2016), TechCrunch, ClickHouse "We Stand With Ukraine" (March 2022), ClickHouse "Introducing ClickHouse Inc." (September 2021), ClickHouse Series B announcement (October 2021).