Somewhere around 2013, Paul Dix was staring at his own code and realizing he'd built the wrong thing.
Errplane was supposed to be a monitoring product — a SaaS tool that would catch application errors, plot server metrics, and alert engineers when something broke. He and co-founder Todd Persen had built it, applied to Y Combinator with it, gotten in, and spent months trying to get developers to pay for it. The product was fine. The market was stubborn.
But underneath the monitoring product, there was something else. A layer that ingested timestamped data points at massive speed, compressed them, stored them efficiently, and retrieved them by time range. A purpose-built storage engine for a specific kind of data: data that always had a clock attached to it. Data that arrived in sequence and needed to be queried backward through time.
And Dix kept noticing that every DevOps team, every IoT company, every financial data engineer — they were all building some version of this layer on their own. In isolation. Redundantly. Treating it as a side problem while they focused on their actual product.
That was the moment the product changed. Not Errplane. The thing underneath Errplane.
There was only one problem: the thing he was building didn't have a name. There was no term for it in the industry. No analyst category. No conference track. No search query that would pull up what he was looking for.
So he invented one.
He called it a time series database.
Paul Dix was not a database researcher. He was not coming out of academia with a PhD in storage systems. He was a working software engineer — the kind who builds production systems, gets burned by the tools available, and eventually decides to fix them.
Before InfluxDB, he was known for a different kind of output: books. In 2010, O'Reilly published Service Oriented Design with Ruby and Rails — a technical text Dix wrote for engineers building distributed systems in the late Rails boom. It was a careful, practitioner-focused book for a practitioner's problem. It was not glamorous. It was useful.
That biography — technical author, working engineer, Ruby and Rails ecosystem — is a specific kind of founder. Not a researcher-turned-founder, not a VC-magnet with a pitch deck, but someone who had spent years explaining things to other engineers, who thought in systems, and who wrote down what he figured out.
It's the kind of person who, when he can't find a name for a category, makes one up. And moves on.
In 2012, Dix and Persen registered Errplane as a company in New York. The product they pitched was positioned as real-time application monitoring — a startup-grade alternative to what New Relic and Datadog were building for enterprise. You'd instrument your application, send events and metrics to Errplane, and see what was breaking.
They got into Y Combinator in 2013. On their YC application, buried in the secondary ideas section, Dix mentioned "an open source time series database." He didn't lead with it. It was an afterthought.
The monitoring product didn't gain traction. The database idea wouldn't leave him alone.
In September 2013, Dix flew to Berlin for Monitorama — a conference for the observability and monitoring community. He expected to find competitors. He found a pattern instead.
In the room, there were two kinds of people: employees at monitoring companies guarding their time series storage implementations as proprietary intellectual property, and engineers at large organizations who had built their own internal time series systems because nothing off the shelf worked well enough.
Everybody in both groups had solved the same problem. Nobody had shared their solution.
Dix drew a conclusion he would later articulate in a talk: "When everyone is repeating the same work, it's not key intellectual property. It's a barrier to entry."
He flew home and told Persen they were pivoting.
In late September 2013, Dix, Persen, and one additional engineer locked themselves in a room and carved the time series storage layer out of Errplane. They stripped it down to its essential function — write timestamped data at high speed, store it efficiently, query it by time range — and rebuilt it as a standalone open-source database.
The first commit to what would become InfluxDB landed on September 26, 2013.
Dix wrote documentation. He started giving talks at NYC developer meetups in November 2013. In those talks, he used the phrase "time series database" as if it had always existed. He needed a label and didn't have one, so he built one out of the two most obvious descriptive words available.
The documentation site hit the front page of Hacker News and stayed there for most of the day. O'Reilly Radar picked it up. Within a year, InfluxDB had 2,500 active servers in production, 3,000 GitHub stars, 47 contributors, and 17 client libraries. Heroku, Google's cAdvisor, and OpenStack were already using it.
The demand was not manufactured. It had been bottled up for a decade.
When Dix and the team chose a language for InfluxDB, they picked Go.
In late 2013, this was not the safe choice. Go was three years old. It had a minimal third-party ecosystem. It was fast, yes — garbage collected, but noticeably faster than Ruby or Python for server workloads — and it had excellent concurrency primitives, which mattered for a database that needed to handle thousands of simultaneous writes. But it was unproven for production infrastructure at serious scale.
Building InfluxDB in Go turned out to be a bet that paid dividends in both directions. For InfluxDB: Go's compiled binaries meant no runtime dependencies, trivial cross-platform distribution, predictable memory behavior. For Go: InfluxDB became one of the first high-profile, production-grade open-source infrastructure projects written entirely in the language. Engineers who heard about InfluxDB looked at its code and learned Go. Blog posts about InfluxDB's architecture became primers on Go's concurrency model.
Go was trying to establish itself as a serious language for systems programming. InfluxDB helped it do that. The relationship went both ways.
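To make the concurrency point concrete, here is a minimal, purely illustrative Go sketch of the pattern the language makes cheap: one goroutine per writer, all feeding a shared batching channel that a single consumer drains. The names and numbers are hypothetical; this is not InfluxDB's actual ingest path.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// point is a simplified stand-in for a timestamped data point.
type point struct {
	series string
	value  float64
	ts     time.Time
}

func main() {
	in := make(chan point, 1024) // shared buffer between writers and the batcher
	var wg sync.WaitGroup

	// Thousands of concurrent writers are just goroutines feeding one channel.
	for w := 0; w < 1000; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			in <- point{series: fmt.Sprintf("cpu,host=host-%d", id), value: 0.42, ts: time.Now()}
		}(w)
	}

	// A single batcher drains the channel and flushes in groups, which is the
	// shape of work a storage engine wants to receive.
	done := make(chan struct{})
	go func() {
		batch := make([]point, 0, 500)
		for p := range in {
			batch = append(batch, p)
			if len(batch) == cap(batch) {
				fmt.Printf("flushing %d points\n", len(batch))
				batch = batch[:0]
			}
		}
		fmt.Printf("final flush of %d points\n", len(batch))
		close(done)
	}()

	wg.Wait()
	close(in)
	<-done
}
```

The point is less the code than the lack of ceremony: expressing the same shape of work in the systems languages of 2013 typically meant hand-built thread pools, locks, and queues.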
The first version of InfluxDB used LevelDB, Google's embedded key-value store built on a log-structured merge tree. It was a reasonable default. It was wrong for the workload.
LevelDB splits data across many small files. InfluxDB's model created a separate "series" for every unique combination of measurement name and tag values. A real production deployment with dozens of databases and months of data could open hundreds of LevelDB instances simultaneously — and run out of operating system file handles. The database would crash. Not gracefully.
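A hypothetical sketch of the mechanism behind that failure: every distinct combination of measurement and tag values yields its own series key, so the series count multiplies with each new tag value, and under the early LevelDB layout more data meant more open files. The code below is illustrative only, not InfluxDB internals; the tag names and counts are invented.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey builds a canonical key from a measurement name and its tag set.
// One unique key means one series the engine must track.
func seriesKey(measurement string, tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // canonical order so the same tag set always produces the same key
	parts := []string{measurement}
	for _, k := range keys {
		parts = append(parts, k+"="+tags[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	unique := map[string]struct{}{}
	// Hypothetical fleet: 200 hosts x 3 regions x 50 container IDs.
	for h := 0; h < 200; h++ {
		for r := 0; r < 3; r++ {
			for c := 0; c < 50; c++ {
				k := seriesKey("cpu", map[string]string{
					"host":         fmt.Sprintf("host-%d", h),
					"region":       fmt.Sprintf("region-%d", r),
					"container_id": fmt.Sprintf("c-%d", c),
				})
				unique[k] = struct{}{}
			}
		}
	}
	fmt.Println("unique series:", len(unique)) // 30,000 series from a single measurement
}
```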
The team replaced LevelDB with BoltDB — a pure-Go B+ tree store built on memory-mapped files. Beautiful. Elegant. It failed differently.
BoltDB's fatal flaw was write performance. Time series data looks sequential, but it isn't — thousands of individual series update simultaneously. From BoltDB's perspective, those writes scattered randomly across the tree structure. As the database grew past a few gigabytes, disk IOPS spiked and write throughput collapsed.
The third engine — built entirely in-house — was the one that worked. The Time Structured Merge Tree, announced in 2015 as TSM, was the LSM Tree concept rebuilt specifically for time series access patterns: columnar storage so querying one field didn't read all others, time-aware compaction so temporal blocks merged as data aged, compression ratios that achieved roughly 2 bytes per stored data point versus Graphite's 12 bytes per point. Forty-five times less disk space than the previous version on the same dataset.
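A back-of-envelope sizing sketch shows what those per-point byte counts mean at monitoring scale. The fleet size and sample rate below are invented for illustration:

```go
package main

import "fmt"

func main() {
	// Hypothetical fleet, for illustration only: 10,000 series sampled every 10 seconds.
	const (
		series        = 10_000
		samplesPerDay = 86_400 / 10 // 8,640 samples per series per day
	)
	points := int64(series) * samplesPerDay // 86.4 million points per day
	fmt.Printf("TSM      (~2 B/point):  %.1f GB per month\n", float64(points*2*30)/1e9)
	fmt.Printf("Graphite (~12 B/point): %.1f GB per month\n", float64(points*12*30)/1e9)
}
```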
The storage engine was rewritten twice before it worked correctly. The category was still being named while the infrastructure underneath kept failing.
Dix understood that a database alone was not a business. The ecosystem around it mattered as much as the core.
Between 2015 and 2017, InfluxData released three companion projects that transformed InfluxDB from a standalone data store into a complete observability pipeline:
Telegraf — an open-source, plugin-based metrics collection agent. You installed one binary. It spoke to 200+ input sources: system metrics, application frameworks, database systems, cloud providers, industrial protocols. It wrote everything to InfluxDB. For a DevOps team trying to get metrics off dozens of different systems, Telegraf answered the first question: how do I actually get data in? (A hand-rolled version of that write appears in the sketch below.)
Chronograf — a web-based visualization and dashboarding tool. Not as powerful as Grafana, but native to InfluxDB, and designed by people who understood the query model.
Kapacitor — a real-time stream processing and alerting engine. It could run continuous queries against incoming data, apply anomaly detection, and fire alerts to PagerDuty, Slack, HipChat, or email.
Together: Telegraf, InfluxDB, Chronograf, Kapacitor. The TICK Stack. One acronym for a complete pipeline from data collection to visualization to alerting.
For an industry full of engineers stitching together five different tools that barely talked to each other, this was significant. The TICK Stack gave DevOps teams something coherent: a full pipeline with a single vendor responsible for making it work end to end.
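To ground the "how do I actually get data in" question: everything in the pipeline ultimately speaks InfluxDB's line protocol, and a write is just an HTTP POST. The sketch below hand-rolls what Telegraf automates, assuming a hypothetical local InfluxDB 1.x instance and a placeholder database name.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
	"time"
)

func main() {
	// One point in line protocol: measurement,tag=value field=value timestamp(ns).
	line := fmt.Sprintf("cpu,host=server01,region=us-west usage_idle=87.3 %d",
		time.Now().UnixNano())

	// Hypothetical local endpoint; InfluxDB 1.x exposes /write?db=<database>.
	resp, err := http.Post(
		"http://localhost:8086/write?db=telegraf",
		"text/plain",
		strings.NewReader(line),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("write status:", resp.Status) // 204 No Content on success
}
```

Telegraf's value was doing this for you continuously: batching writes and translating 200+ input formats into that one protocol.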
TSM was fast. But it had a structural ceiling.
InfluxDB 1.x kept its index in memory. Every unique combination of measurement name plus tag key plus tag value lived in RAM. For traditional infrastructure monitoring — a fixed set of hosts, a fixed set of metrics — this was fine.
Then Kubernetes arrived.
Kubernetes made infrastructure ephemeral. Containers spun up, ran for minutes, and died. Each one created new tag combinations: container IDs, pod names, deployment hashes, git commit SHAs embedded as labels. In a Kubernetes cluster under active deployment, the number of unique series could grow by millions in a single week. None of those old series would be queried again. But they stayed in memory.
Users running Kubernetes monitoring on InfluxDB found their memory consumption growing indefinitely. A database for metrics was consuming gigabytes of RAM just to remember the names of containers that had stopped running months ago.
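A deliberately hypothetical back-of-envelope estimate of what that meant in practice; both figures below are invented for illustration:

```go
package main

import "fmt"

func main() {
	// Deliberately hypothetical figures, for illustration only.
	const (
		deadSeries  = 20_000_000 // series left behind by containers that no longer exist
		bytesPerKey = 150        // rough in-memory cost of one indexed series key
	)
	ramBytes := int64(deadSeries) * bytesPerKey
	fmt.Printf("~%.1f GB of RAM indexing series that will never be queried again\n",
		float64(ramBytes)/1e9)
}
```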
The fix — the Time Series Index, or TSI — moved the index from memory to disk-backed memory-mapped files. Stated goal: a billion unique series without a RAM ceiling. It shipped in 2017. It worked well enough to buy time. It did not solve the deeper architectural problem, which was that the Go-based TSM engine, as designed, had inherent limits that could not be patched away.
The real fix would require starting over.
In 2020, InfluxDB 2.0 shipped with Flux — a brand-new query language designed by InfluxData to replace InfluxQL, the SQL-like language InfluxDB had used since the beginning.
The instinct was not unreasonable. InfluxQL had limitations. It couldn't easily join across measurements. It had no concept of transformations or custom functions. A purpose-built language for time series pipelines, designed as a functional data scripting language, sounded like progress.
The execution was a disaster.
Users were told they needed to complete coursework through "InfluxDB University" before they could write queries. The syntax was unfamiliar enough that experienced engineers needed hours to accomplish what they'd been doing in minutes with InfluxQL. Migration guides were long. The learning cost was high and the payoff was unclear.
The community resisted. Vocal users called it overengineered. But InfluxData pressed forward. Users who decided to commit to the new platform spent months learning Flux, migrating their applications, rewriting their dashboards.
Then InfluxDB 3.0 deprecated Flux entirely and replaced it with SQL.
The users who had migrated to Flux, and only to Flux, were left holding a skill with no future. The total migration effort — from InfluxQL to Flux to SQL — was described by multiple users as comparable in workload to simply migrating to a different time series database altogether.
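For a sense of what each migration actually asked of users, here is the same question, the recent mean of a CPU field, sketched in all three languages. The queries are shown as Go string constants purely for illustration, and the exact syntax is approximate and version-dependent.

```go
package main

import "fmt"

// The same question, the mean of cpu usage_idle over the last hour, asked three ways.
// Syntax is approximate and varies by version; shown for comparison only.
const (
	influxQL = `SELECT MEAN("usage_idle") FROM "cpu"
WHERE time > now() - 1h GROUP BY time(5m)`

	flux = `from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_idle")
  |> aggregateWindow(every: 5m, fn: mean)`

	sql3 = `SELECT date_bin(INTERVAL '5 minutes', time) AS window, AVG(usage_idle)
FROM cpu WHERE time > now() - INTERVAL '1 hour' GROUP BY 1`
)

func main() {
	fmt.Println("InfluxQL (v1):\n" + influxQL)
	fmt.Println("\nFlux (v2):\n" + flux)
	fmt.Println("\nSQL (v3):\n" + sql3)
}
```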
Flux is the chapter InfluxDB would redo if it could.
The phrase Paul Dix had started using in meetup talks in November 2013 entered the analyst vocabulary within two years.
DB-Engines, which tracks database popularity and category adoption, created a "time series database" classification. The category grew faster than any other segment in the database industry for two consecutive years. By 2017, "time series database" appeared in Gartner reports, conference keynotes, and job descriptions at companies that had never heard of Monitorama Berlin.
The category was named by a founder who needed a word for what he was building. Once named, it became something people could search for, evaluate, and buy. The naming act was not incidental to InfluxDB's growth — it was structural. You cannot grow a market you haven't given a name.
Between 2019 and 2022, the deployment list grew striking in its variety:
Tesla: Over 50,000 Powerwalls monitored in real time. Tesla's manufacturing lines generated 1.5 terabytes of time-series data daily from 50,000 sensors across factory floors. InfluxDB's tag-based indexing let engineers filter across thousands of dimensions instantly.
CERN: The monitoring stack for the Large Hadron Collider runs on InfluxDB, covering detector arrays, accelerator infrastructure, and data center monitoring — instrumentation for data acquisition systems that move on the order of 3.4 terabytes per second.
NASA, Visa, Airbus, Siemens, PayPal, IBM, Salesforce: All confirmed customers by 2022.
InfluxDB Cloud: Adding over 6,000 new users per month by 2023.
This range — from particle physics to payment processing to EV batteries — was the point. Timestamps were not a niche data structure. They were everywhere anything moved, was measured, or needed to be observed over time. The category Paul Dix had flown home from Berlin to name in 2013 was not a niche. It was a horizontal.
As InfluxDB was cementing itself as the time series database for longer-duration historical storage and complex analytics, a different project had quietly won the Kubernetes metrics war.
Prometheus, born inside SoundCloud in 2012 and donated to the Cloud Native Computing Foundation in 2016, had become the de facto standard for cloud-native metrics collection. Its pull-based model, where Prometheus scraped metrics directly from application endpoints, fit Kubernetes's architecture perfectly. Its query language, PromQL, was purpose-built for the alert patterns Kubernetes operators needed.
By 2018-2019, if you were running Kubernetes, you ran Prometheus. It was not a competition — it was a default.
InfluxDB and Prometheus were not identical tools, but they overlapped enough that every DevOps team had to decide between them, or figure out how to run both. Prometheus won the metrics collection and alerting tier in cloud-native environments. InfluxDB retained the longer-duration storage tier, the IoT sensor tier, and the use cases that needed higher cardinality, richer query analytics, or write rates beyond what Prometheus was designed for.
The official response was pragmatic: InfluxDB added Prometheus remote write support, so Prometheus users could forward data to InfluxDB for long-term storage. Instead of competing head-on, InfluxDB positioned itself as where your Prometheus data lived after the default 15-day retention window expired. Coexistence rather than displacement.
The honest assessment: Prometheus won the Kubernetes-native metrics mindshare permanently. InfluxDB lives upstream and downstream of that fact.
In November 2020, Paul Dix announced InfluxDB IOx.
The name was a nod to chemistry: iron oxide, the compound in rust. The new engine would be written in Rust. All of the existing Go codebase — hundreds of thousands of lines representing seven years of engineering work — would eventually be retired.
IOx was not a minor refactor. It was a complete architectural reinvention.
The stack Dix chose: Apache Arrow for in-memory columnar data representation, Apache Parquet for durable on-disk columnar storage, Apache DataFusion as the query execution engine, and Arrow Flight as the query transport protocol. He gave this combination a name — the FDAP stack — and bet the company on it.
The motivation was honest. Go had served InfluxDB 1.x well. But it was not going to solve the problems that needed solving:
The custom TSM engine could not scale to unlimited cardinality. Go's garbage collector introduced latency spikes under heavy load — acceptable for application code, problematic for a database that needed deterministic write latency. TSM was a proprietary format that nothing else could read — no interoperability with Snowflake, Databricks, data lakes, or analytical tools. And SQL support could not be retrofitted onto the architecture without fundamental redesign.
Rust eliminated the garbage collector. Arrow gave InfluxDB a columnar in-memory format with a massive open ecosystem. Parquet gave InfluxDB a storage format that every major data platform already spoke. A query against InfluxDB 3.0 data could be answered directly by DuckDB, Snowflake, or any Arrow-native client.
The community reaction was complicated.
InfluxDB's open-source users had built real production systems on the 1.x and 2.x codebase. The announcement that the engine was being replaced — that all of this was a multi-year project with no guaranteed timeline — created anxiety about the roadmap. Flux had already damaged trust. Now a full rewrite on top of Flux deprecation sent a signal that InfluxData was willing to leave users behind in pursuit of architectural correctness.
Some users migrated to TimescaleDB, which was built on PostgreSQL and therefore carried an implicit stability guarantee. Some moved to QuestDB. Some waited.
Paul Dix's admission, years into the project: "When we selected these projects as the database core in 2020, it wasn't evident that they would be adopted as broadly as they have been." He bet on a stack before it was obvious, and the stack proved him right. But the years of uncertainty cost community trust.
InfluxDB 3.0 arrived in 2023. The claims versus the previous generation were not modest:
100x faster queries on analytical workloads. 10x ingest performance. Unlimited cardinality — no more memory ceiling. Native SQL support, plus InfluxQL compatibility for version 1 migrations. Tiered storage: hot data in local memory, cold data in object storage using any S3-compatible bucket. Parquet on disk — InfluxDB data directly readable by Snowflake, Databricks, DuckDB, or any Arrow client without ETL.
The FDAP bet had paid off technically. Whether it paid off commercially — whether the years of rewrite cost more in community trust than the performance gains recovered — is a question the market is still answering.
InfluxDB 3.0 had a complicated licensing story. Earlier versions of the 3.x codebase were not fully open source — a decision that generated significant criticism from a community that had been built on open-source principles.
On January 13, 2025, InfluxData announced InfluxDB 3 Core in public alpha under a dual MIT/Apache 2 license: the most permissive licensing in the project's history. Enterprise features — high availability, read replicas, historical query beyond the rolling window — live in the paid tier. The core engine is free, permanently.
By 2025, InfluxData commanded an estimated 58% market share in the time series database category. Total funding exceeded $200 million. Revenue reached $75 million in 2024.
The company that started as a pivot from a failed SaaS monitoring tool, that named a category in meetup talks, that rewrote its storage engine three times, that survived a query language fiasco and a four-year architectural reinvention — now owns the category it named.
Paul Dix spent eleven years proving that the thing underneath Errplane was the actual product. He was right. He just needed most of the decade to prove it completely.
1. Paul Dix didn't discover the time series category — he named it.
Before Dix started giving talks in New York in November 2013, there was no agreed-upon category for databases designed around timestamped sequential data. RRDtool, Graphite, and OpenTSDB all existed, but "time series database" had not hardened into a label the industry organized around: the phrase didn't appear in DB-Engines category listings, and it wasn't in analyst reports. Dix started using it because he needed a name for what he was building — not because it was an established term he was borrowing. He named a category and then grew into it. Within two years, "time series database" was the fastest-growing segment in the database industry. Naming is strategy. Dix named his way into a market.
2. Paul Dix was a technical book author before he was a database founder.
His first published work was "Service Oriented Design with Ruby and Rails" — an O'Reilly book from 2010 for engineers building distributed systems in the Rails ecosystem. This matters because it reveals who Dix was before InfluxDB: someone who built things in production, encountered problems, thought through them carefully enough to write them down, and shared them. The instinct that made him coin "time series database" in a meetup talk — the instinct to name and explain things — was already there in 2010 when he was explaining SOA patterns to Ruby developers. InfluxDB was written by a man who thought of technical concepts as things that needed clear names before they could be built.
3. The pivot from Errplane was made at a conference in Berlin after watching competitors hoard solutions nobody owned.
The moment Dix decided to open-source the database instead of building another monitoring product was not a business strategy session. It was a conference observation. At Monitorama Berlin in September 2013, he saw engineers from different companies describing the same internal time series system they'd each built privately. Same problem. Same solution. Built in isolation. Treated as proprietary. His conclusion: something that every company is rebuilding independently is not intellectual property — it's a shared infrastructure problem waiting for someone to solve it publicly. He flew home, took five weeks, and open-sourced the database. The pivot from Errplane to InfluxDB was not a business pivot. It was a philosophical one.
4. InfluxDB was one of the projects that helped Go prove itself as a serious systems language.
When Dix chose Go in late 2013, the language was three years old and not obviously suited for production database infrastructure. The decision to write a high-performance, production-grade, open-source database in Go gave the language a flagship reference implementation. Engineers who read about InfluxDB's architecture in blog posts were reading their first serious Go code. InfluxDB's performance numbers were evidence that Go could compete with C++ in write-throughput-sensitive workloads. It became a proof of concept for the language itself — at a critical moment when Go needed exactly that kind of credibility to attract systems programmers away from C and C++.
5. The IOx rewrite put InfluxDB into open conflict with its most loyal users — and Dix did it anyway.
The four-year Apache Arrow/Parquet/Rust rewrite was not popular inside the community that had built production systems on InfluxDB 1.x and 2.x. It arrived on top of the Flux language fiasco. Users who had migrated from InfluxQL to Flux watched Flux get deprecated before IOx was even finished. Community forums filled with warnings to new users not to invest in the current platform. Some of the most experienced InfluxDB practitioners migrated to competitors. Dix knew this was happening and pressed forward anyway — on the conviction that the architectural bet was correct and that a database built on Parquet and Arrow would eventually be worth more than the community goodwill lost during the transition. The technical gamble appears to have paid off. The community cost was real.
| Data Point | Value |
|---|---|
| Founded | 2012 as Errplane; pivoted to InfluxDB in 2013 |
| First commit | September 26, 2013 |
| Language (v1/v2) | Go |
| Language (v3 / IOx) | Rust |
| Storage engines | LevelDB → BoltDB → TSM → IOx (three rewrites) |
| The TICK Stack | Telegraf, InfluxDB, Chronograf, Kapacitor |
| Query languages | InfluxQL → Flux → SQL (two pivots) |
| IOx announcement | November 2020 |
| InfluxDB 3.0 | 2023 |
| Open-source license (v3 Core) | MIT/Apache 2 (Jan 2025) |
| Total funding | $200M+ |
| Revenue (2024) | ~$75M |
| Market share (estimated) | 58% of time series category |
| Paul Dix's book | "Service Oriented Design with Ruby and Rails" (O'Reilly, 2010) |
| Tesla deployment | 1.5 TB/day, 50,000 sensors |
| CERN deployment | Monitoring for a ~3.4 TB/s data acquisition pipeline |
The naming game — Paul Dix didn't invent the time series database. He invented the name for it. The category existed; it just had no label. He named it in a meetup talk in 2013 and two years later it was the fastest-growing segment in the database industry. This is what naming your category actually looks like: not a marketing exercise, but an engineer giving a word to something everyone was already building.
The book author who built the database — There's a specific kind of founder who writes technical books before they build companies. They think in systems. They explain things before they build them. Paul Dix wrote an O'Reilly book on SOA patterns in Ruby in 2010. Three years later he named a database category. The pattern is consistent: see a problem, understand it well enough to explain it, then build the thing that fixes it.
The Monitorama moment — The most important product decision Dix made wasn't an architecture choice or a funding round. It was flying to Berlin for a conference and noticing that everyone in the room had solved the same problem in private. One observation. One flight home. Five weeks of work. Category created.
Three storage engines, two pivots, one query language fiasco — InfluxDB's internal history is a master class in the cost of wrong bets. Three storage engines. Three query languages. A four-year architectural rebuild. The company is still standing because the underlying demand was real enough to absorb every mistake.
The Rust bet — In 2020, Dix decided to throw away the Go codebase and rebuild everything in Rust on top of Apache Arrow and Parquet. The community was not happy. The migration took four years. The resulting database is technically extraordinary. The question of whether the community cost was worth the architectural gain is still being answered.
CERN and 3.4 terabytes per second — The same database that started as stripped-out infrastructure from a failed SaaS monitoring tool now monitors Large Hadron Collider systems whose data acquisition pipelines move 3.4 terabytes per second. And Tesla's factory floors. And NASA. The demand Paul Dix noticed at a Berlin conference in 2013 turned out to be one of the deepest horizontal problems in the entire software industry.
Sources: InfluxData official blog, Open Source Underdogs podcast (Ep. 28), Monitorama Berlin 2013 talk references, InfoQ IOx/Rust rewrite coverage, Paul Dix conference talks (Strange Loop, GopherCon), DB-Engines category rankings, Crunchbase/Pitchbook funding records, CERN and Tesla deployment case studies, InfluxDB documentation history (v0.9, v1.x, v2.x, v3.x), O'Reilly catalog (Service Oriented Design with Ruby and Rails, 2010), TechCrunch Series E coverage.