Picture a product team in the autumn of 2013. They have just discovered, through a support ticket that got escalated three times before anyone understood what it meant, that a checkout flow they shipped three months ago has a subtle but devastating defect. Not a bug exactly — the page loads, the button works, the transaction completes. But there is a step, a particular sequence of actions that a certain kind of user takes when they arrive from mobile, that produces an entirely different experience than the one the team intended. The conversion rate for those users is half what it should be.
The question is simple: how long has this been happening? How many users hit this sequence? Where did they come from? What did they do before this moment and what did they do — or not do — after?
The team goes to their analytics platform. They have Mixpanel. They have defined their events carefully. They have funnels and cohorts and retention curves. They have, by any reasonable standard, a mature analytics setup.
But they did not think to instrument this particular sequence three months ago. Nobody thought to. It was a transition state, a micro-interaction in the middle of a larger flow, the kind of thing that only becomes meaningful once you discover it has been quietly destroying your conversion rate.
So the data does not exist.
The team can tell Mixpanel to start tracking this interaction today. The data will begin to accumulate. In six weeks, they will have enough to understand what is happening. In three months, they might have enough to know whether the fix they shipped actually worked.
But the three months before today — the three months during which this problem has been eating their revenue — those months are gone. That data was never captured. It cannot be recovered. You cannot go back in time.
This is the problem Heap was built to solve.
Matin Movassate and Ravi Parikh met at Stanford. They graduated and went their separate ways into the tech world, accumulating the kind of experience that eventually produces a startup: Movassate through product management at Facebook, where he watched enormous volumes of user data being analyzed and understood; Parikh through Palantir, the data analytics and intelligence company that had, by 2012, become one of the most technically sophisticated data organizations in the world.
Palantir was a particular kind of education. It was a company obsessed with making sense of vast, heterogeneous data — connecting datasets that were never designed to talk to each other, finding patterns across billions of records, making the invisible visible. Ravi Parikh worked there alongside another engineer who would later found a company competing directly with Heap: Spenser Skates, who would go on to co-found Amplitude in 2012. Two engineers at the same data-obsessed firm, shipping products that would one day occupy opposite corners of the same market. The analytics world is smaller than it looks.
Movassate and Parikh reunited in 2013 with a shared obsession that was becoming impossible to ignore. They had both spent years inside organizations that cared deeply about user behavior data. They had both watched product teams make decisions about their software based on incomplete, delayed, or retrospectively worthless information. They had both felt the specific frustration of the checkout team scenario above — not that version exactly, but the pattern: the thing you needed to know existed in the data, but the data had not been collected because nobody had told the system to collect it.
The question they kept returning to was: why does it have to work this way?
The dominant model in 2013 was what Mixpanel had pioneered: define your events upfront, instrument them carefully, and then analyze. This was, in its time, a revolutionary improvement over Google Analytics — you were measuring behavior rather than page views, actions rather than sessions. The quality of the thinking embedded in it was real. But it contained a structural assumption that nobody had questioned: you have to decide what to track before you can track it.
What if you didn't?
What if the system captured everything — every click, every tap, every page view, every form fill, every scroll — automatically, from the moment the snippet was installed? What if the data existed before you knew you needed it? What if, instead of defining your questions before your data collection, you could collect everything and then ask whatever questions you wanted afterward — including questions you would not think to ask for another three months?
This was the Heap thesis. It was simple, radical, and expensive. Capturing everything means storing everything. And storing everything, at the scale of real web traffic, is not a trivial engineering challenge. It would become Heap's defining technical problem for the next several years.
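To make the mechanism concrete, here is a minimal sketch of what an autocapture snippet does in a browser. It is illustrative only, not Heap's actual tracker: the endpoint URL, the payload shape, and the flush interval are all assumptions.

```typescript
// Minimal autocapture sketch (illustrative; not Heap's actual snippet).
// Delegated listeners record every interaction with enough context
// (selector path, page, timestamp) that it can be named and queried later.

interface CapturedEvent {
  kind: string;      // "click" | "submit" | "pageview" | ...
  selector: string;  // CSS-like path identifying the target element
  url: string;
  timestamp: number;
}

const buffer: CapturedEvent[] = [];

// Build a coarse CSS path so the raw event can be matched retroactively.
function cssPath(el: Element): string {
  const parts: string[] = [];
  for (let node: Element | null = el; node && parts.length < 5; node = node.parentElement) {
    parts.unshift(node.tagName.toLowerCase() + (node.id ? `#${node.id}` : ""));
  }
  return parts.join(" > ");
}

function capture(kind: string, target: Element): void {
  buffer.push({ kind, selector: cssPath(target), url: location.href, timestamp: Date.now() });
}

// One listener per interaction type: no per-feature instrumentation.
document.addEventListener("click", (e) => capture("click", e.target as Element), true);
document.addEventListener("submit", (e) => capture("submit", e.target as Element), true);
capture("pageview", document.documentElement);

// Periodically flush the buffer to a collection endpoint (URL is hypothetical).
setInterval(() => {
  if (buffer.length === 0) return;
  navigator.sendBeacon("https://collector.example.com/events", JSON.stringify(buffer.splice(0)));
}, 5000);
```

The important property is that nothing in this code knows about checkouts, campaigns, or funnels; the semantics arrive later, as names applied to data that already exists.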
They applied to Y Combinator in 2013. YC's batch records list Heap in the Winter 2013 cohort, though some sources describe it as a Summer 2013 company. The pitch was not complicated. It was the checkout scenario, stripped to its essence: what happens when a product team discovers a problem they forgot to instrument? With Heap, the answer is: nothing happens. The data was already there. You just didn't know you needed it yet.
YC said yes.
The early days of Heap were not about building a product. They were about building infrastructure.
The "capture everything" thesis sounds simple until you think about what it actually requires. A modern web application might generate thousands of discrete user interaction events per session. Across ten thousand daily active users, that is tens of millions of events every day. Across a million users, the math becomes almost incomprehensible. Every page view, every mouse click, every form interaction, every scroll position, every hover, every keystroke in a form field — stored, indexed, and made queryable in real time.
The storage implications were staggering. Mixpanel's model was efficient because it was selective: you told Mixpanel exactly which events to capture, and it only stored those. The footprint was manageable because the scope was predefined. Heap's model had no predefined scope. The scope was everything.
The engineering challenge was building a system that could capture this volume without degrading the performance of the sites it was installed on, store it without costs that would make the business economically impossible to run, and make it queryable fast enough to be useful. These were three separate, interrelated hard problems, and solving them simultaneously, under competitive pressure and on a limited runway, is the kind of strain that either produces brilliant engineering or breaks teams.
Heap's team did not break.
But the sales objection was as hard as the technical problem. When Heap went to market in 2013 with its $2 million seed round, the most common reaction from potential customers was not excitement about retroactive analysis. It was anxiety about data. "You're capturing everything? What does that mean for our users' privacy? What does that mean for our storage costs? What does that mean for the signal-to-noise ratio in our analytics? Isn't most of what users do irrelevant?"
These were fair questions. The "capture everything" approach, taken naively, produces an ocean of events with no structure — a firehose pointed at a bucket. The product insight that made Heap workable was not just the capture layer; it was the organizational layer on top. Heap let teams define and name events retroactively. You could install the snippet today, let it run for three months capturing everything, and then — after you discovered your checkout problem — go back and define the event "mobile-user-enters-checkout-via-campaign-X" and immediately see every instance of it across the entire three months of captured data.
This is what Heap called a retroactive dataset. And it was the product's killer feature — not because it was the most sophisticated thing in the analytics market, but because it solved the most emotionally resonant pain point: the thing you forgot to track.
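In code, the idea looks something like the following sketch. It assumes raw events shaped like the autocapture records sketched earlier; the type names and the matching rule are illustrative, not Heap's actual API. The crucial property is that the definition is evaluated at query time, over data that was captured with no knowledge of it.

```typescript
// A retroactive event definition, sketched. The raw log is immutable and was
// captured before this definition existed; applying the definition is a read,
// not a write. Names and shapes are illustrative, not Heap's actual API.
interface RawEvent {
  kind: string;      // "click", "submit", "pageview", ...
  selector: string;  // CSS-like path to the target element
  url: string;
  timestamp: number;
}

interface EventDefinition {
  name: string;
  matches: (e: RawEvent) => boolean;
}

// Defined today, three months after capture began.
const mobileCheckoutEntry: EventDefinition = {
  name: "mobile-user-enters-checkout-via-campaign-X",
  matches: (e) =>
    e.kind === "click" &&
    e.selector.endsWith("#checkout-button") &&
    e.url.includes("utm_campaign=X"),
};

// A scan here; in a real system, an indexed query. Either way, months-old
// occurrences surface the moment the definition is saved.
function occurrences(log: RawEvent[], def: EventDefinition): RawEvent[] {
  return log.filter(def.matches);
}
```

Because the definition is a view over the raw log rather than a schema imposed at write time, it can be renamed, corrected, or deleted without touching the stored data.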
The first significant external validation came quickly. In August 2013, Heap raised its $2 million seed round. In June 2014, the product expanded to iOS apps — extending the "capture everything" philosophy to mobile, where the event-definition burden on developers was, if anything, even heavier than on the web. In 2016, Heap raised $11 million in its Series A. By 2017, a $27 million Series B. By 2019, a $55 million Series C led by NewView Capital.
The company grew from a founding insight into a market-defining product, and each funding round expanded its ability to solve the infrastructure problem at the center of its thesis.
The feature that made Heap's philosophy concrete — that turned "capture everything" from a technical curiosity into a competitive moat — was what the analytics industry started calling retroactive analysis. Inside Heap, and among its power users, it had a more evocative name: time travel.
The concept is simple enough to explain to a non-technical executive in one sentence: because we captured everything, you can analyze behavior that happened before you knew you wanted to track it.
The emotional weight of that sentence, for anyone who has ever sat in a product review meeting staring at a gap in their data, is enormous.
Imagine a growth team at a SaaS company. They are in the middle of an A/B test. Three weeks into the test, one variant is clearly winning — not just on the primary metric but on a secondary pattern that nobody had anticipated: users who see this variant are completing a specific in-app action at twice the rate of the control group. This action was not in the original hypothesis. Nobody thought to track it. But because Heap was running, and because Heap had been capturing everything for eighteen months, the data exists. The team can go back to day one of the test, day one of Heap's deployment, and analyze this behavior as if they had predicted it all along.
This is retroactive analysis. This is time travel.
The competitors — Mixpanel, Amplitude, Google Analytics — could not do this. Not because they lacked the engineering capability, but because they had made a different philosophical bet: define first, capture second. Their architecture was built on the assumption that you know what you want to measure. Heap's architecture was built on the assumption that you don't — and won't, until you're staring at a problem that the data already contains the answer to.
The consequence was not merely product differentiation. It was a different relationship between analytics teams and their data. Heap users stopped thinking of their analytics setup as a configuration problem — something to be maintained and updated as new features shipped. They started thinking of it as an archive. The data was accumulating, continuously and automatically, and the work was not in collecting it but in interrogating it.
This changed how teams hired, how they structured their analytics functions, and how they thought about the cost of curiosity. In a Mixpanel world, every new question potentially required an engineer to add an event, deploy the change, and wait for data to accumulate. In a Heap world, the cost of a new question was the time to ask it. The data was already there.
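The difference between the two worlds reduces to where a single predicate runs. A schematic sketch, with function names that are ours rather than either vendor's API: in the define-first model the predicate gates the write, so rejected events are unrecoverable; in the capture-first model every event is written and the predicate gates only the read.

```typescript
// The same predicate, run at two different moments. Shapes and names are
// schematic, not either vendor's actual API.
interface RawEvent { kind: string; selector: string; url: string; timestamp: number; }

// Define-first (the Mixpanel-era model): the predicate gates the write.
// Anything it rejects is never stored and can never be recovered.
function ingestDefineFirst(e: RawEvent, isTracked: (e: RawEvent) => boolean, store: RawEvent[]): void {
  if (isTracked(e)) store.push(e);
}

// Capture-first (the Heap model): every event is stored unconditionally;
// the predicate gates only the read, so it can be added or changed years later.
function ingestCaptureFirst(e: RawEvent, store: RawEvent[]): void {
  store.push(e);
}

function query(store: RawEvent[], predicate: (e: RawEvent) => boolean): RawEvent[] {
  return store.filter(predicate);
}
```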
The clearest sign that Heap had won an argument is what its competitors eventually did.
In the years that followed, both Mixpanel and Amplitude added autocapture functionality. These were the two platforms that had defined the manual-instrumentation philosophy, that had built their products, their sales pitches, and their developer documentation on the premise that instrumentation was a feature, not a burden.
Mixpanel launched an autocapture feature. Amplitude launched Amplitude Autocapture. PostHog, the open-source analytics platform, built autocapture into its core architecture from day one.
The framing varied. Mixpanel positioned it as a complement to manual instrumentation — a safety net for events you might have missed. Amplitude positioned it as a way to accelerate time-to-insight. PostHog positioned it as democratizing access to product data. But the underlying admission was the same: Heap had been right that capturing everything was valuable, and the market had spoken loudly enough that nobody could afford to ignore it anymore.
This is a particular kind of validation — the kind where your competitors copy your thesis rather than your product. It means the idea was right. It also means the moat is narrower than it looks, because if your philosophy is replicable by the players who had the most to lose from it, the competitive advantage shifts from the idea to the execution.
Heap's response to this competitive pressure was to move upmarket, away from the self-serve analytics buyer and toward the enterprise, where the depth of the retroactive dataset, the breadth of the capture layer, and the integration capabilities with data warehouses and CDPs were more differentiated than a simple autocapture checkbox. It echoed the move Mixpanel itself had made in the mid-2010s, and it carried some of the same risks.
In September 2023, Contentsquare — a French digital experience analytics company, founded in 2012 by Jonathan Cherki, which had grown into one of the most well-funded companies in European tech — announced the acquisition of Heap. The deal was valued at over $200 million.
Contentsquare's core product analyzed how users physically interacted with digital surfaces — where they hovered, how far they scrolled, where their attention went, what frustrated them. It was, in some ways, the visual layer of what Heap captured in the behavioral layer. Contentsquare could tell you that users weren't scrolling past the fold on a particular page; Heap could tell you that users who didn't scroll past that fold had a dramatically lower conversion rate six weeks later. Together, the pitch was a complete picture: not just what users did, but how they moved, where they looked, and what the downstream consequences of their patterns were.
The acquisition represented a particular kind of European ambition: a French company, backed by the SoftBank Vision Fund and others, using capital to acquire American analytics talent and technology as part of a bid to build a global digital analytics platform that could compete with the Googles and Adobes of the world. Heap, at $200 million, was not a small bet.
For Matin Movassate and the team he had built over a decade, the outcome was a validation of a different sort than the competitor imitation: financial, institutional, and strategic. The company that had started with a $2 million seed round arguing that the analytics industry had its philosophy backwards had become, ten years later, a $200 million asset.
The checkout team that discovered their conversion problem would never have to lose those three months of data again. That, in the end, is what the decade was for.
1. The retroactive analysis feature is not actually "time travel" in the way most people imagine — it's more radical than that.
When people hear "retroactive analysis," they picture a product that lets you go back and query past data. That's true but incomplete. What Heap actually built was a system where event definitions themselves are retroactive: you can name and define an event today (call it "high-value-checkout-sequence") and Heap will instantly apply that definition to every interaction it has captured since the snippet was installed, which might be months or years of historical data. You're not traveling back in time to observe behavior. You're inserting a new concept into a complete historical record and watching it illuminate the whole record at once. This is a fundamentally different data architecture than anything Mixpanel or Amplitude had built.
2. The storage infrastructure challenge was Heap's deepest moat — and nobody talked about it.
The product debate between autocapture and manual instrumentation was always framed as a philosophy debate: should you decide what to track upfront? But the real debate was engineering economics. Capturing every click and tap across every user session, storing it durably, indexing it for real-time query, and doing this at enterprise scale without costs that destroy the unit economics of the business — that is an infrastructure problem of a completely different order than what Mixpanel faced. Heap's years of solving this problem were not just R&D; they were the construction of a technical capability that a competitor cannot replicate by simply checking a "capture everything" box.
3. Ravi Parikh worked at Palantir at the same time as Spenser Skates, who would go on to found Amplitude.
Two engineers at one of Silicon Valley's most data-intensive firms, both of whom would leave to build analytics companies, both of whom would end up occupying adjacent positions in the same market. Parikh built Heap; Skates built Amplitude. The companies are philosophically opposite in how they approach data capture. Whether Palantir's culture of total data integration influenced Parikh's autocapture thesis — the idea that you capture the full data environment and then ask questions — is speculative, but the parallel is striking.
4. When Mixpanel and Amplitude added autocapture, Heap had already won the argument but was about to lose the feature advantage.
The imitation was a form of market validation: the philosophy had been proven. But it also compressed Heap's differentiation from "we have autocapture and they don't" to "we have ten years of autocapture infrastructure and they have a checkbox." That is a harder sales story at the top of the market. The pivot upmarket and the eventual Contentsquare acquisition are both, in part, responses to this compression: Heap needed to be about more than autocapture once autocapture became a commodity feature.
5. The Contentsquare acquisition was not a landing, it was a launch — but into a different orbit than the one Heap's founders had originally planned.
Contentsquare's ambition was not to absorb Heap and run it as a product analytics tool. It was to build a complete digital experience intelligence platform by combining Contentsquare's visual behavior layer (heatmaps, session replay, zone-based analysis) with Heap's behavioral data layer (event sequences, funnels, retention, product analytics). The combined pitch — understand not just what users did but how they moved, where they looked, and what the behavioral consequences were downstream — is a category that did not fully exist before the acquisition. Heap did not exit into a larger acquirer's product portfolio. It became the engine of a new category.
Sources: Y Combinator company profile, TechCrunch funding coverage (2013–2019), Heap product documentation (autocapture, retroactive datasets), Crunchbase funding records, Contentsquare acquisition announcement, Heap About page, product analytics industry analysis.