The Recipe Search Engine: The Origin Story of Elasticsearch


Generated by Master Biographer | Source for LinkedIn Content


I. THE HOOK: London, 2004. A Bored Developer. A Stack of Recipes.

London. Winter, 2004.

Shay Banon has nowhere to be.

His wife has just started classes at Le Cordon Bleu, the legendary French culinary school with a campus in the heart of the city. She leaves in the morning with her chef's whites, comes home in the evening smelling of butter and stock reductions, and brings with her a growing collection of recipes — handwritten notes, photocopied pages, torn magazine clippings, printed PDFs. Classic French technique. Modern European variations. Dishes she's adapting, refining, annotating.

The recipes multiply. The organization does not.

Shay is an Israeli software developer in his mid-twenties, living in a city that is not his own, in a country whose language his wife is mastering. He is unemployed — not in a tragic sense, but in the manner of someone between things, with time and a laptop and the kind of unstructured hours that either produce nothing or produce something unexpected. He has no job to go to. He has Apache Lucene documentation to read.

He decides to build his wife a search engine for her recipes.

This is not a startup idea. It is not a pitch. It is not a product vision. It is a man, at a kitchen table probably, trying to help his wife find her coq au vin notes without digging through three notebooks and a shoebox.

What comes out of that afternoon — out of those weeks of tinkering, of reading Lucene docs, of building something for an audience of one, the person who comes home from cooking school — will become the search and analytics backbone for Twitter, GitHub, Wikimedia, Netflix, Uber, and thousands of other companies. It will power the logging infrastructure of the modern internet. It will run the security operations centers of governments and banks. It will become, by any measure, the dominant open-source search engine on the planet.

Shay Banon didn't know any of that. He was just trying to organize some recipes.


II. THE BACKSTORY: An Israeli Developer Alone With Lucene Documentation

To understand what Shay Banon built, you have to understand what Apache Lucene was in 2004.

Lucene is a Java library. A search library. It was created by Doug Cutting — the same person who would later create Hadoop — and donated to the Apache Software Foundation in 2001. Lucene gives you the building blocks for full-text search: an inverted index, a query parser, a scoring engine. It is powerful, low-level, and demanding. It does not hold your hand. It does not come with an interface. It is infrastructure in the purest sense — the kind of thing you build on top of, not the kind of thing most people use directly.
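The core structure Lucene builds, the inverted index, is simple to sketch: instead of mapping each document to the words it contains, it maps each term to the set of documents containing it, so a query becomes a cheap set operation. A toy version in Python (a real analyzer also strips punctuation, normalizes, and stems):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

recipes = {
    1: "coq au vin with pearl onions",
    2: "boeuf bourguignon classic braise",
    3: "pearl onion soup with stock",
}

index = build_inverted_index(recipes)

# An AND query is just an intersection of posting sets.
hits = index["pearl"] & index["onions"]
```

The intersection touches only the postings for the query terms, never the full corpus, which is why the structure scales so well for search.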

Most people, if they wanted to add search functionality to a project, would use something packaged, something simpler, something that didn't require them to understand the mechanics of how inverted indices work. Shay did not do that. He read the documentation. He learned how Lucene indexed text, how it analyzed and tokenized content, how queries were translated into searches across the inverted index. He learned how to work with it, not around it.

The result of that early tinkering was something he called Compass — a Java library designed to make Lucene easier to use. A wrapper, essentially, but a thoughtful one. Compass didn't hide Lucene. It made Lucene accessible. It handled the boilerplate, managed the session lifecycle, simplified the integration with Java applications. If Lucene was the raw engine, Compass was the gearbox — the thing that made the engine usable without requiring you to be an engine designer.

Compass was good. Developers started using it. It wasn't famous, but it had users, and users meant feedback, and feedback meant Shay was learning something that no amount of documentation could teach him: what people actually needed from search, as opposed to what search engines assumed they needed.

What they needed, it turned out, was not a better Java library.

What they needed was something that didn't exist yet.


III. THE GRIND: Rebuilding Everything From Scratch

By 2009, Shay had been living with Lucene for five years. He had been running Compass through its paces, watching how it broke, learning its edges, understanding what the library could not do no matter how cleanly you wrapped it.

The fundamental problem with Compass — with any Lucene wrapper — was architectural. Lucene is not distributed. Lucene is a single-node, single-process library. It runs on one machine. It indexes data on one machine. When your data outgrows that machine, or when your queries need to run faster than one machine allows, Lucene has no answer for you. You are on your own.

The internet of 2009 was not a single-machine problem. Twitter had just launched its firehose. GitHub was growing fast enough that its search infrastructure was already showing strain. Companies were sitting on data sets that no single server could hold. The question everyone in the industry was asking — in different ways, from different angles — was: how do you make search work at scale?

Shay knew the answer involved throwing Compass away.

Not improving it. Not extending it. Throwing it away and rebuilding from the ground up, this time with distribution as the first principle rather than an afterthought.

He wrote the first lines of code for Elasticsearch in 2009. The name itself was part of the vision: a search engine that could scale elastically — add nodes when you needed capacity, remove them when you didn't, distribute the index across a cluster automatically without requiring the operator to manage it manually. It would speak JSON over HTTP. It would have no installation ceremony. You would unzip it and run it, and it would work.

These were not small design choices. Each one was a deliberate rejection of how enterprise search software had always been built. Enterprise search required dedicated administrators. It required proprietary query languages. It required you to understand the internal architecture before you could do anything useful. Elasticsearch assumed you shouldn't have to. It assumed that if search was going to be useful to developers, it needed to feel as simple as an API call.
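What "as simple as an API call" looked like in practice: documents and queries are plain JSON sent over HTTP. A sketch of the request bodies involved (the index name and fields here are invented for illustration; the "match" query shown is Elasticsearch's basic analyzed full-text query):

```python
import json

# Indexing: the document itself is the JSON body, e.g. PUT /recipes/_doc/1
# ("recipes" and its fields are hypothetical examples).
doc = {"title": "Coq au Vin", "cuisine": "French", "notes": "reduce the stock slowly"}

# Searching: POST /recipes/_search with a body in the JSON query DSL.
query = {"query": {"match": {"title": "coq au vin"}}}

# Both travel as plain JSON over HTTP, so curl alone is a working client:
#   curl -XPUT localhost:9200/recipes/_doc/1 \
#        -H 'Content-Type: application/json' -d '{"title": "Coq au Vin"}'
body = json.dumps(query)
```

No proprietary query language, no client SDK required: anything that can speak HTTP and serialize JSON can index and search.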

The first public version launched on February 8, 2010. Shay opened the GitHub repository. He wrote a short README. The tagline, which he would carry forward into everything Elastic would become, was four words: You know, for search.

It was casual to the point of absurdity. It was a phrase someone uses when they're explaining something obvious. It was, depending on how you read it, either the most relaxed product launch in software history or the most self-assured — as if the whole exercise were simply self-evident, as if the need was so clear that the tagline barely needed to be a tagline at all.


IV. THE BREAKTHROUGH: "You Know, For Search"

The open-source release landed on Hacker News. The response was not indifferent.

Developers who had been wrestling with Lucene, who had tried Solr (the other major Lucene-based search engine, which required significantly more configuration and operational overhead), who had been building custom search solutions that satisfied nobody — these developers looked at Elasticsearch and felt something that is rare in infrastructure software: relief.

You unzipped it. You ran it. You pointed it at data. It indexed. It searched. The results came back in JSON, the same format everything in the web ecosystem already spoke. There was no XML ceremony. There was no schema definition requirement. You could throw semi-structured data at it and it would figure out what to index.
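The "figure out what to index" behavior is dynamic mapping: Elasticsearch inspects the JSON values it receives and infers a field type for each one. A simplified sketch of that inference (the real rules also handle dates, arrays, geo points, and nested objects):

```python
def infer_mapping(doc):
    """Guess a field type from each JSON value, roughly as dynamic mapping does."""
    mapping = {}
    for field, value in doc.items():
        if isinstance(value, bool):       # bool must be checked before int in Python
            mapping[field] = "boolean"
        elif isinstance(value, int):
            mapping[field] = "long"
        elif isinstance(value, float):
            mapping[field] = "double"
        elif isinstance(value, str):
            mapping[field] = "text"       # real ES also adds a "keyword" sub-field
        elif isinstance(value, dict):
            mapping[field] = {"properties": infer_mapping(value)}
    return mapping

m = infer_mapping({"title": "Tarte Tatin", "servings": 6, "vegetarian": True})
```

The first document to arrive shapes the index; nobody has to write a schema up front.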

The distributed layer was real, not marketing. Elasticsearch shipped with an automatic shard allocation system that let a cluster balance its index across multiple nodes. Adding a node was not a migration event. The cluster reorganized itself. Removing a node was not a failure. The cluster redistributed the shards that node had been holding. It was genuinely elastic in the way the name promised.
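The rebalancing idea can be sketched as a toy allocator: spread the shards evenly across whatever nodes are present, and recompute when the node list changes. (The real allocator also manages replicas, disk thresholds, and allocation awareness; this is only the shape of the idea.)

```python
def allocate(shards, nodes):
    """Round-robin shards across nodes so counts differ by at most one."""
    assignment = {node: [] for node in nodes}
    for i, shard in enumerate(shards):
        assignment[nodes[i % len(nodes)]].append(shard)
    return assignment

shards = [f"shard-{i}" for i in range(6)]

# Three nodes: two shards each.
before = allocate(shards, ["node-a", "node-b", "node-c"])

# node-c leaves the cluster: the same six shards spread over two nodes, three each.
after = allocate(shards, ["node-a", "node-b"])
```

The operator's view of both events is the same: the cluster converges on a balanced layout on its own.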

The developer community started using Elasticsearch for things Shay had not anticipated.

The recipe search engine had become a logging engine. Developers were pointing log aggregators at it, indexing server logs, querying them in near-real-time to diagnose production incidents. A search engine for text was, it turned out, an excellent tool for searching through log lines — because log lines are text, because production problems require fast queries, because the distributed architecture meant you could index enormous volumes of log data across a cluster without any single machine becoming a bottleneck.

GitHub moved its code search to Elasticsearch. The ability to search across millions of repositories, in real-time, for a specific function call or a specific string — this was a hard problem, and Elasticsearch solved it.

Twitter used it for search over the firehose. Wikipedia used it to power search across all of its language editions. SoundCloud indexed its audio metadata through it. Stack Overflow ran its question search on it.

The recipe search engine was powering the internet.


V. THE CO-FOUNDERS: Amsterdam, 2012

By 2011, Shay understood that Elasticsearch needed to be more than a project. It needed a company.

Not because he wanted a company. But because the software was being used at scales that required more than one person to support, and because the users who were building their businesses on top of it needed more than a GitHub repository and a mailing list.

The people he found were not accidental.

Steven Schuurman was a Dutch entrepreneur with deep enterprise software experience, someone who understood how to build a company, how to sell software, how to take a developer tool and make it into something that a procurement department would write a check for. He would become CEO.

Uri Boness was a Dutch software engineer who had been close to the Compass project, who understood the codebase at the level required to help build it into something commercial.

Simon Willnauer was an Apache Lucene core committer — someone who had been contributing to the very library that Elasticsearch was built on since 2006. He brought with him not just deep technical knowledge of how Lucene worked internally, but also standing in the Apache community, relationships with the people who maintained the foundations Elasticsearch depended on.

The four of them incorporated Elasticsearch BV in Amsterdam in 2012. The choice of Amsterdam was not incidental — it reflected the genuinely international nature of the company from day one. Shay was Israeli. Steven and Uri were Dutch. Simon was German. The company had users across every continent. It made more sense to exist in Europe than to perform the ritual of moving to San Francisco.

The Series A came quickly: a $10 million round led by Benchmark. The announcement validated what the developer community already knew: Elasticsearch was not a toy. It was infrastructure.


VI. THE ELK STACK: An Acronym That Became an Industry Standard

A search engine is only as useful as the data you put into it.

This was the problem that gave rise to Logstash.

Jordan Sissel had been building Logstash as a side project — an open-source log collection and processing pipeline that could take data from almost any source, transform it, and ship it to an output. He had built it for himself, to solve his own log management problems, and released it because that is what you do when you work in the open-source ecosystem. It was powerful, extensible, and free.

The natural output for Logstash was Elasticsearch. The combination was obvious: Logstash collects and parses your logs, Elasticsearch indexes and stores them, and you can query them in real-time. The stack solved a problem that every company with servers had: how do you make sense of what's happening in your infrastructure?
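The Logstash half of that combination is, at its core, parse and forward: pull a raw log line apart into structured fields, then ship the resulting JSON document to Elasticsearch. A minimal sketch of the parsing step in Python (the regex stands in for Logstash's grok patterns; the format is a typical access-log line):

```python
import re

# Roughly the shape of an Apache/Nginx access log line.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<bytes>\d+)'
)

def parse_line(line):
    """Turn one raw access-log line into a structured event dict, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])  # numeric fields become queryable ranges
    return event

event = parse_line('10.0.0.1 - - [08/Feb/2010:09:00:00 +0000] "GET /recipes HTTP/1.1" 200 512')
# The event dict would then be POSTed to an Elasticsearch index as JSON.
```

Once the fields are structured, queries like "all 5xx responses in the last five minutes" become ordinary search queries rather than grep jobs.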

In 2013, Elastic acquired Logstash. Jordan Sissel joined the company.

But a search engine and an ingestion pipeline still left something missing: a way to look at the data visually, to build dashboards, to turn log lines and metrics into the kind of visualization that a human could look at during an incident and immediately understand.

That was Kibana. Rashid Khan built it as an open-source project specifically designed to visualize data stored in Elasticsearch. An analytics dashboard on top of a search engine. The same year Logstash joined the company, Elastic also acquired Kibana.

The ELK Stack — Elasticsearch, Logstash, Kibana — became the standard logging and observability infrastructure for the modern web. If you ran servers, you probably ran the ELK stack. If you had a security operations center, you probably ran the ELK stack. If you needed to understand what was happening inside your infrastructure in real-time, you ran the ELK stack.

Every major cloud provider eventually built managed versions. AWS offered it. Azure offered it. Google Cloud offered it. Companies that had once built their own logging infrastructure replaced it with the ELK stack because the ELK stack was better, and it was free, and it was already where the developer community had gathered.


VII. THE AFTERMATH: From Recipes to a $15B Public Company

Elastic filed for its IPO in September 2018. The ticker: ESTC. The exchange: the New York Stock Exchange.

The prospectus told a story that was, by then, hard to argue with. Elasticsearch was running on 325,000 servers across the world. It was being downloaded millions of times per month. The ELK stack was the de facto standard for log management and observability at scale. Companies like Netflix, Uber, Goldman Sachs, and the US Department of Defense were running their operational infrastructure on software that had started as a recipe search engine built in a London flat.

The stock opened at $70 per share, well above the offering price of $36. The market capitalization on day one exceeded $4.5 billion. At its peak, Elastic would be valued at over $15 billion.

Shay Banon's wife, if she has ever looked up what became of that recipe app he was building while she learned to make boeuf bourguignon, has presumably had a moment of quiet astonishment.

The software that indexes her recipes also indexes the logs of the systems that power Netflix's recommendation engine. Also indexes the threat intelligence data that security teams use to detect intrusions in real-time. Also powers the site search on thousands of e-commerce sites, the code search on GitHub, the full-text search across millions of Wikipedia articles in dozens of languages.

This is the unlikely trajectory of infrastructure software: it starts as something personal and small, built for one person and one problem, and if the abstraction is right — if the tool solves the right class of problem at the right level of generality — it escapes its origin completely. The recipe app becomes a category. The category becomes the default. The default becomes the assumption.

Shay Banon once explained his path into computers by noting that he used to count job postings in newspapers — watching which skills were in demand, trying to triangulate the future. It is a method that looks at the world not through a lens of what exists, but through a lens of what is needed.

He found a need in a London kitchen. He built for that kitchen. The need, it turned out, was everywhere.


TIMELINE REFERENCE

Year Event
2004 Shay Banon, unemployed in London, builds a recipe search engine for his wife using Apache Lucene
2004-2008 Compass — a Java library to simplify Lucene — released and developed publicly
2009 Shay writes the first lines of Elasticsearch code; rebuilds from scratch with distribution as core design principle
Feb 8, 2010 Elasticsearch open-sourced on GitHub; tagline: "You know, for search"
2012 Elasticsearch BV incorporated in Amsterdam by Shay Banon, Steven Schuurman, Uri Boness, Simon Willnauer
2012 Series A: $10M led by Benchmark
2013 Logstash (Jordan Sissel) and Kibana (Rashid Khan) acquired; ELK Stack named
2014 $70M Series C led by NEA
2015 Company rebranded as Elastic
Oct 2018 NYSE IPO (ESTC); opens at $70/share, $4.5B+ market cap
Peak Valued at over $15 billion
2024 Elasticsearch returns to open source (AGPL) after licensing dispute with AWS

KEY CHARACTERS

Shay Banon — Israeli developer, creator of Compass and Elasticsearch, Founder and CTO of Elastic. Built the first version for his wife while unemployed in London. GitHub handle: kimchy.

Steven Schuurman — Dutch entrepreneur, co-founder and CEO of Elastic. The person who helped commercialize what Shay had built.

Uri Boness — Dutch software engineer, co-founder of Elastic. Close to the Compass codebase from early days.

Simon Willnauer — German engineer, Apache Lucene core committer since 2006, co-founder of Elastic. Brought Lucene credibility and community standing to the founding team.

Jordan Sissel — Creator of Logstash; joined Elastic via acquisition in 2013. The 'L' in ELK.

Rashid Khan — Creator of Kibana; joined Elastic via acquisition in 2013. The 'K' in ELK.

Doug Cutting — Created Lucene in 1999 and donated it to the Apache Software Foundation in 2001. The foundation everything was built on.


CONTENT ANGLES FOR LINKEDIN

For Ruben / Operator lens:
- The "started for one person, scaled to millions" arc — how the best infrastructure is built from a personal need, not a market map
- The ELK Stack story as a masterclass in ecosystem building: acquire the pipelines, not just the engine
- Open source as a distribution strategy: Elasticsearch got to scale faster because it was free

For Alexis / CTO lens:
- The Compass-to-Elasticsearch decision: when you throw away your own code to build what the problem actually requires
- REST + JSON as a design philosophy: Elasticsearch won partly because it was easier to integrate than Solr
- Distributed-first architecture: the technical bet that made everything else possible

For Nacho / GTM lens:
- The tagline "You know, for search" as positioning genius — confidence disguised as casualness
- The ELK Stack becoming a standard without a sales team: what that distribution model looked like
- The IPO story: from a recipe app to ESTC on the NYSE

Competitor reference:
- Solr (also Lucene-based, older, more configuration-heavy — Elasticsearch won by being simpler to start)
- OpenSearch (AWS fork after the licensing dispute; the cautionary tale about what happens when hyperscalers build on your open source)
