Our Investment in ReadySet

By Lenny Pruss and Natalie Vais

Nobody wants to use a slow app. And while this sentiment isn’t new, it has (in many ways) never been harder to actually build a highly performant application. Apps are getting more and more dynamic, with users around the globe making requests to highly interactive UIs at all times of the day and night. If you’re a developer tasked with improving application performance, you’re probably not loving life.

When it comes to scaling throughput and optimizing latency for applications, the data tier is most often the bottleneck. Databases lie at the heart of virtually every system and manage the most precious part of your application. Moreover, data has gravity (if you haven’t heard), which makes changes particularly painful in this layer of the stack.

There are essentially three ways to scale your data tier:

  1. Scale up (increase capacity of existing machines)
  2. Scale out (replicate or distribute data across more machines)
  3. Optimize performance of existing system components (e.g., rewrite, tune, or cache)

Caching is one of the best-known techniques to reduce the load on your data tier and improve scalability and performance. Building a cache essentially means temporarily storing some data (whether that’s an image, a video, or the result of a SQL query) close to your users, although this is a great oversimplification, as the developers reading this can attest. Regardless, caching has become so commonplace that nearly every application of reasonable scale relies on caching infrastructure.

Done right, caching is one of the most powerful tools in building scalable, distributed data systems. However, as many platform teams can testify, it’s also one of the most burdensome and error-prone — software engineers frequently underestimate the complexity and quickly become entangled in a messy web of custom logic and legacy self-hosted systems. In fact, caching has put such a strain on engineering teams that a common adage has emerged in developer circles:

“There are only two hard things in Computer Science: cache invalidation and naming things.”

Teams typically implement caching in at least one of three ways [see the graphic below]:

  1. leverage an in-memory database cache (like Redis or Memcached),
  2. replicate the database entirely to create a read-only copy, known as a ‘read replica’, for your web application to query, or
  3. write complex ad-hoc caching logic directly into the application.

Not only are these approaches time-consuming, they are also costly: teams incur significant engineering overhead and maintenance costs as a result of the operational complexity. Meanwhile, applications become brittle, and caching failures can result in hours of downtime.
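To make the burden concrete, here is a minimal sketch of the third approach from the list above: hand-written cache-aside logic. Everything in it is a hypothetical stand-in (two plain dictionaries play the database and the in-memory cache, and `read_user` and `write_user` are illustrative names), so treat it as a toy rather than anyone’s production code.

```python
# Hypothetical stand-ins: one dict plays the database, another plays the cache.
database = {"user:1": "Ada"}
cache = {}


def read_user(key):
    """Cache-aside read: serve from the cache, fall back to the database on a miss."""
    if key in cache:
        return cache[key]
    value = database[key]  # miss: query the source of truth
    cache[key] = value     # populate the cache for later reads
    return value


def write_user(key, value, invalidate=True):
    """Write to the database; invalidating the cache is the caller's job."""
    database[key] = value
    if invalidate:
        cache.pop(key, None)


read_user("user:1")                              # first read warms the cache
write_user("user:1", "Grace", invalidate=False)  # a write path that forgets to invalidate
stale = read_user("user:1")                      # cache still holds the old value
write_user("user:1", "Grace")                    # the same write, with invalidation
fresh = read_user("user:1")                      # miss, then refilled from the database
```

The single forgotten invalidation is enough to serve stale data, and a real application has many write paths that each need to get this right.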

There has to be a better way!

In 2019, we came across a research paper out of MIT introducing a new methodology for scaling and optimizing web applications that captured our imagination. This project, called Noria, hinges on a core innovation around partially stateful dataflow that adapts on the fly in response to query changes. Noria’s dataflow engine keeps only a subset of state in memory, fetching missing data on demand so your database views are always up to date. This allows for a small memory footprint and blazingly fast performance. Noria seeks to provide incrementally updated materialized views for your application; in essence, it puts your caching layer on autopilot.
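For readers who want a feel for the idea, the toy below sketches an incrementally updated view with partial state. It is our own simplified illustration, not Noria’s or ReadySet’s implementation: the `VoteCountView` class, its method names, and the vote-counting example are all assumptions made for the sketch.

```python
class VoteCountView:
    """Toy view of votes per article that materializes only the keys that get read."""

    def __init__(self):
        self.base_votes = {}  # base table: article_id -> list of user_ids
        self.counts = {}      # partial state: counts only for keys read so far

    def insert_vote(self, article_id, user_id):
        self.base_votes.setdefault(article_id, []).append(user_id)
        if article_id in self.counts:
            # Key is materialized: apply the change incrementally.
            self.counts[article_id] += 1
        # Otherwise nothing is cached for this key, so there is nothing to update.

    def read(self, article_id):
        if article_id not in self.counts:
            # Missing state: compute it on demand from the base data.
            self.counts[article_id] = len(self.base_votes.get(article_id, []))
        return self.counts[article_id]

    def evict(self, article_id):
        """Drop a key to save memory; a later read refills it."""
        self.counts.pop(article_id, None)


view = VoteCountView()
view.insert_vote("a1", "u1")
view.insert_vote("a1", "u2")
first_read = view.read("a1")   # filled on demand from the base table
view.insert_vote("a1", "u3")   # updates the materialized count in place
second_read = view.read("a1")  # served from the view, no recomputation
```

Writes keep the materialized entries current without any application-side invalidation, and entries nobody reads never take up memory.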

We were then fortunate to get in contact with Alana Marzoev and Jon Gjengset, two of the key contributors to Noria. We were blown away by their technical insight, maturity, and vision. We closely followed their progress, and in the early fall of 2020, Amplify made a small initial investment in the team to form ReadySet. After a few short months of working alongside them and being extremely impressed with the traction and execution of the ReadySet team, we quickly followed with a full Series Seed investment in early 2021.

ReadySet now seeks to build on Noria’s foundation, fully removing the need for developers to write ad-hoc caching logic or maintain costly in-memory cache systems. ReadySet provides a data distribution layer that plugs directly into your database. Companies can seamlessly scale out reads for applications and do more with limited resources due to ReadySet’s highly efficient architecture.

Ultimately, you can think of ReadySet as a CDN for your database – a distributed SQL caching layer for the internet – that fundamentally changes the economics of building always-on, fast, global applications.

So today, we are thrilled to announce our investment and participation in ReadySet’s Series A, led by our friends at Index Ventures. We believe ReadySet is building a foundational new layer in the web infra stack and we couldn’t be more excited to be part of their journey.

Welcome (officially) to the Amplify family!