February – A Unique Introduction

It was late February, and I had spent the past several weeks learning everything I could about the distributed database ecosystem. As an intern on the Investment Team at 8VC, I had gotten the chance to work on technical due diligence under Partner & CTO Bhaskar Ghosh, taking a multitude of pitches every week and constantly shifting contexts. One of the companies speaking with us was Yugabyte, and I was soon consumed by the world of NewSQL – battling architectures (Spanner v. Calvin), language variants (Postgres v. MySQL), and cloud providers. Thanks to Bhaskar sharing his wealth of operational knowledge and experience in the space, I had just enough high-level knowledge to follow the conversations we were having with co-founders Karthik and Kannan. It quickly became clear to me that they are world-class engineers and entrepreneurs, as well as some of the few people with the background and grit to make a company in this space succeed. Part of that belief came from the stories I heard about their time at Oracle and leading HBase at Facebook, but their real strengths and passion came across in our conversations about the competitive landscape, go-to-market strategy, and bottom-up developer evangelism.

Over the next couple of weeks, we penned many long write-ups on the future of open source databases, growing OLTP budgets, and the value of transactional consistency at scale. Even water-cooler conversation began to center on Yugabyte. My other work faded into the background as we burnt the midnight oil to make the partnership happen, and by the time the process had come to a close, I had decided that I wanted to spend some time doing hands-on engineering at Yugabyte.

April – Interviewing at Yugabyte

I hadn’t written much code during my time at 8VC, so I spent some time brushing up on the work I’d done preparing for interviews eight months prior. I was initially concerned about questions on database implementation (my limited knowledge of the space was purely self-taught), but those worries soon proved unfounded, as the interviews covered typical data structures and algorithms.

To me, this was a great opportunity to get to know two of the engineering leaders at Yugabyte and get a better understanding of the organizational structure of the engineering team. I soon learned that the structure of the engineering organization mirrored that of Yugabyte’s architecture, with the additional flexibility of being able to do work across layers.

My first interviewer, Bogdan, leads the DocDB team at Yugabyte. For those who may be less familiar, DocDB is the core data storage layer that is responsible for ensuring transactional consistency, sharding tables, guaranteeing high performance, and replicating data. It is a distributed key-value store that heavily modifies and extends the popular RocksDB storage engine out of Facebook. This conversation reminded me of many of the reasons why I was initially so intrigued by what Yugabyte had built – qualities such as auto-scaling of deployments and ensuring ACID transactions at scale. These are difficult guarantees in a traditional leader-follower setting, let alone a distributed one.

My second interview was with Neha, who leads the YQL team at Yugabyte. The team is responsible for everything related to the query layer, including planning, pushdowns and other optimizations, and execution. Yugabyte has been able to adopt the entire top half of PostgreSQL into their codebase, and that’s part of what makes it so interesting. Not only do they get Postgres compatibility “out of the box,” but they also have a pluggable query layer with YCQL (Cassandra-compatible), YSQL (Postgres-compatible), and YEDIS (Redis-compatible, not in active development). This allows Yugabyte to fit into multiple different workloads and markets and attract users from both the SQL and NoSQL worlds – another core strength that I had identified a few months before.

[Figure: YugabyteDB architecture, showing the Postgres-compatible query layer and the distributed document store layer]

Outside of the core, fully open source database, there is plenty of other work to be done. Yugabyte has a team responsible for both the self-managed DBaaS platform (Yugabyte Platform) and the fully-managed solution (Yugabyte Cloud, currently in Beta). This touched on another important secular movement – as enterprises move to the cloud, it’s vital that software vendors stay cloud-agnostic and support a variety of deployments, including those that are hybrid and/or multi-cloud. It was cool to see Yugabyte help prevent cloud vendor lock-in and open themselves up to a wide potential customer base by supporting deployments on GCP, Azure, and AWS. In my time at Yugabyte, I felt that it was important to learn about Kubernetes deployments and deployments in the public clouds, even if I didn’t get the chance to touch any code in the space.


My Time as a Yugabyte Software Engineering Intern