Cache-it – Episode #2 – Indexing adventures in the age of embeddings: Building a world-class search system
Discover various index methods used in modern search engines.
Summary
In episode 2 of the Cache-it podcast, Khawaja invites Manju Rajashkhar, VP of Engineering at Etsy, to discuss the topic of indexing in search systems. They kick off the conversation by discussing traditional indexing methods, such as the forward and reverse indexes, and then dive into the concept of an inverted index, which is used in modern search engines like Lucene and Elasticsearch.
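To make the forward versus inverted index distinction concrete, here is a minimal sketch in Python. It is not taken from the episode and it is not how Lucene or Elasticsearch are implemented; production engines add text analysis, term positions, compression, and relevance scoring on top of this idea. The sample documents and the function names (`tokenize`, `build_forward_index`, `build_inverted_index`, `search`) are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def tokenize(text):
    # Naive tokenizer (an assumption for this sketch): lowercase, split on whitespace.
    return text.lower().split()


def build_forward_index(docs):
    # Forward index: document ID -> list of terms in that document.
    return {doc_id: tokenize(text) for doc_id, text in docs.items()}


def build_inverted_index(forward_index):
    # Inverted index: term -> set of document IDs that contain the term.
    inverted = defaultdict(set)
    for doc_id, terms in forward_index.items():
        for term in terms:
            inverted[term].add(doc_id)
    return inverted


def search(inverted, query):
    # AND query: return the documents that contain every query term.
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [inverted.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample listings, invented for this example.
docs = {
    1: "handmade ceramic mug",
    2: "ceramic plant pot",
    3: "handmade wool scarf",
}

forward = build_forward_index(docs)
inverted = build_inverted_index(forward)

print(search(inverted, "handmade ceramic"))
```

The forward index answers "which terms are in this document?", while the inverted index answers "which documents contain this term?". The second question is the one a keyword query asks, which is why the lookup in `search` only touches the postings for the query terms instead of scanning every document.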
Manju gives valuable insights into how to choose the right type of indexing based on the nature of the query and highlights the importance of constant self-learning for the search system. The episode concludes with a preview of the talk Manju gave at MoCon 2023 in Seattle, where he shared his expertise in designing flexible search systems for the modern world of indexing.
Make sure to subscribe to the Cache-it channel on YouTube so you never miss an episode.
About Manju Rajashkhar
Manju is the Vice President of Engineering at Etsy, where he leads two groups: Machine Learning Enablement and Personalization Engine. Manju is a customer-focused and product-centric leader and, over his seven-year tenure at Etsy, has led multiple groups, including Search and Machine Learning, Ads, Recommendation, Knowledge Base, and Core Buyer Experiences. Prior to Etsy, Manju was the Co-founder and CTO of Blackbird AI, a search and deep learning company, which Etsy acquired. In addition, Manju was part of Twitter's early engineering team and led the building and scaling of caching systems as Twitter grew from 50M to 200M active users. Much of the caching software he built to handle Twitter's scale was open sourced. Two popular open-source systems, twemproxy and twemcache, are still actively used in the industry to build scalable distributed caching. Manju has a graduate degree in computer science from Stanford with a specialization in systems and algorithms.