Building a Semantic Book Search: Scale an Embedding Pipeline with Apache Spark and AWS EMR Serverless

Using OpenAI’s CLIP model to support natural language search on a collection of 70k book covers

In a previous post I built a small proof of concept (PoC) to see whether I could use OpenAI’s CLIP model to build a semantic book search. It worked surprisingly well, in my opinion, but I couldn’t help wondering whether it would be better with more data. The previous version used only about 3.5k books, but there are millions in the…