Delivering a hyper-personalized product and optimizing campaigns in real time require Clinch to process a high volume of data fast. According to Yaron Cohen, Vice President of Research and Development at Clinch, the company tracks hundreds of millions of anonymous web users, who generate 1 billion events each day. Based on a user’s activity and profile, Clinch models intent and then builds and serves ads that are most appropriate to the user at that moment. Its entire infrastructure runs on Amazon Web Services (AWS) and has done so since the firm was founded in 2012.
Initially, Clinch streamed all events into a single NoSQL data store, where its teams could query the raw data or aggregate it into various states for use by Clinch’s products. As the company grew to its current size, this data store became a bottleneck.
Cohen and his colleagues looked at various data-streaming technologies, but no solution was exactly right. Cohen says, “Every technology we tested involved fixing together a lot of code, so each change we made required significant manual work. We figured we’d need a team of developers just to make these changes, perhaps to build an automation layer that supported the entire data pipeline for our product teams.”
Beyond simply moving Clinch’s data from source to target, Upsolver removes much of the manual work involved in data transformation, including partitioning, merging small files, and converting raw data into usable columnar file formats such as Apache Parquet. Besides saving time through automation, Clinch avoids the cost and hassle of testing its own data transformations and fixing the bugs that would inevitably arise.
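To make those three transformation steps concrete, here is a minimal sketch in plain Python. It is illustrative only and is not Upsolver's implementation or API: the function names, the event fields (`user`, `ts`, `type`), and the use of in-memory lists to stand in for files are all assumptions. The last step only pivots rows into a column-oriented layout, which is the idea behind formats like Apache Parquet; writing actual Parquet files would need a dedicated library.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(event):
    """Return a date partition such as '2021-03-01' from an epoch-seconds timestamp."""
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    return ts.strftime("%Y-%m-%d")


def partition_events(events):
    """Group raw events by date partition (hypothetical event shape: dict with a 'ts' key)."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_key(event)].append(event)
    return dict(partitions)


def compact(small_files, target_rows):
    """Merge many small files (modeled as lists of rows) into files of up to target_rows rows."""
    merged, current = [], []
    for rows in small_files:
        for row in rows:
            current.append(row)
            if len(current) == target_rows:
                merged.append(current)
                current = []
    if current:
        # Keep the final, partially filled file rather than dropping its rows.
        merged.append(current)
    return merged


def to_columnar(rows):
    """Pivot row-oriented records into one list per column.

    Assumes every row has the same keys; this mirrors the layout that
    columnar formats store, but does not produce a Parquet file.
    """
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)
```

For example, passing a day's worth of small event batches through `partition_events`, then `compact`, then `to_columnar` yields fewer, larger, column-oriented batches per date, which is the kind of layout that makes a data lake cheaper to query.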
Says Cohen, “If we had used a product like Apache Spark, we’d have developers writing Scala code to create transformations, manage clusters, and orchestrate workflows to make sure jobs run efficiently and on time. Upsolver automates the majority of that for us, which saves us money, but also allows us to provide a better service to customers.”
Because changes to its data pipeline are now easier to make, Clinch has roughly doubled the number of features it makes available to clients each month since it began working with Upsolver. “Ad tech is a competitive industry. We compete on speed, but Upsolver also helps us deliver new features that bring better ROI for customers,” says Cohen.
Upsolver’s automations move incoming data into Clinch’s data lake within about 2 minutes, so end users see more relevant ads based on fresher data. Cohen says, “Now we can perform more optimizations, measure the performance of each algorithm in real time, and adjust it as needed. It has made a real difference in our ability to innovate for our clients.”