Hi, I'm Evan 👋

Compliance Data Strategist & Aspiring Machine Learning Engineer

Evan Jones Portfolio

Sephora Vector Database Benchmarking

“Times Square Sephora” by m01229 is licensed under CC BY 2.0.

As part of a class on database preparation and management at Northwestern, I built a benchmark comparing three vector databases on 90,000 Sephora product reviews.

I ultimately recommended MongoDB for its expansive functionality, easy setup, and accessible documentation. It was the fastest at reading records and ranked in the middle of the three at creating them, and those two operations matter most for a product review engine.
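For readers curious how timings like these can be gathered, here is a minimal sketch of a create/read/update/delete timing harness that averages each operation over three trials. It is not the code used for this project: `InMemoryStore` is a hypothetical stand-in for a real database client, the review records are synthetic placeholders, and the function names are my own for illustration.

```python
import time
from statistics import mean


class InMemoryStore:
    """Hypothetical stand-in for a database client, used only so the sketch runs."""

    def __init__(self):
        self.rows = {}

    def create(self, records):
        for rec in records:
            self.rows[rec["id"]] = dict(rec)

    def read(self, ids):
        return [self.rows[i] for i in ids]

    def update(self, ids, field, value):
        for i in ids:
            self.rows[i][field] = value

    def delete(self, ids):
        for i in ids:
            del self.rows[i]


def timed(fn, *args):
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def run_trial(store, records):
    """Time one create/read/update/delete pass over the records."""
    ids = [r["id"] for r in records]
    return {
        "create": timed(store.create, records),
        "read": timed(store.read, ids),
        "update": timed(store.update, ids, "rating", 5),
        "delete": timed(store.delete, ids),
    }


def benchmark(make_store, records, trials=3):
    """Run several trials on fresh stores and average each operation's time."""
    results = [run_trial(make_store(), records) for _ in range(trials)]
    return {op: mean(t[op] for t in results) for op in results[0]}


if __name__ == "__main__":
    # Synthetic placeholder reviews, not the actual Sephora review data.
    reviews = [{"id": i, "rating": 4, "text": f"review {i}"} for i in range(1000)]
    averages = benchmark(InMemoryStore, reviews)
    for op, seconds in averages.items():
        print(f"{op}: {seconds:.6f} s")
```

Swapping in a real database would mean replacing `InMemoryStore` with a thin wrapper around that database's client that exposes the same four methods. Each trial starts from a fresh store so that one trial's leftover data does not skew the next.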

Below are the results compiled from three trials per database:



Trial question results for each DB

Word cloud of Sephora product reviews

Average create time across 3 trials

Average delete, update, read time across 3 trials

Create time in each trial

Delete time in each trial

Read time in each trial

Update time in each trial