Deduplicating 401,000 Equipment Auction Records with LLM Calibration
📰 Dev.to · benzsevern
We ran GoldenMatch on 401,125 bulldozer auction records from Kaggle. Iterative LLM calibration learned the optimal match threshold from just 200 pairs (~$0.01). ANN hybrid blocking recovered 949 records that string blocking missed.
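As a rough illustration of what learning a match threshold from a small labeled sample can look like, here is a minimal Python sketch. It is not GoldenMatch's API or its actual calibration procedure: the function names, the F1 objective, and the toy scores are all assumptions made for illustration, and the match/non-match labels that an LLM would supply are hard-coded.

```python
from typing import Sequence


def f1_at(threshold: float, scores: Sequence[float], labels: Sequence[bool]) -> float:
    """F1 score when every pair scoring >= threshold is predicted a match."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def calibrate_threshold(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Pick the observed score that maximizes F1 on the labeled sample.

    Hypothetical helper: ties are broken in favor of the higher threshold.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: (f1_at(t, scores, labels), t))


# Toy sample: similarity scores for six candidate pairs, with the
# match / non-match verdicts hard-coded in place of LLM judgments.
sample_scores = [0.95, 0.9, 0.8, 0.6, 0.4, 0.2]
sample_labels = [True, True, True, False, False, False]

best = calibrate_threshold(sample_scores, sample_labels)
print(f"calibrated threshold: {best}")
```

An iterative variant could, under the same assumptions, sample additional pairs whose scores fall near the current threshold, label them, and recalibrate until the threshold stops moving.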