Wednesday 11:45
in Platinum3
- A description of the problem of duplicate records and their impact on businesses
- An overview of the proposed solution
- How to use GenAI models and techniques to identify potential duplicate records
- Step 1: identifying the columns to match on
- Step 2: creating embedding vectors for these columns
- Step 3: creating match clusters
- Step 4: presenting those clusters to users, who can then choose what to do with the duplicates
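
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's implementation: the sample records, the column names, the `embed` helper, and the 0.8 threshold are all hypothetical. In particular, `embed` is a simple character-trigram stand-in so that the example runs without external dependencies; a real pipeline would presumably call a GenAI embedding model at that point.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "name": "Jon Smith", "city": "London"},
    {"id": 2, "name": "John Smith", "city": "London"},
    {"id": 3, "name": "Maria Garcia", "city": "Madrid"},
    {"id": 4, "name": "Maria Garcia", "city": "Madrid"},
    {"id": 5, "name": "Wei Chen", "city": "Leeds"},
]

# Step 1: choose the columns to match on.
MATCH_COLUMNS = ["name", "city"]


def embed(text):
    """Step 2 (stand-in): a sparse, L2-normalised character-trigram vector.

    Assumption: a real system would replace this with a call to an
    embedding model; trigram counts are used only to keep the sketch
    self-contained.
    """
    padded = f"  {text.lower()} "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {gram: c / norm for gram, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())


def find_clusters(records, columns, threshold=0.8):
    """Step 3: link records whose similarity meets the threshold (union-find)."""
    vectors = [embed(" ".join(str(r[c]) for c in columns)) for r in records]
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = defaultdict(list)
    for idx, record in enumerate(records):
        groups[find(idx)].append(record["id"])
    # Only groups with more than one record are potential duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Step 4: present each cluster so that a user can decide what to do with it.
clusters = find_clusters(RECORDS, MATCH_COLUMNS)
for ids in clusters:
    print("Possible duplicates:", ids)
```

The all-pairs comparison is quadratic in the number of records, which is fine for a sketch but would likely need blocking or an approximate nearest-neighbour index at scale. The threshold trades missed duplicates against false matches, so it would need tuning on real data.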
Ian Ormesher
Ian Ormesher is a seasoned full-stack Data Scientist with a robust background in training and deploying AI models in production environments. Over a career spanning more than four decades, he has honed his skills in Machine Learning, Deep Neural Networks, Reinforcement Learning, and Computer Vision. He is proficient in a wide array of programming languages and data analysis tools, and has a proven track record of implementing data-oriented solutions in the Cloud.