Techno Blender

The Intuition Behind the Use of Expectation Maximisation to Train Record Linkage Models | by Robin Linacre | Oct, 2022

How unsupervised learning is used to estimate model parameters in Splink

Photo by Suzanne D. Williams on Unsplash

Splink is a free probabilistic record linkage library that predicts the likelihood that two records refer to the same entity. For example, what is the probability that the following two records match?

[Figure: Example pairwise record comparison]

The underlying statistical model is called the Fellegi-Sunter model. It works by computing partial match weights, which are a measure of the importance of the different columns in…
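
To make the idea of a partial match weight concrete, here is a minimal Python sketch. It assumes the usual Fellegi-Sunter formulation, which the excerpt above is cut off before stating: each comparison outcome has an m probability (chance of seeing that outcome among true matches) and a u probability (chance of seeing it among non-matches), and its partial match weight is log2(m / u). The function names and all numeric values are hypothetical, chosen only for illustration; this is not Splink's API.

```python
import math
from typing import Sequence


def partial_match_weight(m: float, u: float) -> float:
    """Log2 of the Bayes factor m / u for one comparison outcome.

    m: P(outcome | records are a match)      (illustrative input)
    u: P(outcome | records are not a match)  (illustrative input)
    """
    return math.log2(m / u)


def match_probability(prior: float, weights: Sequence[float]) -> float:
    """Combine a prior match probability with partial match weights.

    The prior is converted to a log2-odds weight, the partial weights
    are added to it, and the total is converted back to a probability.
    """
    prior_weight = math.log2(prior / (1.0 - prior))
    total_weight = prior_weight + sum(weights)
    odds = 2.0 ** total_weight
    return odds / (1.0 + odds)


# Hypothetical m and u values for three columns of one record pair.
weights = [
    partial_match_weight(m=0.9, u=0.01),  # first name agrees
    partial_match_weight(m=0.9, u=0.05),  # surname agrees
    partial_match_weight(m=0.1, u=0.9),   # city disagrees
]

# Hypothetical prior: 1 in 1,000 candidate pairs is a true match.
probability = match_probability(prior=0.001, weights=weights)
print([round(w, 2) for w in weights], round(probability, 4))
```

A positive weight means the outcome is evidence for a match, a negative weight is evidence against, and a weight of zero (m equal to u) carries no information. In practice the m and u values are not known in advance, which is where an estimation procedure such as Expectation Maximisation comes in.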