edu.uky.kcr.recordlinkage.engine
Interface LinkageEngine

All Known Implementing Classes:
AbstractLinkageEngine, ExactMatchLinkageEngine

public interface LinkageEngine

An instance of a LinkageEngine is responsible for comparing records from two data sources to determine whether any of them are the same. This interface encapsulates most of the work involved in a linkage operation: implementations apply deterministic or probabilistic algorithms to compare field values from records in both the primary and secondary data sources.

Implementers of the LinkageEngine interface should extend the AbstractLinkageEngine convenience class instead of implementing LinkageEngine directly.

Author:
ihands

Method Summary
 LinkageResultSet findLinkedRecords(LinkageDataSource primaryDataSource, LinkageDataSource secondaryDataSource)
          Primary workhorse method for a linkage operation. This is where deterministic and probabilistic methods are used to match records.
 java.lang.String getName()
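          Returns the name of the LinkageEngine as specified by the "name" attribute in a LinkageConfiguration file.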
 void initialize(LinkageConfiguration linkageConfiguration)
          This method is called immediately after the LinkageEngine is created, before findLinkedRecords(LinkageDataSource, LinkageDataSource) is called.
 

Method Detail

initialize

void initialize(LinkageConfiguration linkageConfiguration)
This method is called immediately after the LinkageEngine is created, before findLinkedRecords(LinkageDataSource, LinkageDataSource) is called.

Parameters:
linkageConfiguration - Configuration object containing the BlockingConfiguration, MatchingConfiguration, and cutoff scores necessary for a LinkageMatch to be determined by this engine.
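The call order described above (initialize first, then link) can be illustrated with a minimal sketch. SketchConfiguration and SketchEngine are hypothetical stand-ins, not the library's LinkageConfiguration or LinkageEngine; the accessor names and the assumption that there are two cutoffs (one for positive matches, one for indeterminate matches) are illustrative only.

```java
import java.util.Objects;

// Hypothetical stand-in for LinkageConfiguration. Only cutoff scores are
// modeled; the accessor names are assumptions, not the library's API.
final class SketchConfiguration {
    private final double positiveCutoff;
    private final double indeterminateCutoff;

    SketchConfiguration(double positiveCutoff, double indeterminateCutoff) {
        if (indeterminateCutoff > positiveCutoff) {
            throw new IllegalArgumentException("indeterminate cutoff must not exceed positive cutoff");
        }
        this.positiveCutoff = positiveCutoff;
        this.indeterminateCutoff = indeterminateCutoff;
    }

    double getPositiveCutoff() { return positiveCutoff; }
    double getIndeterminateCutoff() { return indeterminateCutoff; }
}

// Hypothetical engine that captures its configuration in initialize()
// and refuses to do any work before that call has happened.
final class SketchEngine {
    private SketchConfiguration configuration;

    void initialize(SketchConfiguration linkageConfiguration) {
        this.configuration = Objects.requireNonNull(linkageConfiguration, "linkageConfiguration");
    }

    // Classifies a comparison score using the cutoffs stored by initialize().
    String classify(double score) {
        if (configuration == null) {
            throw new IllegalStateException("initialize() must be called first");
        }
        if (score >= configuration.getPositiveCutoff()) return "POSITIVE";
        if (score >= configuration.getIndeterminateCutoff()) return "INDETERMINATE";
        return "NON_MATCH";
    }
}

final class InitializeLifecycleDemo {
    public static void main(String[] args) {
        SketchEngine engine = new SketchEngine();
        engine.initialize(new SketchConfiguration(0.9, 0.6));
        System.out.println(engine.classify(0.95));
        System.out.println(engine.classify(0.70));
        System.out.println(engine.classify(0.20));
    }
}
```

Storing the configuration in a field and failing fast when it is absent is one way to make the required call order explicit; an actual engine may handle this differently.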

findLinkedRecords

LinkageResultSet findLinkedRecords(LinkageDataSource primaryDataSource,
                                   LinkageDataSource secondaryDataSource)
                                   throws LinkageException
Primary workhorse method for a linkage operation. This is where deterministic and probabilistic methods are used to match records.

Parameters:
primaryDataSource - A source of DataSourceRecords to be linked, typically the larger data set.
secondaryDataSource - A second source of DataSourceRecords to be linked, typically the smaller data set.
Returns:
An instance of a LinkageResultSet containing the complete list of positive and indeterminate LinkageMatch objects, as well as any unmatched records.
Throws:
LinkageException
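As a rough illustration of the deterministic case, the sketch below links two in-memory record lists by exact agreement on a set of fields. It is not the library's ExactMatchLinkageEngine: records are modeled as plain maps, and ExactMatchSketch, Pair, Result, keyOf, and link are hypothetical names introduced for this example. It indexes the secondary list, on the assumption stated above that it is typically the smaller data set, and reports only unmatched primary records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class ExactMatchSketch {

    // Hypothetical pair of records judged to be the same.
    static final class Pair {
        final Map<String, String> primary;
        final Map<String, String> secondary;

        Pair(Map<String, String> primary, Map<String, String> secondary) {
            this.primary = primary;
            this.secondary = secondary;
        }
    }

    // Hypothetical result holder: matched pairs plus primary records with no match.
    static final class Result {
        final List<Pair> matches = new ArrayList<>();
        final List<Map<String, String>> unmatchedPrimary = new ArrayList<>();
    }

    // Builds a normalized comparison key from the chosen fields.
    // Returns null when a field is missing, so incomplete records never match.
    static String keyOf(Map<String, String> record, List<String> fields) {
        StringBuilder key = new StringBuilder();
        for (String field : fields) {
            String value = record.get(field);
            if (value == null) {
                return null;
            }
            key.append(value.trim().toLowerCase(Locale.ROOT)).append('|');
        }
        return key.toString();
    }

    static Result link(List<Map<String, String>> primary,
                       List<Map<String, String>> secondary,
                       List<String> fields) {
        // Index the secondary records by key so each primary record needs one lookup.
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> record : secondary) {
            String key = keyOf(record, fields);
            if (key != null) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
            }
        }

        Result result = new Result();
        for (Map<String, String> record : primary) {
            String key = keyOf(record, fields);
            List<Map<String, String>> candidates = (key == null) ? null : index.get(key);
            if (candidates == null) {
                result.unmatchedPrimary.add(record);
                continue;
            }
            for (Map<String, String> candidate : candidates) {
                result.matches.add(new Pair(record, candidate));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> primary = List.of(
                Map.of("last", "Smith", "dob", "1970-01-01"),
                Map.of("last", "Jones", "dob", "1985-05-05"));
        List<Map<String, String>> secondary = List.of(
                Map.of("last", "smith ", "dob", "1970-01-01"));

        Result result = link(primary, secondary, List.of("last", "dob"));
        // prints "1 match(es), 1 unmatched"
        System.out.println(result.matches.size() + " match(es), "
                + result.unmatchedPrimary.size() + " unmatched");
    }
}
```

A probabilistic engine would replace the all-or-nothing key comparison with a per-field score compared against the configured cutoff scores; that variant is not shown here.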

getName

java.lang.String getName()
Returns:
the name of the LinkageEngine as specified by the "name" attribute in a LinkageConfiguration file