Technology intelligently woven to deliver the most elegant solution. Not just because it is hot.
We believe in using the best tools to solve business problems, and we make our architecture choices by carefully optimising for:
- highest accuracy
- runtime performance
- ease of use
- applicability to multiple domains
- fast configuration
Reifier’s proprietary AI engine learns string similarity from data and generalizes it to deduce the optimal fuzzy matching rules for any domain and language. Reifier can therefore provide AI-based master data management in any domain and language. If there is no training data, the Reifier Interactive Learner samples the data and lets the user mark some pairs as matching or non-matching. Typically, about 20-50 pairs are all that is needed, and even support staff can mark them easily. Check how it works here.
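To make the idea of learning from a handful of marked pairs concrete, here is a minimal sketch in Python. It is not Reifier's proprietary engine. It assumes a generic character-level similarity from the standard library's `difflib` and learns only a single cut-off score from the labelled pairs. The function names and sample pairs are hypothetical, chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after simple normalisation."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def learn_threshold(labelled_pairs):
    """Pick the cut-off that classifies the most labelled pairs correctly.

    labelled_pairs: iterable of (left, right, label) tuples, where label is
    True for a matching pair and False for a non-matching pair.
    """
    scored = [(similarity(a, b), label) for a, b, label in labelled_pairs]
    best_threshold, best_correct = 1.0, -1
    # Only the observed scores need to be tried as candidate cut-offs.
    for threshold in sorted({score for score, _ in scored}):
        correct = sum((score >= threshold) == label for score, label in scored)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical pairs, as a user might mark them in an interactive session.
labelled = [
    ("Acme Corporation", "ACME Corp.", True),
    ("Jon Smith", "John Smith", True),
    ("Globex Ltd", "Globex Limited", True),
    ("Acme Corporation", "Apex Industries", False),
    ("Jon Smith", "Jane Doe", False),
    ("Globex Ltd", "Initech LLC", False),
]

threshold = learn_threshold(labelled)


def is_match(a: str, b: str) -> bool:
    """Apply the learned cut-off to an unseen pair."""
    return similarity(a, b) >= threshold
```

A real system would learn far richer rules than one threshold, but the workflow has the same shape: mark a few pairs, fit a model, apply it to unseen records.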
Reifier uses Spark for distributed entity resolution, deduplication and record linkage. Computing in memory lets Reifier iterate very quickly over different permutations of possible record linkages to generate the best AI model for multi-domain MDM. Our proprietary machine learning algorithms sit on top of Spark, along with Apache Cassandra and Elastic, to provide high-performance master data management, entity resolution and fuzzy data matching with a scale-out distributed architecture.
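As a rough illustration of what deduplication involves, the sketch below compares every pair of records on a single machine and groups the ones that look alike. It is an assumption-laden toy, not Reifier's implementation. It uses `difflib` for similarity, a fixed threshold of 0.8, and a hypothetical `deduplicate` helper. The all-pairs loop grows quadratically with the number of records, which is the kind of workload that benefits from being distributed.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def deduplicate(records, threshold=0.8):
    """Group records whose pairwise similarity reaches the threshold.

    Uses a union-find structure so that matches are transitive:
    if A matches B and B matches C, all three land in one cluster.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs comparison: O(n^2) similarity computations.
    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i, record in enumerate(records):
        clusters.setdefault(find(i), []).append(record)
    return list(clusters.values())


# Hypothetical sample records.
records = ["John Smith", "Jon Smith", "Acme Corp", "ACME Corp.", "Jane Doe"]
clusters = deduplicate(records)
```

With these sample records, the two spellings of each name end up in the same cluster and "Jane Doe" stays on its own.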
Reifier has been covered at major international conferences on data analytics and AI.
Some of the features of Reifier's fuzzy record matching and deduplication technology are:
- Reifier can generalize from training samples to provide very high data matching accuracy
- As no hand coding of rules is needed, deployment of fuzzy matching is blazing fast
- Developers can concentrate on business logic instead of figuring out record matching algorithms
- Reifier can be used in multiple domains with different fields
- Reifier is language-agnostic and can match data in Chinese, English, Japanese, Thai and other languages
- Scale-out architecture: run on a single machine or a full-blown Spark cluster for deduplication and record linkage
Depending on data volumes, Reifier can simply be run on a standalone machine with the Spark libraries if your organization has not adopted big data yet. A standalone machine is good for up to a few million records. If you already have a Spark cluster, plan to deploy one, or need our help to do so, let us know.