WHITE PAPER
AI-Based Matching for Legal Entity Mastering


Machine learning (ML) and artificial intelligence (AI) now appear in a wide range of applications and in many aspects of our lives. Every financial institution stores multiple security master files from a multitude of vendors. Ensuring data quality, basing decisions on best-in-class data sets, verifying their accuracy, and being able to impose overrides for proprietary company data sets together form one of the most challenging problems firms face today. This white paper describes how we use ML and AI to improve the quality of legal entity (LE) matching for the investment banking industry.


West Highland’s (WHSS) Legal Entity Management (LEM) service, delivered via our Intelligent Data Platform (IDP), combines multiple data vendors into a single aggregated feed. It gives our clients REST API access to multiple data points relating to pre-matched public and private entities from around the world.


The data fields are defined according to client requirements and are typically ingested into client applications or customer relationship management (CRM) systems. IDP also provides an intuitive administrative GUI front end for direct access to the data, enabling firms to manually override or adjust values as the need arises.

The matching is performed using our sophisticated machine learning algorithm, allowing us to accurately match and dynamically link fields from traditional or alternate data sources. Each legal entity is assigned a unique WHSS ID.


Different vendors represent public and private LEs in different ways, from the name of the LE to the entity type, country of domicile and other attributes. The data is prone to errors and to differences in name, company legal structure and other fields, and some vendors even include additional comments in the name field. Because not every variation in attributes can be anticipated, a fixed rules-based or fuzzy-logic approach does not always work reliably. An AI solution learns to identify matched entities in a more general way and is better at handling unforeseen situations.
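To illustrate why fixed rules are brittle, the following minimal Python sketch compares an exact-match rule and a simple fuzzy score on two vendor-style name pairs. It uses difflib from the Python standard library purely for illustration. The function names and the example entity names are hypothetical, and this is not the matching logic used in IDP.

```python
from difflib import SequenceMatcher


def exact_rule(a: str, b: str) -> bool:
    """Fixed rule: names match only if equal after trimming and lower-casing."""
    return a.strip().lower() == b.strip().lower()


def fuzzy_score(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical vendor records: the second pair carries a vendor comment
# inside the name field, which a fixed rule cannot anticipate.
pairs = [
    ("Acme Holdings Inc", "acme holdings inc "),
    ("Acme Holdings Inc", "ACME HOLDINGS, INC. (see note on merger)"),
]

for left, right in pairs:
    print(left, "|", right, "|", exact_rule(left, right),
          round(fuzzy_score(left, right), 2))
```

Any fixed cut-off on the fuzzy score has to be chosen in advance, so a vendor comment that is longer than expected can push a true match below it. This is the kind of case a learned model is intended to generalize over.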

We also use AI to identify possible errors, which we analyze and feed back to the individual vendors for correction, ensuring higher-quality reference data for our customers.


The AI is developed using TensorFlow, an industry-leading and widely used framework. It is open source, was developed by Google, and has become a favorite AI tool of some of the top global companies. IDP is cloud based and therefore highly scalable.


We cleanse, normalize, and then convert into vectors all relevant attributes that allow an LE to be uniquely identified, using the latest ML techniques and our proprietary algorithms. We also generate pseudo-random data to create large training data sets. These data sets not only ensure a generalized solution but also ensure that we have sufficient examples of all required attribute combinations.
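The general shape of this preparation step can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the proprietary WHSS algorithms: the suffix list, the hashed character-trigram encoding, the vector size of 64 and the noise rules in make_variant are all hypothetical choices.

```python
import hashlib
import math
import random
import re

# Hypothetical list of legal-form tokens; a real list would be much longer.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "ltd", "limited", "llc", "plc"}


def normalize_name(name: str) -> str:
    """Lower-case, replace punctuation with spaces, drop trailing legal forms."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def vectorize(text: str, dim: int = 64) -> list:
    """Hash character trigrams into a fixed-length, L2-normalized vector."""
    padded = f"  {text} "
    vec = [0.0] * dim
    for i in range(len(padded) - 2):
        digest = hashlib.md5(padded[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def make_variant(name: str, rng: random.Random) -> str:
    """Create a pseudo-random noisy variant of a name for training data."""
    variant = name
    if len(variant) > 3:
        # Swap two adjacent characters to mimic a typing error.
        i = rng.randrange(len(variant) - 1)
        variant = variant[:i] + variant[i + 1] + variant[i] + variant[i + 2:]
    # Append a legal form or a vendor-style comment.
    return variant + " " + rng.choice(["Inc.", "Ltd", "LLC", "(formerly listed)"])


rng = random.Random(42)  # seeded so the generated data is reproducible
original = "Acme Holdings"
noisy = make_variant(original, rng)
print(noisy, "->", normalize_name(noisy))
print(len(vectorize(normalize_name(original))))
```

Pairs of an original name and its generated variants can serve as positive training examples, and pairs of unrelated names as negative ones. In practice the other identifying attributes, such as entity type and country of domicile, would be encoded alongside the name.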

The vectors are then presented to a deep learning neural network for training and verification. Once a robust and highly accurate solution is reached, the network is further tested against never-before-seen data and ultimately deployed for production use.

The neural network is monitored for accuracy and is retrained as needed, e.g., when data sets are added or change significantly.
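One simple way to express such monitoring is to compare the model's link decisions with a manually verified sample and flag retraining when precision or recall drops below a floor. The Python sketch below assumes this approach. The function names and the floor values of 0.98 and 0.95 are hypothetical and are not figures used by WHSS.

```python
def precision_recall(predicted: list, actual: list) -> tuple:
    """Precision and recall of predicted links against verified links."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def needs_retraining(predicted: list, actual: list,
                     min_precision: float = 0.98,
                     min_recall: float = 0.95) -> bool:
    """True when either metric falls below its (assumed) floor."""
    precision, recall = precision_recall(predicted, actual)
    return precision < min_precision or recall < min_recall


# Hypothetical verified sample: True means "these two records are one LE".
predicted = [True, True, False, True]
actual = [True, False, True, True]
print(precision_recall(predicted, actual), needs_retraining(predicted, actual))
```

Tracking precision and recall separately is useful here because a false link between two distinct entities and a missed link between two records of the same entity carry different costs.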


IDP is designed from the ground up for scalability and ease of expansion, so new data fields and vendor sources can be onboarded quickly and, depending on complexity, at low cost. Additional API calls can also be added rapidly as needed.

WHSS subscribes to daily vendor feed updates (delta files). These are processed, and changed entities are re-matched and made available in the production feed before the start of day.
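The daily delta step can be pictured as applying the changed records to the stored vendor data and collecting the entities that need to be matched again. The Python sketch below assumes a delta record layout with "action", "vendor_id" and "attributes" keys. This layout and the function name are hypothetical, since actual vendor delta formats vary.

```python
def apply_delta(master: dict, delta: list) -> set:
    """Apply delta records in place; return vendor IDs that need re-matching."""
    to_rematch = set()
    for record in delta:
        vendor_id = record["vendor_id"]
        if record["action"] == "delete":
            # Only flag deletions of entities that were actually stored.
            if master.pop(vendor_id, None) is not None:
                to_rematch.add(vendor_id)
        elif master.get(vendor_id) != record["attributes"]:
            # New entity, or an existing one whose attributes changed.
            master[vendor_id] = record["attributes"]
            to_rematch.add(vendor_id)
    return to_rematch


# Hypothetical stored vendor data and one day's delta file.
master = {"V1": {"name": "Acme Holdings"}, "V2": {"name": "Beta Corp"}}
delta = [
    {"action": "update", "vendor_id": "V1", "attributes": {"name": "Acme Holdings Ltd"}},
    {"action": "update", "vendor_id": "V2", "attributes": {"name": "Beta Corp"}},
    {"action": "add", "vendor_id": "V3", "attributes": {"name": "Gamma LLC"}},
]
print(sorted(apply_delta(master, delta)))
```

Restricting the re-match to entities whose attributes actually changed keeps the daily run small enough to finish before the start of day.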

There are a number of administrative functions available to manage IDP, which WHSS can perform on behalf of the client. These include:

  • LEs can be manually linked or unlinked.
  • AI matching thresholds can be defined and changed.
  • AI-linked LEs can be overridden.
  • Possible vendor errors identified by the AI can be highlighted for analysis.
  • Changes to client LEs can be electronically communicated back to vendors on behalf of the client.
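How configurable thresholds and manual overrides might interact can be sketched in a few lines of Python. The class name, the two threshold values and the three outcome labels are hypothetical, and the sketch assumes that the model produces a match score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MatchPolicy:
    link_threshold: float = 0.90    # at or above: link automatically
    review_threshold: float = 0.60  # at or above, but below link: flag for analysis


def decide(score: float, policy: MatchPolicy,
           override: Optional[bool] = None) -> str:
    """Return 'linked', 'review' or 'unlinked' for a candidate pair of LEs."""
    if override is not None:
        # A manual link or unlink always takes precedence over the AI score.
        return "linked" if override else "unlinked"
    if score >= policy.link_threshold:
        return "linked"
    if score >= policy.review_threshold:
        return "review"
    return "unlinked"


policy = MatchPolicy()
print(decide(0.95, policy), decide(0.70, policy), decide(0.95, policy, override=False))
```

Raising link_threshold trades fewer automatic links for fewer false links, which is the kind of adjustment a client-specific threshold setting allows.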

West Highland’s ML-based matching solution represents the next generation of LE mastering capability, providing the client with higher accuracy as well as more seamless integration of alternate and less well-structured data sources.


West Highland Support Services is a vendor-agnostic managed service provider, recognized globally as an industry authority and thought partner for market data, referential and professional services for over 20 years.

West Highland is referred to as “The Gold Standard” for managing, installing and architecting robust market data platforms. We continue to lead the industry with innovative tools and services that increase visibility, manage capacity and maximize uptime while reducing overall operating expense.