Legal Entity Data Aggregation and Analysis System
We turn chaos from 8+ sources into a unified knowledge base for banks, investors, and regulators.

The problem we solve
Working with open company data is a challenge: sources contradict each other, information becomes outdated, and data errors lead to financial and reputational risks.
Our system consolidates 8+ fragmented sources (XML, JSON, CSV) into a structured knowledge base.
Banks, investors, and regulators use it to obtain statements with accurate data about a company, its finances, founders, and brands.
How it works
For a national-scale project, we implemented:
Data Import
Automatic ingestion from sources with different formats and APIs, including processing of weakly structured files.
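As a rough illustration of multi-format ingestion, the sketch below reads JSON, CSV, and XML payloads into one common record shape and tags each record with its origin. The field names (reg_id, name), the source labels, and the XML layout are assumptions made for the example, not the system's actual schema.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_json(text):
    """Parse a JSON array of company objects."""
    return [dict(item) for item in json.loads(text)]


def parse_csv(text):
    """Parse CSV with a header row into one dict per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_xml(text):
    """Parse <companies><company>...</company></companies> into dicts."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in company}
        for company in root.iter("company")
    ]


# One parser per supported format (hypothetical registry for this sketch).
PARSERS = {"json": parse_json, "csv": parse_csv, "xml": parse_xml}


def ingest(fmt, text, source):
    """Dispatch on format and tag every record with the source it came from."""
    records = PARSERS[fmt](text)
    for record in records:
        record["source"] = source
    return records


json_rows = ingest("json", '[{"reg_id": "1001", "name": "Alfa LLC"}]', "registry_a")
csv_rows = ingest("csv", "reg_id,name\n1002,Beta JSC\n", "registry_b")
xml_rows = ingest(
    "xml",
    "<companies><company><reg_id>1003</reg_id><name>Gamma</name></company></companies>",
    "registry_c",
)
```

Keeping the source label on every record is what makes later conflict resolution possible, since the merge step needs to know where each value originated.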
Data Cleansing and Merging
Rule-based and classical ML algorithms correct duplicates, typos, and invalid dates or values. Records are matched across sources even when they contradict each other, for example when the most recently published record is already out of date.
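One simple rule-based way to resolve contradictions is to merge records field by field and let the most recently updated valid record win. The sketch below assumes hypothetical fields (reg_id, name, region, updated) and a single rule; it does not cover the ML-based matching mentioned above.

```python
from datetime import date


def parse_date(value):
    """Return a date, or None for missing or malformed values."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def normalize_name(name):
    """Collapse whitespace and case so trivial variants compare equal."""
    return " ".join(name.split()).upper()


def merge_records(records, today):
    """Merge records that share reg_id, field by field.

    For each field, the value from the most recently updated record wins.
    Records with a missing, malformed, or future date rank last.
    """
    merged = {}
    for rec in records:
        updated = parse_date(rec.get("updated"))
        if updated is None or updated > today:
            updated = date.min  # untrusted date: lowest priority
        company = merged.setdefault(rec["reg_id"], {})
        for field, value in rec.items():
            if field in ("reg_id", "updated") or value in (None, ""):
                continue
            if field == "name":
                value = normalize_name(value)
            if field not in company or updated > company[field][1]:
                company[field] = (value, updated)
    # Drop the bookkeeping dates and keep only the winning values.
    return {
        reg_id: {field: value for field, (value, _) in fields.items()}
        for reg_id, fields in merged.items()
    }


records = [
    {"reg_id": "1001", "name": "alfa  llc", "region": "North", "updated": "2023-05-01"},
    {"reg_id": "1001", "name": "Alfa LLC", "region": "South", "updated": "2024-02-10"},
    {"reg_id": "1001", "name": "Alfa", "region": "East", "updated": "2024-13-45"},
]
dossier = merge_records(records, today=date(2024, 6, 1))
```

Here the third record carries an impossible date, so its values never override the other two; the two spellings of the name collapse into one after normalization.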
Statement Generation
Unified company dossiers with current information on finances, founders, brands, and historical changes.
Advanced Analytics Capabilities
Find companies using a flexible filtering system by combining more than 50 different parameters, including industry, region, financial indicators, ownership structure, and more. The system can also find similar companies based on a selected set of characteristics, providing a powerful tool for market and competitor analysis.
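A minimal sketch of the two ideas, combined filters and similar-company lookup, might look like this. The company fields, the filter format, and the choice of cosine similarity are assumptions for illustration; numeric features are assumed to be already scaled to a comparable range.

```python
import math


def matches(company, filters):
    """True if the company satisfies every filter.

    A filter value is either an exact value or a (low, high) tuple for a range.
    """
    for field, cond in filters.items():
        value = company.get(field)
        if isinstance(cond, tuple):
            low, high = cond
            if value is None or not (low <= value <= high):
                return False
        elif value != cond:
            return False
    return True


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(target, companies, features, top_n=3):
    """Names of the companies closest to target on the chosen features."""
    target_vec = [target[f] for f in features]
    scored = [
        (cosine(target_vec, [c[f] for f in features]), c["name"])
        for c in companies
        if c["name"] != target["name"]
    ]
    return [name for _, name in sorted(scored, reverse=True)[:top_n]]


companies = [
    {"name": "Alfa", "industry": "retail", "region": "North", "revenue": 0.9, "staff": 0.8},
    {"name": "Beta", "industry": "retail", "region": "North", "revenue": 0.85, "staff": 0.75},
    {"name": "Gamma", "industry": "retail", "region": "South", "revenue": 0.1, "staff": 0.9},
    {"name": "Delta", "industry": "it", "region": "North", "revenue": 0.5, "staff": 0.1},
]

filters = {"industry": "retail", "revenue": (0.5, 1.0)}
found = [c["name"] for c in companies if matches(c, filters)]
similar = most_similar(companies[0], companies, ["revenue", "staff"], top_n=1)
```

The set of features passed to most_similar plays the role of the "selected set of characteristics": changing it changes which companies count as similar.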
Technologies
Benefits
Data is cleaned and verified before entering the system, so you make decisions based on facts rather than assumptions.
Search across 28M+ companies takes less than one second thanks to Elasticsearch.
Identify company relationships, track changes in financial indicators, and build forecasts.
Redundancy and automatic scaling of resources to the current load guarantee 24/7 operation even during peak demand.
