
Legal Entity Data Aggregation and Analysis System

We turn chaos from 8+ sources into a unified knowledge base for banks, investors, and regulators.

The problem we solve

Working with open company data is a challenge: sources contradict each other, information becomes outdated, and data errors lead to financial and reputational risks.

Our system turns chaos from 8+ fragmented sources (XML, JSON, CSV) into a structured knowledge base.

It is a tool for banks, investors, and regulators in which every statement contains accurate data about a company: its finances, founders, and brands.

How it works

For a national-scale project, we implemented:

Data Import

Automatic ingestion from sources with different formats and APIs, including processing of weakly structured files.
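As a rough illustration of what format-independent ingestion can look like, the sketch below reads JSON, CSV, and XML payloads into one common record shape using only the Python standard library. The field names (`reg_number`, `name`), the XML layout, and the `ingest` function are assumptions made for the example, not the project's actual schema or code.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_json(text):
    """Parse a JSON array of company objects."""
    return [dict(item) for item in json.loads(text)]


def parse_csv(text):
    """Parse CSV text that starts with a header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_xml(text):
    """Parse <companies><company><field>value</field>...</company></companies>."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in company}
        for company in root.iter("company")
    ]


# One parser per source format; every parser returns a list of dicts.
PARSERS = {"json": parse_json, "csv": parse_csv, "xml": parse_xml}


def ingest(source_name, fmt, text):
    """Return records in one common shape, tagged with the source they came from."""
    return [
        {
            "source": source_name,
            "reg_number": str(r.get("reg_number", "")).strip(),
            "name": str(r.get("name", "")).strip(),
        }
        for r in PARSERS[fmt](text)
    ]
```

Keeping the source name on every record lets later cleansing steps weigh sources differently when they disagree.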

Data Cleansing and Merging

Error-correction algorithms based on rules and classical ML models handle duplicates, typos, and incorrect dates and other values. Records are matched across sources even when the information is contradictory, for example when the most recent record is already out of date.
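One simple rule of this kind can be sketched as follows. It assumes each source reports both the date a fact took effect and the date the record was received, and it prefers the later effective date, so a freshly received but stale record does not overwrite a better one. The `Observation` type, the `SOURCE_PRIORITY` ranking, and the sample values are hypothetical and only illustrate the idea; the production rules and ML models are not described on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    source: str
    value: str
    effective: date  # when the fact became true, according to the source
    fetched: date    # when the record was received


# Hypothetical trust ranking used only to break ties; higher wins.
SOURCE_PRIORITY = {"registry": 2, "aggregator": 1}


def resolve(observations):
    """Pick the value with the latest effective date; break ties by source trust."""
    best = max(
        observations,
        key=lambda o: (o.effective, SOURCE_PRIORITY.get(o.source, 0)),
    )
    return best.value


observations = [
    Observation("registry", "12 Main St", date(2024, 3, 1), date(2024, 3, 2)),
    # Received later, but describes an older state of the company.
    Observation("aggregator", "7 Old Rd", date(2022, 1, 1), date(2024, 6, 1)),
]
print(resolve(observations))  # prints "12 Main St"
```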

Statement Generation

Unified company dossiers with current information on finances, founders, brands, and historical changes.

Advanced Analytics Capabilities

Find companies using a flexible filtering system by combining more than 50 different parameters, including industry, region, financial indicators, ownership structure, and more. The system can also find similar companies based on a selected set of characteristics, providing a powerful tool for market and competitor analysis.
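To make the idea of combinable filters and similar-company search concrete, here is a minimal in-memory sketch. The field names, the sample companies, and the similarity measure (the share of selected fields on which two companies agree) are assumptions for illustration; the real system works at a much larger scale and may rank similarity differently.

```python
def matches(company, filters):
    """True if the company satisfies every field predicate in `filters`."""
    return all(pred(company.get(field)) for field, pred in filters.items())


def search(companies, filters):
    """Return the companies that pass all filters."""
    return [c for c in companies if matches(c, filters)]


def similarity(a, b, fields):
    """Share of the selected fields on which two companies have the same value."""
    same = sum(1 for f in fields if a.get(f) is not None and a.get(f) == b.get(f))
    return same / len(fields)


def most_similar(target, companies, fields, top=3):
    """Rank the other companies by similarity to `target`, best first."""
    others = [c for c in companies if c is not target]
    return sorted(others, key=lambda c: similarity(target, c, fields), reverse=True)[:top]


# Made-up sample data.
companies = [
    {"name": "A", "industry": "retail", "region": "north", "revenue": 5_000_000},
    {"name": "B", "industry": "retail", "region": "south", "revenue": 200_000},
    {"name": "C", "industry": "retail", "region": "north", "revenue": 3_000_000},
    {"name": "D", "industry": "mining", "region": "south", "revenue": 9_000_000},
]

filters = {
    "industry": lambda v: v == "retail",
    "revenue": lambda v: v is not None and v >= 1_000_000,
}
found = search(companies, filters)
peers = most_similar(companies[0], companies, ["industry", "region"], top=1)
```

Here `found` holds companies A and C, and `peers` holds company C, which shares both industry and region with A.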

Technologies

Laravel, Octane, PostgreSQL, Redis, ElasticSearch, Vue 3, Nuxt3, Python, MongoDB, Apache Superset, Docker clusters with redundancy, GitLab CI/CD, Grafana, Sentry

Benefits

Reliability

Data is cleaned and verified before entering the system, so you make decisions based on facts rather than assumptions.

Speed

Search across 28M+ companies takes less than one second thanks to ElasticSearch.

Depth of Analysis

Identify company relationships, track changes in financial indicators, and build forecasts.

Dependability

Redundancy and automatic scaling of resources to the current load guarantee 24/7 operation even during peak demand.