Database optimization
Audit and optimization of the database and OLAP queries for the marketing analytics system
TECH STACK
MariaDB | ClickHouse | Python 3
AUDIT & OPTIMIZATION
7bits
There is a SaaS technology for collecting, analyzing, and optimizing advertising campaigns.

The customer asked us to optimize long queries to the database, that affected the user's experience with the service.

To solve this problem, we adapted existing queries, moved existing data to the ClickHouse database, and created a script for importing data from MariaDB. Thanks to this, our team was able to significantly increase the speed of obtaining data used in displaying statistics in the interface of the system.

Задачи
Audit of MariaDB data storage (~2TB) and page queries

Migration of existing data to ClickHouse

Optimization of page requests to reduce the time to retrieve data collected over 6 month to 5 seconds

Implementation of the script for incremental importing from MariaDB to ClickHouse

Performing speed tests on the new DB

Giving recommendations for the future optimizations of data flow
The graph shows the exectution time of the main queries for statistics retrieving.

Queries for obtaining statistical data are complex because they involve aggregating a large amount of data, which leads to long execution of such queries.

As you can see on the graph, the duration of some requests reached almost 7 minutes!

The service used MariaDB as a DBMS.
Before optimization
We analyzed the existing queries and decided to modify the database structure and move the data to a more suitable DBMS.

For statistical data, we decided to use Clickhouse, which allowed us to significantly speed up the process of obtaining them.

The graph shows the time in seconds for the same queries. We managed to reduce the average request time to 1 second. The time of the longest request was reduced from 7 minutes to 4.5 seconds.
After optimization
Thanks to the use of the new database structure and Clickhouse as a DBMS, our team was able to significantly increase the speed of obtaining data used for displaying statistics in the interface of the system.
Look at our other projects