Iceberg Table Analyzer

Improve Query Performance, Reduce Compute Costs

This free, open-source Iceberg Table Analyzer quickly analyzes your existing lakehouse and identifies Iceberg tables that can be optimized.

Get Iceberg Optimization Insights

Like any modern data platform, a lakehouse requires ongoing maintenance and optimization to deliver the best query performance, reduce storage costs, and comply with data privacy regulations. Using the open-source Upsolver Iceberg Table Analyzer, you can quickly scan and analyze existing Iceberg tables to identify potential areas for improvement.

The analyzer helps you identify the following:

Full Scan Overhead

This measures the time it takes a query engine such as Trino to read the entire table. Improving this metric can lead to faster queries, especially those that need to scan and process large portions of the dataset or all of it.

Worst Partition Scan Overhead

This is the time it takes to scan the worst-performing partition of the table. Partitions with a high scan overhead can be a bottleneck for queries that target specific segments of data.

Total File Count

This is the number of files in the table. A large number of files requires significantly more metadata, which slows query planning and lowers perceived query performance.

Worst Partition File Count

This shows the number of files in the most bloated partition. A large number of files in a partition can cause slower performance due to increased overhead in managing those files during query execution.

Average File Size

This is the average size of the files in the table. Small files can lead to a “small file problem,” where the overhead of opening and closing files dominates query execution time and degrades overall query performance.

Total Table Size

This represents the total amount of data stored in the table. While this metric doesn’t directly affect performance, it gives a sense of the scale of the data and potential storage costs.
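
To make the file-based metrics above concrete, here is a minimal Python sketch of how they relate to one another. It is not the analyzer's actual implementation: the `DataFile` record and the `summarize` function are hypothetical names, and the input is assumed to be a plain list of (partition, file size) entries that you have already extracted from a table's metadata. The two scan overhead metrics are left out because they depend on timings from a query engine.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    # Hypothetical record: one entry per data file in the table.
    partition: str      # assumed partition key, e.g. "dt=2024-01-01"
    size_bytes: int     # size of the data file in bytes


def summarize(files):
    """Compute the file-based metrics for a list of DataFile records."""
    if not files:
        raise ValueError("no data files to summarize")

    # Count files per partition to find the most bloated one.
    per_partition = Counter(f.partition for f in files)
    worst_partition, worst_count = per_partition.most_common(1)[0]

    total_size = sum(f.size_bytes for f in files)
    return {
        "total_file_count": len(files),
        "worst_partition": worst_partition,
        "worst_partition_file_count": worst_count,
        "average_file_size_bytes": total_size / len(files),
        "total_table_size_bytes": total_size,
    }
```

A low average file size combined with a high file count in one partition is the pattern these metrics are meant to surface, since that partition is a likely candidate for compaction.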
