Resources, Guides and Best Practices

Amazon Athena

Learn everything you need to get started with Amazon Athena, or discover new best practices that can help you improve performance and reduce costs. Browse through our library of blogs, ebooks, webinars and videos to find answers to your most pressing questions about Athena and the related ecosystem.



What is Amazon Athena?

Amazon Athena is a serverless, interactive query service that reads data directly from Amazon S3 object storage. It is based on the open-source Presto query engine, but is offered exclusively as a managed service by Amazon Web Services. Athena allows you to analyze very large volumes of data using ANSI SQL, without the need to manage infrastructure.


Unlike a traditional database, Athena doesn’t store data at rest. Storage is handled entirely on S3, while the compute resources needed to return results for a query are provisioned automatically by AWS. Pricing follows a similar logic and is based on the amount of data scanned ($5 per terabyte).
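To make the pricing model concrete, here is a minimal sketch of the arithmetic in Python. It assumes the flat $5 per terabyte rate quoted above and a binary terabyte (1024^4 bytes), and it ignores any minimum charge or rounding that AWS may apply per query. The function and constant names are hypothetical.

```python
# Assumption: flat rate of $5 per terabyte scanned, as quoted in the text.
PRICE_PER_TB_USD = 5.0
# Assumption: a terabyte is counted as 1024^4 bytes.
BYTES_PER_TB = 1024 ** 4


def estimate_query_cost(bytes_scanned: int, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return the approximate cost in USD of a query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Scanning a full 2 TB dataset costs 10.0 USD under these assumptions.
full_scan = estimate_query_cost(2 * BYTES_PER_TB)

# Scanning only 200 GiB of the same dataset costs about 0.98 USD.
pruned_scan = estimate_query_cost(200 * 1024 ** 3)
```

Because cost depends on bytes scanned rather than on query runtime, anything that reduces the data a query has to read also reduces its cost.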

What are the benefits of Amazon Athena?

  • Less reliance on IT: Running Athena does not require users to monitor infrastructure, spin up additional clusters or manually provision resources, making it simpler to get started.
  • Open architecture: Athena supports commonly used open-source formats such as JSON, CSV and Apache Parquet, which reduces vendor lock-in and enables users to employ additional querying and analytics tools as needed.
  • Decoupled storage and compute: Since Athena uses Amazon S3 for data storage, storage costs are substantially lower than storing the same amount of data in a coupled database. This allows organizations to retain much larger volumes of data without incurring significant additional costs.
  • SQL-based: SQL is the language of choice for most data analysts and DBAs, and is simpler to work with than Python or Scala. Athena queries are written in regular ANSI SQL, which makes it accessible to almost every data professional or developer.
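As a rough illustration of the SQL-based workflow, the sketch below submits a query through an Athena client object. It assumes `client` behaves like `boto3.client("athena")` and uses its `start_query_execution` call. The table, column, database and bucket names are hypothetical.

```python
def submit_query(client, sql: str, database: str, output_location: str) -> str:
    """Start an Athena query and return its execution ID.

    `client` is expected to behave like boto3.client("athena").
    `output_location` is the S3 URI where Athena writes query results.
    """
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical table and column names, for illustration only.
EXAMPLE_SQL = """
SELECT page, COUNT(*) AS views
FROM clickstream_events
GROUP BY page
ORDER BY views DESC
LIMIT 10
"""

# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   query_id = submit_query(
#       boto3.client("athena"),
#       EXAMPLE_SQL,
#       database="my_database",                          # hypothetical
#       output_location="s3://my-bucket/athena-results/",  # hypothetical
#   )
```

The query runs asynchronously, so a caller would typically poll for completion using the returned execution ID before reading results.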

Read more about the benefits of Athena compared to traditional databases.

What are the different use cases for Amazon Athena?

Generally speaking, Athena can be relevant whenever you want to query data stored on Amazon S3. Common scenarios include:

  • Streaming analytics: Querying and visualizing streaming sources such as web click-streams in real time, or near-real time.
  • Ad-hoc analytics on big data: Quickly answering a specific question that requires you to scan terabytes of data, without setting up infrastructure.
  • Redshift cost reduction: While Redshift delivers very high performance, it is a coupled database that can become costly and complex to operate at higher scales; in these cases, Athena and S3 storage can be used to reduce some of the operational cost of Redshift.

You can learn more about Athena use cases here.

Benchmarks and comparisons

Analyzing streaming data in Athena

Streaming data is challenging in unique ways. See how to build efficient data pipelines that enable you to query event streams in Athena.

ETL and Data Preparation for Athena

Athena reads data directly from Amazon S3, and the way data is stored on S3 can have a dramatic impact on how much value you get from Athena. Discover best practices around partitioning, compaction and file formats to learn how to optimize your data for analytic consumption.
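As one example of what partitioning can look like in practice, the sketch below builds S3 object keys using a Hive-style `key=value` date layout, a common convention that lets queries filtering on date skip unrelated folders. The prefix, file name and function name are hypothetical, and daily partitions in UTC are an assumption rather than a recommendation for every dataset.

```python
from datetime import datetime, timezone


def partitioned_key(prefix: str, event_time: datetime, filename: str) -> str:
    """Build an S3 object key with Hive-style date partitions (year=/month=/day=)."""
    # Normalize to UTC so that partitions do not depend on the writer's time zone.
    t = event_time.astimezone(timezone.utc)
    return (
        f"{prefix.strip('/')}/"
        f"year={t:%Y}/month={t:%m}/day={t:%d}/"
        f"{filename}"
    )


key = partitioned_key(
    "events",  # hypothetical prefix
    datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc),
    "part-0001.parquet",  # hypothetical file name
)
# key == "events/year=2024/month=01/day=15/part-0001.parquet"
```

With a layout like this, a query restricted to a single day only needs to read the objects under that day's prefix, which ties directly back to the amount of data scanned.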

Improving Athena Performance

Check out our handy guides to learn how to make your Athena queries run faster.

Related Resources

Video: ETL for AWS Athena

Discover the 6 crucial guidelines to prepare your data for Athena in this free webinar.

Watch the Webinar »

Improve Athena Performance

Learn how to avoid common pitfalls, reduce costs and ensure high performance.

Get the Guide »

S3 Partitioning Guide

Partitioning best practices you need to know in order to optimize your analytics infrastructure.

Get the Guide »

GDPR and PII Removal

Enforce GDPR and other regulatory compliance by keeping track of all user records in your data lake.

Read the Case Study »

Powering Innovation Across Industries

Data Lake ETL Demo

Start for free with the Upsolver Community Edition.

Build working solutions for stream and batch processing on your data lake in minutes.

Get Started Now