14 Best Data Engineering Podcasts, Blogs and Websites


Data engineering is a relatively new field within computer science, and is tied closely to data science on the one hand and to DevOps and DBA work on the other. Keeping up with the latest trends and developments can help ensure that your organization works with the best possible infrastructure to handle big data. To help you do so, we’ve gathered this list of the 14 best data engineering resources, from podcasts to blogs to video libraries. Enjoy!

Podcasts

The increasing popularity of podcasts has not skipped the field of data engineering. There aren’t many out there, but the few that exist are highly recommended. Our favorites include:

Data Engineering Podcast

This podcast is one of our favorite ways to stay abreast of industry goings-on and in-depth discussions of new products and developments in data engineering. The Data Engineering Podcast, produced and hosted by Tobias Macey, a DevOps manager at MIT, is an absolute must-listen for anyone who wants to keep up with data engineering trends or dive deeper into the field.

Visit the website: https://www.dataengineeringpodcast.com/

Software Engineering Daily

True to its name, this is a daily podcast about software engineering, but many of its episodes are highly pertinent to data engineering, with recurring topics including NoSQL, infrastructure optimization and AI architecture. The podcast features a wide variety of guests, both established thought leaders and up-and-coming developers, and with tons of new content published each week, it is another great way to keep up with the industry.

Visit the website: https://softwareengineeringdaily.com/

Engineering and Technical Blogs

Learning by example is always a good way to deepen your knowledge – and luckily, some of the most data-intensive companies regularly publish great educational content on their engineering blogs:

Pinterest Engineering

Data engineers at Pinterest tackle a variety of complex data engineering challenges, including Kafka optimization, real-time anomaly detection, knowledge graphs and building a Kubernetes platform.

Visit the blog: https://medium.com/@Pinterest_Engineering

Yelp Engineering and Product Blog

The architecture that powers Yelp’s recommendation systems is fascinating, and this blog offers a glimpse into the types of complex architectures Yelp’s data engineers build using Kafka, Cassandra, TensorFlow and other technologies.

Visit the blog: https://engineeringblog.yelp.com/

Uber Engineering

Uber is one of the companies that have really epitomized big data’s impact on the world around us (for better or worse), and the Uber Engineering blog contains tons of useful information on how the company manages infrastructure for truly massive volumes of data.

Visit the blog: https://eng.uber.com/

The Netflix Tech Blog

Netflix is another company changing traditional industries with big data, and its technical blog details how the streaming giant runs its cloud data architecture to power personalized recommendations and other aspects of its service.

Visit the blog: https://medium.com/netflix-techblog

Other Blogs

Additional publications and individual authors you should be following:

AWS Big Data blog

While obviously most relevant to AWS users, the AWS blog is highly technical and oriented towards solving specific engineering problems, with very little in the way of promoting specific AWS products. It publishes a lot of content by both AWS writers and guest contributors and is worth keeping an eye on.

Visit the blog: https://aws.amazon.com/blogs/big-data/

Maxime Beauchemin

This blog is written by a former data engineer at Facebook and Airbnb and the creator of Apache Airflow and Apache Superset. He writes about data engineering as a discipline and the day-to-day life of a data engineer, as well as more technical content related to Airflow.

Visit the blog: https://medium.com/@maximebeauchemin

O’Reilly on Data

The renowned technology publisher behind the Strata data conference regularly publishes excellent in-depth content related to data engineering and data science topics.

Visit the website: https://www.oreilly.com/radar/topics/data/

Communities and Content Hubs

The following websites mostly curate user-generated content around data engineering and data science, but we like their editorial standards and the selection of articles they offer:

Towards Data Science

A community devoted to “concepts, ideas and codes”. While much of the content skews more towards the data science and analytics side, there’s plenty of data engineering goodness to be found here, such as in-depth guides to Apache Spark.

Visit the website: https://towardsdatascience.com/

Analytics Vidhya

A website dedicated to analytics learning, the Analytics Vidhya community offers a mix of content around various data-related topics. While the content is of varying quality and relevance, you can find some excellent contributions on the site.

Visit the website: https://www.analyticsvidhya.com/blog/

KDnuggets

Don’t be fooled by its simple and unassuming design: KDnuggets is one of the best places on the web for data science, analytics and AI content. Gregory Piatetsky-Shapiro and his team regularly curate original research and in-depth, thought-provoking articles contributed by users.

Visit the website: https://www.kdnuggets.com/

Video Libraries

Prefer moving pictures? Check out some of these YouTube collections:

Martin Kleppmann

Martin Kleppmann is an established thought leader on stream processing and distributed computing – you can definitely learn something from watching one of his video lectures.

Watch the videos: https://www.youtube.com/results?search_query=martin+kleppmann

PyData

A YouTube channel for the community of users and developers of data analysis tools. Tons of new video content every week covering a wide variety of topics.

Watch the videos: https://www.youtube.com/user/PyDataTV/videos
