

The Big Data Blog

4 Challenges of Using Databases for Streaming Data (and a Solution)

May 21, 2019 1:59:55 PM / by Eran Levy posted in Big Data, Database, Streaming Data


Read More

Big Data Infrastructure: When to Build, When to Buy

May 14, 2019 4:04:53 PM / by Eran Levy posted in Big Data, Data Architecture, Data Engineering

Every software development team makes build-vs-buy decisions on a regular basis. For most coding problems, someone is offering a packaged or white-label solution. The decision whether to purchase a tool or develop an alternative in-house - to ‘build or buy’ - is typically made ad hoc, based on cost, existing engineering skill sets, and organizational culture.

Read More

Kafka vs. RabbitMQ: Architecture, Performance & Use Cases

May 7, 2019 2:42:07 PM / by Eran Levy posted in Data Architecture, Apache Kafka, RabbitMQ


Read More

How to Improve AWS Athena Performance: The Complete Guide

Apr 22, 2019 12:59:57 PM / by Eran Levy


Read More

7 Popular Stream Processing Frameworks Compared

Mar 21, 2019 7:03:50 PM / by Eran Levy

Stream processing is a critical part of the big data stack in data-intensive organizations. Tools like Apache Storm and Samza have been around for years, and have since been joined by newcomers like Apache Flink and managed services like Amazon Kinesis Streams.

Read More

Cloud Data Lake vs On-Premises Data Lake: What You Need to Know

Mar 18, 2019 12:51:45 PM / by Eran Levy posted in Data Lake, Data Architecture, Cloud

Is it time to move your data lake to the cloud? As with any infrastructure choice, there are advantages and trade-offs to deploying in the cloud vs on-premises, and the decision needs to be made on an ad hoc basis, based on considerations such as scale, cost, and available technical resources.

Read More

3 Steps To Reduce Your Elasticsearch Costs By 90 - 99%

Feb 27, 2019 4:47:59 PM / by Eran Levy posted in S3, Elasticsearch, Log Analysis

This article covers best practices for reducing the price tag of Elasticsearch using a data lake approach.


Elasticsearch is a fantastic log analysis and search tool, used by everyone from tiny startups to the largest enterprises. It’s a robust solution for many operational use cases as well as for BI and reporting, and performs well at virtually any scale - which is why many developers get used to ‘dumping’ all of their log data into Elasticsearch and storing it there indefinitely.

Read More

4 Key Components of a Streaming Data Architecture

Jan 30, 2019 8:15:38 PM / by Eran Levy

Streaming data is becoming a core component of enterprise data architecture. Streaming technologies are not new, but they have matured considerably over the past year. The industry is moving from painstaking integration of technologies like Kafka and Storm towards full-stack solutions that provide an end-to-end streaming data architecture.

Read More

Integrate Upsolver with Git for Carefree Change Management and Review

Jan 14, 2019 5:37:00 PM / by Eran Levy posted in Product Updates, Git

Today we’ve got some great news for organizations that have multiple users working on Upsolver, or anyone who likes to fiddle with the system and make frequent changes to data sources, output streams, aggregations or other features. Thanks to Upsolver’s new built-in Git integration you can have multiple users fiddling away, safe in the knowledge that all your work will be securely stored and easily recoverable in your Git repository.

Read More

Top 7 Trends in Streaming Data for 2019

Dec 20, 2018 6:43:01 PM / by Ori Rafael posted in Industry Trends, Schemaless, Streaming Data

As the end of the year rapidly approaches, it’s time to take a look at what the next one might have in store.

Read More