Tag: Data Pipelines


BI Categories, Data Mining

10 Advantages of Real-Time Data Streaming in Commerce

March 12, 2024

Via: DATAVERSITY

While early science fiction shows like “Buck Rogers” (1939) and “The Fly” (1958) depicted teleportation technology, it was Star Trek’s transporter room that made real-time transfer of living matter a classic sci-fi trope. While we haven’t built technology that enables real-time […]


BI Categories, Data Mining

Data Integration Tools

March 12, 2024

Via: DATAVERSITY

Data integration tools are used to collect data from external (and internal) sources, and to reformat, cleanse, and organize the collected data. The ultimate goal of data integration tools is to combine data from a variety of different sources, and […]


Software & Systems

Machine learning for Java developers: Machine learning data pipelines

January 31, 2024

Via: InfoWorld

The article “Machine learning for Java developers: Algorithms for machine learning” introduced setting up a machine learning algorithm and developing a prediction function in Java. Readers learned the inner workings of a machine learning algorithm and walked through the process […]


BI Categories, Data Mining

What Are Data Products and Why Do They Matter?

December 27, 2023

Via: DATAVERSITY

Data products are software in the form of specialty tools and apps that are designed to support data used as a service. They may be as simple and straightforward as a program that converts a dataset into a visualization, or […]


BI Categories, Data Mining

Building Data Pipelines with Kubernetes

December 7, 2023

Via: DATAVERSITY

Data pipelines are a set of processes that move data from one place to another, typically from the source of data to a storage system. These processes involve data extraction from various sources, transformation to fit business or technical needs, […]


BI Categories, Data Mining

Why Is Data Quality Still So Hard to Achieve?

October 25, 2023

Via: DATAVERSITY

We exist in a diversified era of data tools up and down the stack – from storage to algorithm testing to stunning business insights. In fact, it’s been more than three decades of innovation in this market, resulting in the […]


BI Users, Data Analyst

Testing and Monitoring Data Pipelines: Part Two

June 19, 2023

Via: DATAVERSITY

In part one of this article, we discussed how data testing can specifically test a data object (e.g., table, column, metadata) at one particular point in the data pipeline. While this technique is practical for in-database verifications – as tests […]


BI Users, Data Analyst

Data Observability is the Key to Ensuring Fresh and Reliable Data Pipelines

June 14, 2023

Via: Database Trends and Applications

Dealing with data and databases is laden with a multitude of challenges, often characterized by the questions, “What happened to my data?” and “Why is this data all wrong?” Whether data is stale or unreliable, the solution lies within the […]


BI Users, Data Analyst

Testing and Monitoring Data Pipelines: Part One

May 26, 2023

Via: DATAVERSITY

Suppose you’re in charge of maintaining a large set of data pipelines from cloud storage or streaming data into a data warehouse. How can you ensure that your data meets expectations after every transformation? That’s where data quality testing comes […]


BI Users, Data Analyst

Data Pipelines: An Overview

March 2, 2023

Via: DATAVERSITY

Just as vendors rely on U.S. mail or UPS to get their goods to customers, workers count on data pipelines to deliver the information they need to gain business insights and make decisions. This network of data channels, operating in […]


BI Categories, Data Mining

How to Assess Data Quality Readiness for Modern Data Pipelines

February 13, 2023

Via: DATAVERSITY

For growth-minded organizations, the ability to effectively respond to market conditions, competitive pressures, and customer expectations is dependent on one key asset: data. But simply having massive troves of data isn’t enough. The key to being truly data-driven is having […]