March 28, 2022 by Phu Nguyen

How InfluxDB and Flux Gather Meaningful Insights from Time Series Data

The Context

We’ve thus far received positive responses from engineers and operators regarding the functionality of InfluxQL, InfluxDB’s query language, in an industrial context. Combined with the excellent query editor in Grafana’s dashboarding product, it’s easy to use, easy to learn and very readable. Despite this, advanced functionality was often lacking because the context was missing.

What do we mean by context? Imagine you’re sitting in a history class and you’re learning about event X in the year 1721, event Y in 1857 and so on. This history class is boring because all you’re being taught are dates and times of events, in order. This is comparable to time series data — data presented by date and in sequential order.

Imagine now that instead of learning just the dates and times, you receive information about the people that lived at that time, how they thought, what their days looked like, what their surroundings looked like, etc. This is context. This information provides you the context behind the history and behind the events, so now you can start understanding why people made the decisions they made. While the dates and times are important, an understanding of the people, the time periods and the events teach you so much more. Similarly, time series data is crucial but it is far more informative when coupled with contextual data and vice versa.

In an industrial setting, there are two levels of context seen from a technical perspective. The first level derives context from other data that is available in the time series database. One example is knowing that the value of a sensor only matters when a machine is in production and not in maintenance. Another is calculating the difference between two pressure sensors at the beginning and end of a process. In short, this boils down to functionality that was missing in the versions of InfluxDB prior to the introduction of its new query language, Flux. Flux comes with the ability to perform math across measurements (a measurement is the container for the time series collected).

The second level of context comes from integration with other IT systems in a manufacturing environment, typically described as the manufacturing execution system, or the MES layer. Knowing the pressure or temperature evolution of a certain sensor in a machine tells you something, but it will tell you a lot more if you know what product was being produced, which operator was responsible for the line, what shift it was, what the machine’s setpoints were, etc. This information is typically present in other software and databases that support the manufacturing process. As this is often relational data, it falls outside the scope of what we have achieved with Flux so far. But Flux will give us the ability to bring multiple data sources together, so we anticipate interesting things to come.

Examples of Flux and Math Across Measurements in Industry

With InfluxDB and Flux, we’ve been able to extract more insights and value out of the time series data we collect. Below, we review a few of the use cases mentioned above in more detail.

Use Case 1: Calculating Pressure Drop Across a Filter

A company with a water purification system needs to determine when they should replace filters used in the purification process. They can physically walk up to the filter (if accessible) and inspect the filter to determine if it needs to be replaced or they can rely on data. In the graph below, the top panel shows the water pressure measurement collected at the input and the panel below that shows the water pressure measurement collected at the output. If you take the difference between the two measurements, you can see the pressure difference increase over time. This is depicted below. With this data, you can see that when the pressure drop exceeds a certain threshold, the filter needs to be replaced.

Raw pressure sensor data from PT01 (Pressure Transmitter 01) on the input side before the filter is plotted above. PT02 is on the output side. The calculated pressure difference is plotted in the bottom graph.

The Flux query to achieve this:

// Create a generic function that can be reused to compute the 3m average
avg3m = (measurement) =>
    from(bucket: "historian")
        |> range(start: 2018-05-01T23:30:00Z, stop: 2018-05-23T00:00:00Z)
        |> filter(fn: (r) => r._measurement == measurement)
        |> aggregateWindow(every: 3m, fn: mean)

// Use the function we just defined to get the data for the first
// and second pressure transmitters
pt01 = avg3m(measurement: "PT01")
pt02 = avg3m(measurement: "PT02")

// Join data and calculate the pressure drop (inlet minus outlet)
join(tables: {pt01: pt01, pt02: pt02}, on: ["_time"])
    |> map(fn: (r) => ({
        _time: r._time,
        _pressureDrop: r._value_pt01 - r._value_pt02
    }))
    |> aggregateWindow(columns: ["_pressureDrop"], every: 1h, fn: mean) // to smooth the graph a bit
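The same arithmetic can be cross-checked outside of Flux. The following is a minimal Python sketch, not the authors' implementation: the sample readings, the `window_mean` and `pressure_drop` helpers and the `THRESHOLD_BAR` value are all hypothetical, and windows are labeled by their start time for simplicity.

```python
from statistics import mean


def window_mean(samples, every_s):
    """Average (timestamp_s, value) samples into fixed windows of every_s seconds."""
    buckets = {}
    for ts, value in samples:
        buckets.setdefault(ts - ts % every_s, []).append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def pressure_drop(inlet, outlet, every_s=180):
    """Window both series, inner-join them on window start, subtract outlet from inlet."""
    a = window_mean(inlet, every_s)
    b = window_mean(outlet, every_s)
    return {t: a[t] - b[t] for t in a if t in b}


# Hypothetical readings: (seconds since start, pressure in bar)
pt01 = [(0, 4.0), (60, 4.2), (180, 4.1), (240, 4.3)]  # inlet side
pt02 = [(0, 3.8), (60, 3.8), (180, 3.5), (240, 3.5)]  # outlet side

THRESHOLD_BAR = 0.5  # assumed replacement threshold, not taken from the article

drops = pressure_drop(pt01, pt02)
needs_replacement = [t for t, d in drops.items() if d > THRESHOLD_BAR]
print(drops, needs_replacement)
```

Comparing each windowed drop against a threshold is the step that turns the graph into a replacement signal; the right threshold depends on the filter and has to come from the plant.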

Use Case 2: Sensor Values in the Context of a Machine State

In some cases, you might have a machine that runs 24/7 but is only doing meaningful work some of the time. We often encounter this with discrete workstations, for example, Computer Numerical Control (CNC) machines. In these cases, you might only be interested in the values of certain sensors given that the machine is in a state of “production” and not in the state of “waiting for operator input” or “maintenance.” In this case, we’re adding context (i.e. the machine state). Let’s construct this graph in four steps.

Step 1: First, we retrieve the raw machine status. This is different for every machine, as is the way each machine is programmed by the supplier or automation partner. In this case, we have over 200 status codes. As such, the resulting graph is pretty meaningless by itself.

The raw status codes from the machine.

The Flux query to achieve this:

// Retrieve status data. We’re not downsampling here because status codes are only
// written to the database on change, which means this is a very small dataset.
from(bucket: "historian")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "Status")

Step 2: Each status code represents a certain machine state. In our case, we’re only interested in status codes 5 and 10, because they represent the machine being in production. All other codes are irrelevant for our use case. So, let’s map these raw status codes to 1 when the machine is in production and 0 in all other cases.

The mapped status codes.

The Flux query to achieve this:

import "math"

Status = from(bucket: "historian")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "Status")
    // Map to 0 or 1
    |> map(fn: (r) => ({_time: r._time, _status: float(v: contains(value: int(v: r._value), set: [5, 10]))}), mergeKey: false)
    // The statements below could probably be optimized. They work for our use case but should be improved.
    // Because the raw status code is only written to the database on change but we want to join on time later,
    // we need to create time windows and fill these with the last known 0 or 1.
    // So, we first create these windows with averages between 0 and 1.
    |> aggregateWindow(every: 30s, fn: mean, columns: ["_status"], createEmpty: true)
    // Then we fill the empty windows with the previous values.
    |> fill(column: "_status", usePrevious: true)
    // We don’t have a previous value at the beginning of our time window, so we cheat by filling it with 0.
    // Ideally, we would fill with the last status value that falls just outside the start: dashboardTime range.
    |> fill(column: "_status", value: 0.0)
    // Every window that contains a status > 0 should be categorized as "in production".
    |> map(fn: (r) => ({_time: r._time, _status: math.ceil(x: r._status)}), mergeKey: false)
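The windowing and fill logic in this step is easier to reason about on a small example. Below is a minimal Python sketch of the same idea, under stated assumptions: `production_flags` and its parameters are hypothetical names, timestamps are plain seconds, and windows are labeled by their start. The `initial` argument stands in for the "last status just outside the range" that the query comments wish for.

```python
import math

PRODUCTION_CODES = {5, 10}  # the two production codes named in the article


def production_flags(changes, start, stop, every_s=30, initial=0.0):
    """Turn on-change status codes into one 0/1 flag per window.

    changes: iterable of (timestamp_s, status_code).
    initial: flag used before the first change inside the range.
    """
    changes = sorted(c for c in changes if start <= c[0] < stop)
    flags = {}
    previous = initial
    i = 0
    for window_start in range(start, stop, every_s):
        window_end = window_start + every_s
        in_window = []
        while i < len(changes) and changes[i][0] < window_end:
            code = changes[i][1]
            in_window.append(1.0 if code in PRODUCTION_CODES else 0.0)
            i += 1
        if in_window:
            # Mean of the 0/1 values, then ceil: any production code marks the window.
            previous = float(math.ceil(sum(in_window) / len(in_window)))
        # Empty windows reuse the previous flag (the usePrevious fill).
        flags[window_start] = previous
    return flags


# Hypothetical status changes: (seconds since start, raw status code)
changes = [(40, 5), (100, 3), (130, 10), (200, 7)]
flags = production_flags(changes, start=0, stop=240)
print(flags)
```

Passing the last known state as `initial` instead of defaulting to 0 avoids misclassifying the start of the range when the machine was already in production.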

Step 3: Retrieve the raw sensor data. This is standard Flux functionality.

We notice noise on the sensor between 17:00 and 18:30. This is because the machine is switched on but not in production. As such, these values are meaningless to us.

The Flux query to achieve this:

NTU = from(bucket: "historian")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "S01")
    |> aggregateWindow(every: 30s, fn: mean)
    |> keep(columns: ["_value", "_time"])
    // Like in Step 2, we first fill with previous and finally 0 at the beginning of our time window
    |> fill(column: "_value", usePrevious: true)
    |> fill(column: "_value", value: 0.0)

Step 4: Finally, we’re ready to put all of the pieces together. From Step 2, we have 0/1 data telling us when the machine is in production and when it is not. From Step 3, we have the raw sensor data. Now, let’s bring these data sets together by joining them. The resulting graph shows us the sensor values when the machine is in production status.

The raw sensor data, in the context of the machine being in production status.

The Flux query to achieve this:

import "math"

Sensor = from(bucket: "historian")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "S01")
    |> aggregateWindow(every: 30s, fn: mean)
    |> keep(columns: ["_value", "_time"])
    |> fill(column: "_value", usePrevious: true)
    |> fill(column: "_value", value: 0.0)

Status = from(bucket: "historian")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "Status")
    |> map(fn: (r) => ({_time: r._time, _status: float(v: contains(value: int(v: r._value), set: [5, 10]))}), mergeKey: false)
    |> aggregateWindow(every: 30s, fn: mean, columns: ["_status"], createEmpty: true)
    |> fill(column: "_status", usePrevious: true)
    |> fill(column: "_status", value: 0.0)
    |> map(fn: (r) => ({_time: r._time, _status: math.ceil(x: r._status)}), mergeKey: false)

join(
    tables: {n: Sensor, s: Status},
    on: ["_time"]
)
    |> map(fn: (r) => ({
        _time: r._time,
        // All raw sensor values get multiplied by 0 when the machine is not in production
        _filteredValue: r._value * r._status
    }))
    |> yield()
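The join-and-multiply step can be illustrated with two small series. This is a minimal Python sketch with hypothetical values and a hypothetical `mask_by_status` helper; both series are assumed to be already aligned on the same 30-second timestamps.

```python
def mask_by_status(sensor, status):
    """Inner-join two {timestamp: value} series and zero out non-production samples."""
    return {t: sensor[t] * status[t] for t in sensor if t in status}


# Hypothetical, already windowed data keyed by seconds since start
sensor = {0: 1.9, 30: 2.1, 60: 2.4, 90: 7.8, 120: 2.2}  # S01 readings
status = {0: 1.0, 30: 1.0, 60: 1.0, 90: 0.0, 120: 1.0}  # 1.0 = in production

filtered = mask_by_status(sensor, status)

# Alternative: drop the samples instead of plotting them as zero
production_only = {t: v for t, v in sensor.items() if status.get(t) == 1.0}
print(filtered, production_only)
```

Multiplying by the flag keeps the time axis intact but makes a masked sample indistinguishable from a genuine zero reading. Dropping the samples avoids that ambiguity at the cost of gaps in the graph.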

Use Case 3: Calculating the Error Between a Forecast and the Measured Value

A third and final use case involves calculating the mean absolute error (MAE) to score the accuracy of a forecast compared to the measured values. This could be helpful in cases where your service is dependent on weather forecasts, for example. This formula returns a single value.

The MAE is calculated as follows:
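For reference, this is the standard definition of the mean absolute error over n paired samples. The symbols are chosen here for illustration: f_i is the forecast value and y_i the measured value at the same timestamp.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| f_i - y_i \right|
```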

The Flux query to achieve this:

import "math"

WINDSPEED = from(bucket: "meteo")
    |> range(start: dashboardTime, stop: upperDashboardTime) // same window from which we have forecast data
    |> filter(fn: (r) => r._measurement == "WindSpeed")
    |> aggregateWindow(every: 10m, fn: mean)
    |> keep(columns: ["_value", "_time"])
    |> fill(column: "_value", value: 0.0) // fill nulls (in the future) with 0
    |> map(fn: (r) => ({_time: r._time, _real: r._value})) // map value column to new name

WINDSPEED_FORECAST = from(bucket: "meteo_forecast")
    |> range(start: dashboardTime, stop: upperDashboardTime)
    |> filter(fn: (r) => r._measurement == "WindSpeed")
    |> keep(columns: ["_value", "_time"])
    |> aggregateWindow(every: 10m, fn: mean)
    |> fill(column: "_value", usePrevious: true)
    |> fill(column: "_value", value: 0.0)
    |> map(fn: (r) => ({_time: r._time, _forecast: r._value}))

// Pair measured and forecast values on time
RES = join(tables: {curr: WINDSPEED, forecast: WINDSPEED_FORECAST}, on: ["_time"], method: "inner")

// Sum of absolute errors
NUMERATOR = RES
    |> map(fn: (r) => ({_time: r._time, _diff: math.abs(x: r._forecast - r._real)}))
    |> sum(columns: ["_diff"])

// Number of paired samples
DENOMINATOR = RES
    |> count(columns: ["_real"])

MAE = join(tables: {num: NUMERATOR, denom: DENOMINATOR}, on: ["_start"], method: "inner")
    |> map(fn: (r) => ({_value: float(v: r._diff) / float(v: r._real)}))

MAE |> yield()
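To sanity-check the number a query like this returns, the MAE can be computed by hand on a few samples. The following is a minimal Python sketch with hypothetical wind speed values and a hypothetical `mean_absolute_error` helper. Unlike the query, it skips timestamps that are missing from either series instead of filling them with 0.

```python
def mean_absolute_error(real, forecast):
    """MAE over the timestamps present in both {timestamp: value} series."""
    shared = sorted(set(real) & set(forecast))
    if not shared:
        raise ValueError("no overlapping timestamps")
    return sum(abs(forecast[t] - real[t]) for t in shared) / len(shared)


# Hypothetical 10-minute averages keyed by seconds since start (m/s)
real = {0: 5.0, 600: 6.5, 1200: 4.0}
forecast = {0: 5.5, 600: 6.0, 1200: 5.0, 1800: 5.2}

mae = mean_absolute_error(real, forecast)
print(round(mae, 4))
```

Filling missing measurements with 0, as the query does, counts the full forecast value as error for those windows, so the two approaches can give different scores when data is sparse.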

Some Remarks

The primary drawback we’ve encountered is the trade-off between functionality and simplicity, at least to the casual, non-technical user. While we see InfluxQL in Grafana as a piece of functionality that is easily taught to non-technical people in an industrial context, this is much less the case with Flux.

Members of the community can take steps to mitigate this drawback by providing examples, such as the ones described in this post, which others can build on. Furthermore, the extensibility of Flux will make it possible to build predefined functions to cover the most common use cases encountered in the manufacturing world.

We’re sure we will encounter more use cases for Flux as functionality becomes more readily available and the language becomes more widely supported.

Feature image from Pixabay.

Source: InApps.net
