How WaterBridge Uses TimescaleDB for Real-Time Data Consistency
This is an installment of our “Community Member Spotlight” series, in which we invite our customers to share their work, spotlight their success, and inspire others with new ways to use technology to solve problems.
In this edition, Chris Morris, SCADA (Supervisory Control and Data Acquisition) Manager at WaterBridge, an Oil & Gas production water reuse and disposal company, shares how his team is using TimescaleDB. They ingest up to 10,000 data points per second and display them in real time to ensure operational efficiency and safety. Using TimescaleDB as a data historian allows the WaterBridge team to do real-time monitoring and alerting, with plans to implement predictive maintenance models in the future.
About the Company
WaterBridge is a company specializing in hydraulic fracturing water treatment and disposal. It provides crucial infrastructure to manage, treat, and dispose of wastewater generated from hydraulic fracturing operations. The company operates in the U.S. across Texas, New Mexico, and Oklahoma, processing large volumes of water while adhering to strict safety and environmental standards.
To comply with these strict standards and prevent leaks, the WaterBridge team relies on data from measurement/communication devices placed throughout the pipeline, which monitor water pressure, flow, temperature, and process changes. The ability to ingest and display this data in real time is crucial for the team’s control room to function properly and ensure operational safety—this is where TimescaleDB comes in.
About the Team
I’m Chris Morris, the SCADA (Supervisory Control and Data Acquisition) Manager at WaterBridge. I oversee the control system operations and database management. The team focuses on real-time monitoring, data ingestion, and analytics for WaterBridge’s hydraulic fracturing water treatment and disposal operations. We’re a small team of three, so we try to use open-source software as much as possible. We may do some customizations here and there, but the majority of the tools we use are off the shelf.
About the Project
Traditionally, oil and gas companies have supplied water for fracking and disposed of production water via hundreds of trucked-in/out loads. This leads to greater environmental, safety, and financial impacts on both the community and the production company. WaterBridge constructs and operates a network of more than 1,200 miles of pipeline to offset these impacts of Oil & Gas production.
The control room team, which works 24/7, is the primary customer for the applications we build. We have a subway-type map of the pipes with real-time data being fed into it from our SCADA system. The control room team uses this real-time data for remote monitoring: catching leaks, starting and stopping pumps, spotting tanks that are about to overflow, and so on. We also have an alarm screen for the control room. This is where the control room team gets alarms to know if the equipment has lost communication.
“TimescaleDB is absolutely critical to our operations”
If something is off, our actions depend on the facility's location and normal throughput. In a large facility, we’re doing 24,000 barrels of water per day. If that one were to go down, we’d immediately dispatch somebody. For other facilities doing 10 barrels per day, we would wait until the next shift.
TimescaleDB is absolutely critical to our operations. The control room team relies on real-time data consistency, so data must be ingested and displayed immediately as it becomes available. We track close to a million metrics and ingest 5,000-10,000 data points per second into TimescaleDB using MQTT and OPC-UA through Chariot MQTT Servers and Ignition SCADA. We use tools like PowerBI, Seeq, and Spotfire to visualize this data and generate anomaly reporting.
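To put that ingest rate in perspective, here is a rough back-of-the-envelope calculation. It assumes a steady rate and one stored row per data point, which is a simplification and not a description of WaterBridge's actual schema.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_day(points_per_second: int) -> int:
    """Rows written per day at a steady ingest rate (one row per data point)."""
    return points_per_second * SECONDS_PER_DAY


low = rows_per_day(5_000)
high = rows_per_day(10_000)
print(f"{low:,} to {high:,} rows per day")
```

At the quoted 5,000-10,000 data points per second, that works out to roughly 432 million to 864 million rows per day under these assumptions.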
Choosing (and Using!) TimescaleDB as a Data Historian
Before Timescale, we were on a managed SQL Server instance that was slowly degrading over time. The tier we were at topped out at 4 TB; there was no compression built into it. While Ignition has some ways to auto-roll data, deleting it after a set number of days, we didn’t want to do that. Our hope was to have real-time data and use it for training models in the future.
Going up to the next tier almost tripled the cost, and we were already at $12,000 a month. At that time, all we provided were very basic trends for the control room. Those trends were very slow, and we couldn’t really interact with the data in any meaningful way.
“The appeal of Timescale was that we only needed to communicate with PostgreSQL”
So, we started looking to see what our options were. We looked at Timescale, ClickHouse, and a product that’s specific to our space called Canary, which is pretty popular with people who use Ignition in the oil and gas industry. But it doesn’t have a regular database, and you can’t query against it; you need a specific API, which makes interacting or integrating with other programs a no-go.
ClickHouse looked interesting, but we would have to write a custom driver to interface with Ignition. The appeal of Timescale was that we only needed to communicate with PostgreSQL. Real-time ingestion is critical for us: the control room trends metrics in real time. Therefore, as soon as data comes in, it must be displayed on the screen without delay, regardless of query execution time.
TimescaleDB is the only historian we have; that’s how we use it. We can’t have any kind of eventual consistency, but we can tolerate some outages on the historian as long as we’re still bringing in our real-time data. Ignition splits the data between real-time and historical data. Before using TimescaleDB, we had to be more selective about how much a value had to change before we historized it. Now, we just throw everything in there.
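Being selective about value changes before storing them is commonly done with a deadband: a new sample is stored only when it differs from the last stored value by more than a threshold. The sketch below illustrates that general idea. It is an assumption about the kind of filtering meant here, not WaterBridge's or Ignition's actual configuration, and the function name and sample data are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def deadband_filter(samples: Iterable[Sample], deadband: float) -> Iterator[Sample]:
    """Yield only samples that differ from the last stored value by more than `deadband`."""
    last_stored: Optional[float] = None
    for ts, value in samples:
        # The first sample is always stored; later ones only on a large enough change.
        if last_stored is None or abs(value - last_stored) > deadband:
            last_stored = value
            yield ts, value


# Hypothetical pressure readings, one per second.
readings = [(0, 100.0), (1, 100.2), (2, 100.4), (3, 101.5), (4, 101.6), (5, 99.9)]
stored = list(deadband_filter(readings, deadband=1.0))
print(stored)
```

With a deadband of 1.0, only three of the six readings are kept. Dropping the deadband (storing everything) trades storage for full-resolution history, which is the trade-off described above.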
In terms of features, continuous aggregates have been a huge help. The issue we initially ran into when we started having all this data available was that everyone [other teams at WaterBridge] thinks they want real-time data, but they actually don’t. And their programs aren’t set up to handle pulling in 100 GB for the past month.
Being able to create continuous aggregates for them that downsample the data to five minutes or three minutes and stay updated nonstop without us having to do it manually has been incredibly helpful.
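As a minimal sketch of what a five-minute downsample looks like, the block below pairs a hypothetical continuous aggregate definition (held as SQL strings) with a plain-Python emulation of the bucketing so the idea can be run without a database. The table name `tag_history`, the view name `tag_history_5m`, the columns `ts`, `tag_id`, and `value`, and the refresh offsets are all assumptions for illustration, not WaterBridge's schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

# Hypothetical database-side definition; assumes `tag_history` is a hypertable.
CREATE_CAGG_SQL = """
CREATE MATERIALIZED VIEW tag_history_5m
WITH (timescaledb.continuous) AS
SELECT time_bucket('5 minutes', ts) AS bucket,
       tag_id,
       avg(value) AS avg_value
FROM tag_history
GROUP BY bucket, tag_id;
"""

# Hypothetical policy that keeps the view refreshed automatically.
REFRESH_POLICY_SQL = """
SELECT add_continuous_aggregate_policy('tag_history_5m',
    start_offset      => INTERVAL '1 hour',
    end_offset        => INTERVAL '5 minutes',
    schedule_interval => INTERVAL '5 minutes');
"""

BUCKET = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime, width: timedelta = BUCKET) -> datetime:
    """Floor a timestamp to the start of its bucket, similar to time_bucket()."""
    return ts - ((ts - EPOCH) % width)


def downsample(rows):
    """Average (ts, tag_id, value) rows per tag and five-minute bucket."""
    groups = defaultdict(list)
    for ts, tag_id, value in rows:
        groups[(bucket_start(ts), tag_id)].append(value)
    return {key: mean(values) for key, values in sorted(groups.items())}


def at(minute: int) -> datetime:
    """Helper for the example data: a UTC timestamp on 2024-01-01 at 00:<minute>."""
    return datetime(2024, 1, 1, 0, minute, tzinfo=timezone.utc)


rows = [(at(1), "pressure", 10.0), (at(4), "pressure", 20.0), (at(6), "pressure", 30.0)]
result = downsample(rows)
for (bucket, tag), avg_value in result.items():
    print(bucket.isoformat(), tag, avg_value)
```

The Python part only emulates the grouping. In the database, the continuous aggregate does this work incrementally and the policy keeps it current, so downstream tools can query the small view instead of the raw data.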
We haven’t had any reliability issues with Timescale or any other issues that aren’t already in the documentation.
Future Plans: Predictive Maintenance
Last year was about getting good historical data and something that we could actually use. This year has been about laying a foundation for getting some machine-learning models next year. Currently, we are working on a pipeline simulation project.
We have geographic data for all our assets, and we want to produce a pipe model to show what the system should be doing hydraulically. To do that, the team is using just the past five minutes of data, so TimescaleDB’s strengths play into this. The plan is to generate some models for predictive maintenance to help with leak detection or efficiency, and TimescaleDB will provide that historical data, too.
We’d like to thank Chris and the team at WaterBridge for sharing their story on how they use TimescaleDB for real-time monitoring and benefit from its real-time data consistency.
We’re always keen to feature new community projects and stories on our blog. If you have a story or project you’d like to share, reach out on Slack (@Ana Tavares), and we’ll go from there.