Collaboration of Operators and Data Scientists Is Key

A major challenge has been fostering cooperation between two groups with very different skills and backgrounds: O&G operations and data science. Many projects fail because these groups do not understand one another well enough to pursue shared objectives. Data scientists often assume that models can be built from data alone and that operators will simply use them. Operators assume that data scientists cannot understand the complexity of their operations. Management assumes it can hand data to the data science team and operational improvements will follow. All are wrong.

Expert Guided Machine Learning (EGML) assumes these groups need to be encouraged (sometimes forced) to work together. AI/ML models need data science and IT to oversee them, operations and engineering to train them, and management to direct and reward everyone. EGML combines process and platform to facilitate the cooperation of these groups.

Collaboration Platform

A collaboration platform provides an integrated system that taps into all of the well, pump, sensor, reservoir and operational data. It provides intuitive navigation and visualization to support data science, operations, engineering and management. It is not just data automation or SCADA. It is designed for the training and execution of advanced AI/ML models, including the associated event logs, workovers, labels, tags and comments from operators and engineers. It allows data science to orchestrate the training of these models and their transition from training to production. It monitors the performance of these models to indicate when retraining and revalidation are required.
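As a minimal sketch of how performance monitoring could flag a model for retraining, the example below keeps a rolling window of confirmed outcomes and raises a flag when accuracy over that window drops below a threshold. The class name `RetrainingMonitor`, the window size and the 0.8 threshold are illustrative assumptions, not details of any particular platform.

```python
from collections import deque


class RetrainingMonitor:
    """Tracks recent confirmed outcomes for one model (hypothetical helper)."""

    def __init__(self, window=50, min_accuracy=0.8):
        # Each entry is True when a prediction was later confirmed correct.
        self.outcomes = deque(maxlen=window)
        self.min_accuracy = min_accuracy  # assumed threshold, set by the team

    def record(self, prediction_correct):
        self.outcomes.append(bool(prediction_correct))

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Judge only once the window is full, to avoid reacting to a few events.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.min_accuracy
```

In practice the threshold and window would be agreed between operations and data science, and a real platform would likely track more than one statistic per model.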

Most importantly, the platform provides a common environment in which all parties can share information and exchange ideas. Failures and sub-optimal events can be reviewed through a common lens, with each party reviewing and commenting on different events, labels and tags. All parties can review the results of models in training and production, along with the underlying performance statistics (e.g. accuracy, precision, recall) and the features and characteristics that trigger the models. Management can see these same results in the context of field operation, production and P&L.
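For readers less familiar with the performance statistics mentioned above, the following sketch computes accuracy, precision and recall from a list of model predictions and the outcomes later confirmed by operators. The function name and the boolean encoding (True meaning "failure predicted" or "failure confirmed") are assumptions made for illustration.

```python
def classification_metrics(predicted, confirmed):
    """Compare model predictions with operator-confirmed outcomes.

    Both arguments are equal-length sequences of booleans.
    """
    pairs = list(zip(predicted, confirmed))
    tp = sum(1 for p, c in pairs if p and c)          # predicted and confirmed
    fp = sum(1 for p, c in pairs if p and not c)      # predicted, not confirmed
    fn = sum(1 for p, c in pairs if not p and c)      # missed by the model
    tn = sum(1 for p, c in pairs if not p and not c)  # correctly left alone
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else None,
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }
```

Precision answers "how often is an alert right?", which matters for operators' trust, while recall answers "how many real failures did the model catch?", which matters for production loss.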

By providing a common reference, measurement and knowledge base, the platform enables disparate teams in different locations to work together towards a common objective. The importance of detailed failure logs, timely diagnosis and accurate labels becomes visible to operations as they review the results of the models trained on this information. The complexity of well operation and troubleshooting becomes visible to the data science team as they interact with and review the discussion and analysis from operations. The input and decisions from both sides are recorded in the context of the state of each well at the time something happens. Both disciplines gain respect for and confidence in the expertise and guidance of their counterparts. Finally, management can monitor the progress of these cross-functional teams down to individual models by reviewing consistent, measurable performance statistics.

Facilitating Cross-Functional Processes

A collaboration platform enables cross-functional processes that orchestrate the interdisciplinary workflow of EGML. The process defines the steps, responsibilities, and touchpoints of operators, engineers and data scientists as they work together to build and improve AI/ML models. Historical problems and events are logged and tagged by operators and reviewed by engineers so that data science can transform them into the labels and data transformations used to train models. The models are then executed, and their results are passed back to operators for further review and comment, which creates more labels, and the cycle continues. In between, there can be questions, answers, comments, and chats associated with each model, well and event. Each of these interactions may trigger a flow of information and interaction that is facilitated by the shared platform.

As models prove consistent and accurate, many model predictions and recommendations can be “auto validated”. Auto validated means the model and associated response are so predictable and measurable that the result can be automatically confirmed without human review.

For example, a recommended change in gas injection is implemented and production increases in line with the forecast. No one needs to review these results, since the platform was able to automatically confirm the increase in gas injection and oil production.

On the other hand, a prediction of a potential tubing failure, accompanied by an expected drop in production, triggers an automated notification for an operator to assess and confirm, conducting a pressure test if warranted. The result of the pressure test is entered to validate or invalidate the model’s prediction. This confirmation provides more labels and performance measurements for ongoing training. The process determines whether each event can be validated automatically, routes it accordingly, and records each step and result.
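The routing decision described in the two examples above could look roughly like the sketch below. It is only an illustration under stated assumptions: the event kinds, the `AUTO_VALIDATABLE` set, the field names and the 5% tolerance are all hypothetical, and a real platform would define which events qualify and how closely measurements must match the forecast.

```python
from dataclasses import dataclass


@dataclass
class PredictionEvent:
    well_id: str
    kind: str             # e.g. "gas_injection_change" or "tubing_failure"
    forecast_rate: float  # production rate the model forecast
    measured_rate: float  # production rate measured afterwards


# Assumption: event kinds whose outcome can be confirmed from measured data alone.
AUTO_VALIDATABLE = {"gas_injection_change"}


def route(event, tolerance=0.05):
    """Return "auto_validated" or "operator_review" for one event."""
    if event.kind in AUTO_VALIDATABLE and event.forecast_rate > 0:
        deviation = abs(event.measured_rate - event.forecast_rate) / event.forecast_rate
        if deviation <= tolerance:
            # Measured production is in line with the forecast: no human review.
            return "auto_validated"
    # Everything else goes to an operator, e.g. to decide on a pressure test.
    return "operator_review"
```

Under these assumptions, a gas injection change whose measured production lands within 5% of the forecast is confirmed automatically, while any tubing failure prediction is always routed to an operator, whose confirmation then becomes a new label for training.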

EGML orchestrates both automated and manual steps to ensure that operations and engineering expertise is effectively incorporated into continuous model training. EGML highlights specifically when and where this human input is needed to avoid overburdening experts with review activities. EGML ensures that management and operations define their criteria so that data science optimizes algorithms to align model performance with business objectives. Management and operations drive the objectives ensuring the effort of expert augmentation is applied wisely, and AI/ML is accepted and embraced by operators.

To learn more about how operators, managers, and data scientists collaborate in the creation of AI/ML models, access our white paper entitled “Expert-Guided Machine Learning: Engaging Petroleum Experts’ Know-How in Petroleum Engineering and Oilfield Operations” here.
