So many terms and definitions are thrown around in the AI and ML space that they can be confusing even to trained data scientists. We use many of these terms throughout our website, so it may help to define them and make sure we’re clear on what each one means.
The term “Model” is one of the most confusing because it can mean different things in different contexts. We can model a pump, a well, a design, a flow, or a geophysical layer in a formation. In general terms, a model is just a representation of something. For the purpose of data science in the oil field, a Model is best defined as a mathematical algorithm that replicates (or models) a decision process to enable automation or understanding. It can be made up of multiple models working together, which we call an Ensemble Model. It makes a decision or recommendation based on data inputs, and that output can then be used to better automate a process. Ideally, the model should also reveal the rationale behind its decision to help interpret the decision process.
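To make the idea of an Ensemble Model concrete, here is a minimal sketch in Python. It assumes three hypothetical single-signal models that each vote "alert" or "normal", and it combines them by majority vote. The field names, thresholds, and the voting scheme are illustrative assumptions, not a description of any production system. The per-model votes are returned alongside the decision as a simple form of rationale.

```python
from collections import Counter
from typing import Callable, Dict, Tuple

Reading = Dict[str, float]
Model = Callable[[Reading], str]


# Hypothetical single-signal models. The field names and thresholds are
# illustrative assumptions, not values taken from the article.
def pressure_model(reading: Reading) -> str:
    return "alert" if reading["intake_pressure"] < 200.0 else "normal"


def current_model(reading: Reading) -> str:
    return "alert" if reading["motor_current"] > 60.0 else "normal"


def temperature_model(reading: Reading) -> str:
    return "alert" if reading["motor_temperature"] > 150.0 else "normal"


MODELS: Dict[str, Model] = {
    "pressure": pressure_model,
    "current": current_model,
    "temperature": temperature_model,
}


def ensemble_predict(models: Dict[str, Model], reading: Reading) -> Tuple[str, Dict[str, str]]:
    """Return the majority decision plus each model's vote as the rationale."""
    votes = {name: model(reading) for name, model in models.items()}
    # With an odd number of models and two possible labels there are no ties.
    decision = Counter(votes.values()).most_common(1)[0][0]
    return decision, votes


reading = {"intake_pressure": 150.0, "motor_current": 72.0, "motor_temperature": 120.0}
decision, votes = ensemble_predict(MODELS, reading)
print(decision, votes)
```

Returning the individual votes is one simple way to expose why the ensemble decided as it did; real ensembles often weight their members rather than counting them equally.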
Books, movies, and websites throw around the term Artificial Intelligence, or AI, sometimes in association with robots and doomsday warnings. But it simply describes any computer or system that in some way exhibits human-like intelligence and can perceive, analyze, learn, and process as we do. It is often used as a synonym for the broader term Artificial General Intelligence, which is the level of intelligence required to act like a human being.
For the purpose of data science in the oil field, we use AI to refer to “Narrow AI,” which is a single-purpose intelligence. It replicates a specific decision an expert would make if given the same information. It has human-like intelligence, but in one narrow way.
Machine Learning is sometimes used as a synonym for AI and vice versa, but it actually refers to the method of using data to train models without explicit programming. Instead of trying to program every rule, we use the data to teach the model the rules. Machine learning is the most common method of training AI models, especially for well operations in the oilfield.
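The difference between programming a rule and learning one can be sketched in a few lines of Python. In this illustration, the hand-coded rule uses a threshold someone typed in, while `learn_threshold` searches the historical data for the cut-off that makes the fewest mistakes. The signal, the numbers, and the function names are all hypothetical, and a one-threshold model is far simpler than anything used in practice.

```python
from typing import List


def hand_coded_rule(value: float) -> bool:
    """Explicit programming: the threshold is chosen by a person."""
    return value > 70.0


def learn_threshold(values: List[float], is_problem: List[bool]) -> float:
    """Learn the cut-off from data by trying midpoints between observed values.

    Assumes at least two distinct values are present.
    """
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(threshold: float) -> int:
        # Count readings where "value > threshold" disagrees with history.
        return sum((v > threshold) != y for v, y in zip(values, is_problem))

    return min(candidates, key=errors)


# Hypothetical historical readings and whether each one was a problem.
history = [40.0, 45.0, 50.0, 55.0, 65.0, 70.0, 75.0, 80.0]
was_problem = [False, False, False, False, True, True, True, True]

learned = learn_threshold(history, was_problem)
print("learned threshold:", learned)
print("hand-coded rule on 65.0:", hand_coded_rule(65.0))
print("learned rule on 65.0:", 65.0 > learned)
```

Here the hand-coded rule misses the reading of 65.0 because its author guessed the threshold too high, while the learned rule places the cut-off where the data says it belongs.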
Here are just a few of the most common algorithms used in data science. Artificial Neural Networks and Recurrent Neural Networks are the most similar to the way our brains work: they learn to weigh different data inputs and combinations of inputs, much as neurons do in our brain. Random Forests, Support Vector Machines, and Gradient Boosted Trees are popular machine learning algorithms that use decision trees, linear modeling, and error reduction techniques to maximize the likelihood of a correct prediction.
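As a rough illustration of what "weighing data inputs" means, the sketch below implements a single artificial neuron in Python: it multiplies each input by a weight, adds a bias, and squashes the result into the range 0 to 1 with a sigmoid function. The weights here are picked by hand purely for illustration. In a real network they would be learned from data, and many such neurons would be connected in layers.

```python
import math
from typing import Sequence


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of the inputs passed through a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative, hand-picked weights: the first input matters more than the second.
weights = [2.0, 0.5]
bias = -1.0

print(neuron([0.0, 0.0], weights, bias))  # both inputs low
print(neuron([1.0, 0.0], weights, bias))  # heavily weighted input high
print(neuron([0.0, 1.0], weights, bias))  # lightly weighted input high
```

Raising the heavily weighted input moves the output much more than raising the lightly weighted one, which is the sense in which the neuron weighs its inputs.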
Supervised Learning is the process of using labeled data, or labels, to train models.
Labels refer to the marking of historical data with the type of problem, solution, action or condition we want the model to infer from the data.
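A small Python sketch can show what labeled data looks like and how a model is trained from it. Each historical example pairs a set of measurements with a label, and the model (here a deliberately simple nearest-centroid classifier) learns the average measurements for each label and assigns new readings to the closest one. The feature values and the label names "normal" and "problem" are hypothetical, and the features are left unscaled to keep the example short.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Tuple

Features = Tuple[float, ...]
LabeledExample = Tuple[Features, str]


def train_centroids(examples: List[LabeledExample]) -> Dict[str, Features]:
    """Average the feature vectors of all examples that share a label."""
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids: Dict[str, Features], features: Features) -> str:
    """Return the label whose centroid is closest to the new reading."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled history: (measurement_a, measurement_b) -> label.
labeled_history: List[LabeledExample] = [
    ((250.0, 40.0), "normal"),
    ((260.0, 42.0), "normal"),
    ((120.0, 65.0), "problem"),
    ((110.0, 70.0), "problem"),
]

centroids = train_centroids(labeled_history)
print(centroids)
print(predict(centroids, (130.0, 60.0)))
```

The labels are what make this supervised: without the "normal" and "problem" markings on the history, the model would have nothing to learn the distinction from.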
Supervised learning is different from unsupervised learning, which learns without labels. Unsupervised learning tries to detect problems from the data alone, comparing the operation of the well with its performance to discover anomalies and improvements with limited input from operators or engineers.
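For contrast, here is a minimal unsupervised sketch in Python. It receives readings with no labels at all and flags any value that sits unusually far from the rest, measured in standard deviations (a z-score). This is only one simple stand-in for anomaly detection, chosen for brevity. The readings and the cut-off of 2.0 are illustrative assumptions.

```python
from statistics import mean, stdev
from typing import List


def find_anomalies(readings: List[float], z_threshold: float = 2.0) -> List[int]:
    """Return the indices of readings far from the mean, with no labels used."""
    mu = mean(readings)
    sigma = stdev(readings)  # sample standard deviation; needs 2+ readings
    if sigma == 0:
        return []  # all readings identical, so nothing stands out
    return [
        index
        for index, value in enumerate(readings)
        if abs(value - mu) / sigma > z_threshold
    ]


# Hypothetical unlabeled readings with one obvious outlier at the end.
readings = [50.0, 51.0, 49.0, 50.0, 52.0, 48.0, 50.0, 51.0, 49.0, 90.0]
anomalies = find_anomalies(readings)
print(anomalies)
```

The method can say that the last reading is unusual, but not what kind of problem it represents or what to do about it, which is the information that labels supply in supervised learning.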
At OspreyData, we tried various methods of unsupervised learning without success. Supervised Learning has yielded the best performance by incorporating extensive expertise into machine learning to take advantage of the accumulated knowledge of our experts.
To learn more about AI/ML models, read our white paper entitled “Expert-Guided Machine Learning: Engaging Petroleum Experts’ Know-How in Petroleum Engineering and Oilfield Operations.”