The client is a privately held exploration and production oil company that focuses on acquisitions, exploration, and development in the USA and Holland.
Geologists and geophysicists use different methods to search for geological structures that may form oil reservoirs.
Currently, it costs around $85,000 per square mile for an oil and gas company to check a field for oil.
In total, a company spends at least $1M, and possibly over $40M, before seeing any results.
Our client had DNA samples of solids from 13 different areas, containing more than 3,000 fields with microelement measurements and other characteristics for each area.
The client wanted to make predicting oil fields easier by using the DNA of solids.
Quantum found a way to predict oil fields using the DNA of solids, saving the client time and money.
Our machine learning model predicts the location of an oil field with 70% accuracy.
As soon as the client collects more data about different areas, our R&D team will improve this result.
Data understanding and preparation
Our research began with an attempt to highlight the main features of solids DNA with an analytical approach. We had to determine the importance of each field. For feature selection, we applied regression, statistical, and other methods to all of our mixed data to get relevant results. Each way of mixing the data produced a new dataset.
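A minimal sketch of this kind of feature screening, assuming the solids-DNA measurements sit in a table with one row per sample and one column per field; the file name, the "target" label column, and the choice of scoring functions are illustrative assumptions, not the client's real schema:

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif

# Hypothetical table: one row per solid sample, one column per field
# (microelement measurement), plus a "target" label to predict.
df = pd.read_csv("solids_dna.csv")
X = df.drop(columns=["target"])
y = df["target"]

# Rank fields with two statistical criteria and keep the top candidates.
anova = SelectKBest(f_classif, k=100).fit(X, y)
mi = SelectKBest(mutual_info_classif, k=100).fit(X, y)

# Each selection strategy produces its own reduced dataset for later experiments.
datasets = {
    "anova_top100": df[list(X.columns[anova.get_support()]) + ["target"]],
    "mutual_info_top100": df[list(X.columns[mi.get_support()]) + ["target"]],
}
```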
ML model
Our team used the mixed datasets for ML model training. As usual, we started with different algorithms to find the best one, but on the first iteration the models couldn't distinguish between the 13 groups of data and reached only 50% accuracy.
After discussing the problem with the client, we came to a reasonable conclusion: the problem was that the data had been collected in different regions and in different seasons. We split the datasets according to this principle, which improved the result by 20%.
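The sketch below shows how such an iteration might look: a handful of candidate classifiers compared by cross-validated accuracy, first on the mixed data and then on data split by collection area and season. The "target", "area", and "season" column names and the candidate models are assumptions for illustration only:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumed schema: microelement columns plus a "target" label and the
# "area" / "season" metadata describing where and when a sample was taken.
df = pd.read_csv("solids_dna.csv")
meta = ["target", "area", "season"]

# A few candidate algorithms compared by cross-validated accuracy.
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=300),
    "grad_boosting": GradientBoostingClassifier(),
}

def evaluate(X, y):
    return {name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
            for name, model in candidates.items()}

# First iteration: one mixed dataset, all areas and seasons together.
print(evaluate(df.drop(columns=meta), df["target"]))

# Second iteration: split by collection area and season, model each part separately.
for (area, season), part in df.groupby(["area", "season"]):
    print(area, season, evaluate(part.drop(columns=meta), part["target"]))
```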
Visualization
To show more than just a thousand lines with thousands of fields, we decided to render each field as a group of pixels whose saturation changes depending on the importance of certain features. This method gave us a full image we could analyze.
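One simple way to produce such an image is to lay per-field importance scores out on a 2-D grid and let color saturation encode importance. The sketch below uses random scores as a stand-in for real feature importances (for example, a fitted model's feature_importances_ attribute); the grid layout and colormap are assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in importance scores, one per field (feature).
rng = np.random.default_rng(0)
importances = rng.random(3000)

# Arrange the per-field scores on a square grid so each field becomes
# a pixel whose saturation reflects its importance.
side = int(np.ceil(np.sqrt(importances.size)))
grid = np.zeros(side * side)
grid[: importances.size] = importances
grid = grid.reshape(side, side)

plt.imshow(grid, cmap="Reds", interpolation="nearest")
plt.colorbar(label="feature importance")
plt.title("Per-field importance rendered as pixel saturation")
plt.axis("off")
plt.show()
```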
We used the FloydHub platform for data processing and model training. Scikit-learn, SciPy, Matplotlib, and Seaborn were used for EDA and visualization.
We also developed an automated pipeline to find the best algorithm for selecting the most valuable features. While looking through algorithms, we used a range of approaches, from correlation analysis and stacks of L1-regularized regressors to unsupervised approaches and dimensionality reduction methods. All results were saved as separate datasets.
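A simplified sketch of such a pipeline is shown below, with one variant per family of approaches: correlation screening, an L1-regularized model (here a logistic regression rather than the original stack of regressors), and PCA as the dimensionality reduction step. File and column names, thresholds, and the number of components are all assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("solids_dna.csv")          # assumed file and column names
X, y = df.drop(columns=["target"]), df["target"]
variants = {}

# 1. Correlation screening: drop one of every pair of near-duplicate fields.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
variants["correlation"] = X.drop(
    columns=[c for c in upper.columns if (upper[c] > 0.95).any()]
)

# 2. L1-regularized model: keep fields with non-zero coefficients.
l1 = make_pipeline(
    StandardScaler(),
    SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=0.1)),
).fit(X, y)
variants["l1"] = X.loc[:, l1.named_steps["selectfrommodel"].get_support()]

# 3. Unsupervised dimensionality reduction: project onto principal components.
pca = make_pipeline(StandardScaler(), PCA(n_components=50)).fit(X)
variants["pca"] = pd.DataFrame(pca.transform(X), index=X.index)

# Persist each variant as its own dataset for the model-selection stage.
for name, data in variants.items():
    data.assign(target=y).to_csv(f"dataset_{name}.csv", index=False)
```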
Further model selection was also done automatically by specifying the necessary models and evaluating them against each dataset. The most “valuable” datasets were then selected to build more accurate models through fine parameter tuning and more sophisticated models such as DNNs and gradient boosting.
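The sketch below illustrates this idea under the assumption that the previous step saved its outputs as dataset_*.csv files with a "target" column: baseline models are scored against every dataset, the best-scoring datasets are kept, and a gradient boosting model is then tuned on each of them. The model list and parameter grid are illustrative:

```python
import glob
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neural_network import MLPClassifier

# Baseline models evaluated against every candidate dataset (illustrative choices).
baselines = {
    "logreg": LogisticRegression(max_iter=1000),
    "grad_boosting": GradientBoostingClassifier(),
    "mlp": MLPClassifier(max_iter=500),
}

# Score every dataset produced by the feature-selection pipeline.
scores = {}
for path in glob.glob("dataset_*.csv"):
    data = pd.read_csv(path)
    X, y = data.drop(columns=["target"]), data["target"]
    scores[path] = {name: cross_val_score(m, X, y, cv=5).mean()
                    for name, m in baselines.items()}

# Keep the most "valuable" datasets, then fine-tune a stronger model on each.
best = sorted(scores, key=lambda p: max(scores[p].values()), reverse=True)[:2]
grid = {"n_estimators": [200, 500], "max_depth": [2, 3, 5]}
for path in best:
    data = pd.read_csv(path)
    X, y = data.drop(columns=["target"]), data["target"]
    search = GridSearchCV(GradientBoostingClassifier(), grid, cv=5).fit(X, y)
    print(path, search.best_params_, round(search.best_score_, 3))
```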
The libraries we used: