Characterizing Natural Hydrogen Occurrences in the Paris Basin Using
OCR-Enhanced Well Database Studies
Abstract
This study investigates natural hydrogen (H2) occurrences in the Paris
Basin, using Optical Character Recognition (OCR) technology to analyze
an extensive, yet historically underexploited, well database that
contains older drilling records. With the growing demand for carbon-free
energy, natural hydrogen, produced through processes like
serpentinization and water radiolysis, offers a promising alternative to
fossil fuels. However, its occurrence in conventional oil and gas wells
has remained largely unexplored. Using the French BEPH (Office of
Exploration and Production of Hydrocarbons) database, which includes
well logs, mudlogs, and End Drilling Reports (EDRs) in PDF image
format, we applied the Tesseract OCR engine to convert these documents
into searchable text for efficient data analysis. Our
analysis revealed several H2-bearing wells across the French sedimentary
basins. The hydrogen occurrences in the Aquitaine Basin are consistent
with their geological context, whereas those in the Paris Basin are
anomalous and do not align with the expected geological factors. In the
Paris Basin, H2 has been detected in four main
formations: the Lusitanian aquifer, Dogger aquifer, Triassic aquifer,
and the basement. The highest hydrogen concentration (52 vol%) was
found in the Dogger formation. The H2-bearing wells are located
primarily along the Bray fault and thrust, suggesting a structural
influence on H2 distribution. This research demonstrates the
effectiveness of OCR in
reprocessing historical drilling data for natural hydrogen exploration,
highlighting the need for comprehensive exploration methodologies in
this emerging field.
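
The OCR-and-search workflow summarized above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names (ocr_pdf, find_h2_mentions), the keyword pattern, the 300 dpi setting, and the choice of the pdf2image and pytesseract wrappers around Tesseract are all hypothetical. The pattern is written to skip hydrogen sulfide (H2S, "hydrogène sulfuré"), which would otherwise produce false hits in hydrocarbon well reports.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical keyword pattern (an assumption, not the study's search terms).
# Matches "H2", "H 2", "hydrogen", "hydrogène"/"hydrogene", but not "H2S",
# "CH2", "hydrogen sulfide", or "hydrogène sulfuré".
H2_PATTERN = re.compile(
    r"\b(?:(?:hydrog[eè]ne|hydrogen)(?!\s+sulf)|H\s?2)\b",
    re.IGNORECASE,
)


def ocr_pdf(pdf_path: str, lang: str = "fra") -> List[str]:
    """Return the OCR text of each page of an image-only PDF.

    Assumes the third-party packages pdf2image and pytesseract are
    installed, along with the Tesseract binary and its French language data.
    """
    from pdf2image import convert_from_path  # lazy import: optional dependency
    import pytesseract

    images = convert_from_path(pdf_path, dpi=300)  # one PIL image per page
    return [pytesseract.image_to_string(img, lang=lang) for img in images]


def find_h2_mentions(pages: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (page number, line) pairs for lines that mention hydrogen."""
    hits = []
    for page_no, text in enumerate(pages, start=1):
        for line in text.splitlines():
            if H2_PATTERN.search(line):
                hits.append((page_no, line.strip()))
    return hits
```

In use, find_h2_mentions(ocr_pdf(path)) would flag candidate pages of a report for manual review. OCR output from old typewritten reports is noisy, so a keyword hit is only a pointer to a page that a geologist would still need to read, not a confirmed H2 show.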