Background - often the first step when we appraise acreage is to generate a map and look at production in the area of interest. Today we may go further with a quick-n-dirty ML run and get an estimate of cumulative production, or CUM (over some time period of economic interest). Based on feedback from the AAPG ML and Analytics Workshop, January 2019, one popular request is to create a community for sharing what we have collectively learned from applying ML to our work.
We can share things we learn over time (e.g., adding "white noise" makes predictive deconvolution robust in processing poor signal-to-noise land data), without compromising intellectual property (algo / code). So here is an example.
Observations - based on applying a deep neural network (DNN) to production data in the Permian:
1. DNN performs well on numeric features. Better with indicator (categorical). Best with embedding (fancy way of saying "encoding" categorical features).
2. DNNLinear* (combo of DNN and Linear regression) is no guarantee of improved performance. Surprised. Performance may be dataset dependent.
3. Overall improvement comes from embedding (categorical) and dimension reduction.
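To make observations 1 and 3 concrete, here is a minimal sketch in plain Python (not the TensorFlow code behind the results above) contrasting an indicator (one-hot) encoding with an embedding lookup for one categorical feature. The category names, the embedding size of 2, and the helper names `indicator`, `make_embedding_table` and `embed` are all hypothetical, chosen only for illustration. In a real DNN the embedding weights are learned during training; here they are simply random.

```python
import random

# Hypothetical categorical feature values (illustrative only, not from the dataset).
CATEGORIES = ["formation_A", "formation_B", "formation_C", "formation_D", "formation_E"]
VOCAB = {name: i for i, name in enumerate(CATEGORIES)}


def indicator(value):
    """Indicator (one-hot) encoding: one slot per category, a single 1.0."""
    vec = [0.0] * len(VOCAB)
    vec[VOCAB[value]] = 1.0
    return vec


def make_embedding_table(n_categories, dim, seed=0):
    """Dense lookup table with one row per category.

    Assumption: random initial weights stand in for weights a DNN would learn.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_categories)]


def embed(value, table):
    """Embedding encoding: return the dense row for the category."""
    return table[VOCAB[value]]


EMBED_DIM = 2  # hypothetical size, smaller than the number of categories
table = make_embedding_table(len(VOCAB), EMBED_DIM)

one_hot = indicator("formation_C")
dense = embed("formation_C", table)

print(len(one_hot), one_hot)  # width grows with the number of categories
print(len(dense), dense)      # width is fixed at EMBED_DIM
```

The indicator vector is as wide as the category list (5 here), while the embedding stays at the chosen width (2 here). That is the dimension reduction mentioned in observation 3, and it matters more as the number of categories grows.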
*Refers specifically to the TensorFlow Estimator (https://www.tensorflow.org/guide/estimators).
Now, I anticipate some may ask: is there really such a thing as quick-n-dirty ML?
For a qualified yes, come to Energy IN Data @Austin this June; in particular, see "Self-service machine learning" from Microsoft in the Breakout Session: Solving Your Problem, June 17, 2019.
For more insights, actionable and deployable, see https://energyindata.org/Program.