PyData 2016 - London

All the information needed to accompany the presentation

The presentation will be delivered through Jupyter notebooks. Below are notebooks at varying stages of completeness. I recommend trying to complete everything yourself as we go along, but if you get stuck or lost you can jump to a more complete workbook.


Download here

The data is designed to be roughly similar to DCM path-to-conversion data. Each user has an id and either one or two rows. The earliest row always records when the user saw an advert. If the user converts, a second row records the conversion time. Users who never convert have only one row.
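To make the row layout concrete, here is a minimal sketch of how one or two rows per user could be collapsed into a single duration plus a converted flag, which is the shape survival models usually expect. The column order, the "impression" and "conversion" labels, the helper name to_durations, and the observation_end cut-off are all assumptions for illustration; they are not taken from the real file.

```python
from datetime import datetime

# Hypothetical rows: (user_id, event, timestamp). The field names and the
# "impression"/"conversion" labels are assumed, not the real file's schema.
rows = [
    ("u1", "impression", datetime(2016, 1, 1)),
    ("u1", "conversion", datetime(2016, 1, 4)),
    ("u2", "impression", datetime(2016, 1, 2)),
]


def to_durations(rows, observation_end):
    """Collapse each user's one or two rows into (days, converted)."""
    first_seen = {}
    converted_at = {}
    for user_id, event, ts in rows:
        if event == "impression":
            first_seen[user_id] = ts
        elif event == "conversion":
            converted_at[user_id] = ts

    result = {}
    for user_id, start in first_seen.items():
        end = converted_at.get(user_id)
        converted = end is not None
        if not converted:
            # Users with a single row never converted in the window, so
            # their duration runs to the (assumed) end of observation.
            end = observation_end
        result[user_id] = ((end - start).days, converted)
    return result


durations = to_durations(rows, observation_end=datetime(2016, 1, 10))
```

Users with only one row are kept rather than dropped: their duration is measured up to the end of the observation window and flagged as not converted.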


Base workbook - start here

Data Cleaning Done

pymc 1

pymc 2

pymc 3

Cox Done Full Version

Libraries used:

We will use three non-standard libraries: pymc, a Bayesian modelling library; lifelines, a library for survival analysis; and pyBMA, a library for Bayesian model averaging.

lifelines - pip install lifelines

pymc - pip install pymc

pyBMA - pip install pyBMA
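Before the talk it may help to confirm the installs worked. The sketch below uses only the standard library to report which packages cannot be found. The helper name missing_libraries is made up for this example, and the import names are assumed to match the pip package names listed above.

```python
from importlib.util import find_spec


def missing_libraries(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names are assumed to be the same as the pip package names.
required = ["lifelines", "pymc", "pyBMA"]
missing = missing_libraries(required)

if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All libraries found.")
```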

After the talk

I’ve tried to provide fuller information on this site about everything we go through in the presentation, so if you would like to learn more about any topic, please read through. There are also links to relevant academic papers if you wish to dig deeper.

Failing that, if you have any questions or issues please feel free to contact me at jakecoltman @