biomedsciai / causallib

A Python package for modular causal inference analysis and model evaluations

License: Apache License 2.0

Python 100.00%

causal causal-inference causal-models causality data-science machine-learning ml

causallib's Introduction

Causal Inference 360

A Python package for inferring causal effects from observational data.

Description

Causal inference analysis enables estimating the causal effect of an intervention on some outcome from real-world non-experimental observational data.

This package provides a suite of causal methods, under a unified scikit-learn-inspired API. It implements meta-algorithms that allow plugging in arbitrarily complex machine learning models. This modular approach supports highly-flexible causal modelling. The fit-and-predict-like API makes it possible to train on one set of examples and estimate an effect on the other (out-of-bag), which allows for a more "honest"¹ effect estimation.

The package also includes an evaluation suite. Since most causal models use machine learning models internally, we can diagnose poorly performing models by re-interpreting known ML evaluations from a causal perspective.

If you use the package, please consider citing Shimoni et al., 2019:

Reference

@article{causalevaluations,
  title={An Evaluation Toolkit to Guide Model Selection and Cohort Definition in Causal Inference},
  author={Shimoni, Yishai and Karavani, Ehud and Ravid, Sivan and Bak, Peter and Ng, Tan Hung and Alford, Sharon Hensley and Meade, Denise and Goldschmidt, Yaara},
  journal={arXiv preprint arXiv:1906.00442},
  year={2019}
}

¹ Borrowing Wager & Athey terminology of avoiding overfit.

Installation

pip install causallib

Usage

The package is imported using the name causallib. Each causal model requires an internal machine-learning model. causallib supports any model that has a sklearn-like fit-predict API (note some models might require a predict_proba implementation). For example:

from sklearn.linear_model import LogisticRegression
from causallib.estimation import IPW 
from causallib.datasets import load_nhefs

data = load_nhefs()
ipw = IPW(LogisticRegression())
ipw.fit(data.X, data.a)
potential_outcomes = ipw.estimate_population_outcome(data.X, data.a, data.y)
effect = ipw.estimate_effect(potential_outcomes[1], potential_outcomes[0])

Comprehensive Jupyter Notebook examples can be found in the examples directory.

Community support

We use the Slack workspace at causallib.slack.com for informal communication. We encourage you to ask questions regarding causal-inference modelling or usage of causallib that don't necessarily merit opening an issue on GitHub.

Use this invite link to join causallib on Slack.

Approach to causal-inference

Some key points on how we address causal-inference estimation

1. Emphasis on potential outcome prediction

A causal effect may be the desired end result. However, every effect is defined by two potential (counterfactual) outcomes. We adopt this two-step approach by separating the effect-estimating step from the potential-outcome-prediction step. A beneficial consequence of this approach is that it better supports multi-treatment problems where "effect" is not well-defined.
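The two-step idea can be sketched in plain Python. This is only an illustration under stated assumptions: the function name contrast, the treatment labels, and the outcome values are all hypothetical, and the population-level potential outcomes are taken as already predicted by some model.

```python
from itertools import combinations


def contrast(outcome_1, outcome_0, effect_type="diff"):
    """Contrast two population-level potential outcomes (hypothetical helper)."""
    if effect_type == "diff":
        return outcome_1 - outcome_0
    if effect_type == "ratio":
        return outcome_1 / outcome_0
    if effect_type == "or":
        # Odds ratio; only meaningful when outcomes are probabilities in (0, 1).
        return (outcome_1 / (1 - outcome_1)) / (outcome_0 / (1 - outcome_0))
    raise ValueError(f"unknown effect type: {effect_type!r}")


# Step 1 (assumed already done): one potential outcome per treatment value.
potential_outcomes = {"placebo": 0.20, "drug_a": 0.25, "drug_b": 0.40}

# Step 2: with more than two treatments there is no single "effect",
# but any pairwise contrast can be derived from the potential outcomes.
pairwise_effects = {
    (later, earlier): contrast(potential_outcomes[later], potential_outcomes[earlier])
    for earlier, later in combinations(potential_outcomes, 2)
}
```

Because the potential outcomes are kept, the same predictions can be reused for a difference, a ratio, or an odds ratio without refitting anything.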

2. Stratified average treatment effect

The causal inference literature devotes special attention to the population on which the effect is estimated: for example, ATE (average treatment effect on the entire sample), ATT (average treatment effect on the treated), etc. By allowing out-of-bag estimation, we leave this specification to the user. For example, ATE is obtained with model.estimate_population_outcome(X, a), and ATT by stratifying on the treated: model.estimate_population_outcome(X.loc[a==1], a.loc[a==1]).
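To make the distinction concrete, here is a minimal plain-Python sketch. It assumes an outcome model has already produced, for every individual, a predicted outcome under no treatment and under treatment; the numbers and the helper average_effect are made up for illustration and are not part of causallib.

```python
from statistics import mean

# Hypothetical per-individual predictions from an already-fitted model:
# (treatment received, predicted outcome if untreated, predicted outcome if treated)
individuals = [
    (1, 2.0, 5.0),
    (1, 1.0, 3.0),
    (0, 4.0, 5.0),
    (0, 3.0, 3.0),
]


def average_effect(rows):
    """Mean difference between the two predicted potential outcomes over rows."""
    return mean(y1 for _, _, y1 in rows) - mean(y0 for _, y0, _ in rows)


# ATE: average over the entire sample.
ate = average_effect(individuals)
# ATT: the same computation, restricted to those who actually received treatment.
att = average_effect([row for row in individuals if row[0] == 1])
```

The estimator is identical in both cases; only the subset of rows passed to it changes, which is the sense in which the choice of population is left to the user.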

3. Families of causal inference models

We distinguish between two types of models:

Weight models: weight the data to balance the treatment and control groups, and then estimate the potential outcome as a weighted average of the observed outcome. Inverse Probability of Treatment Weighting (IPW or IPTW) is the best-known example of such models.
Direct outcome models: use the covariates (features) and treatment assignment to build a model that predicts the outcome directly. The model can then be used to predict the outcome under any assignment of treatment values, specifically the potential outcome under assignment of all controls or all treated.
These models are usually known as Standardization models, and it should be noted that, currently, they are the only ones able to generate individual effect estimation (otherwise known as CATE).
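As an illustration of the weight-model family, the following plain-Python sketch computes inverse-probability-weighted averages of the observed outcome for a binary treatment. It assumes the propensity scores are already estimated; the function name and the toy data are hypothetical, and the code is not causallib's implementation.

```python
def ipw_population_outcomes(treatments, outcomes, propensities):
    """Weighted average of observed outcomes per treatment group.

    propensities[i] is the (assumed already estimated) probability that
    individual i receives treatment 1 given their covariates.
    """
    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for a, y, e in zip(treatments, outcomes, propensities):
        # Probability of the treatment the individual actually received.
        prob_of_received = e if a == 1 else 1.0 - e
        w = 1.0 / prob_of_received
        weighted_sum[a] += w * y
        weight_total[a] += w
    # Normalizing by the sum of weights gives a weighted mean per group.
    return {a: weighted_sum[a] / weight_total[a] for a in (0, 1)}


treatments = [1, 1, 0, 0]
outcomes = [10.0, 20.0, 4.0, 8.0]
propensities = [0.5, 0.25, 0.5, 0.2]

potential = ipw_population_outcomes(treatments, outcomes, propensities)
effect = potential[1] - potential[0]
```

Individuals who were unlikely to receive the treatment they got are up-weighted, so each reweighted group resembles the full population. A direct outcome model would instead predict the outcome for every individual under each treatment value.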

4. Confounders and DAGs

One of the most important steps in causal inference analysis is to have proper selection on both dimensions of the data to avoid introducing bias:

On rows: thoughtfully choosing the right inclusion/exclusion criteria for individuals in the data.
On columns: thoughtfully choosing what covariates (features) act as confounders and should be included in the analysis.

This is a place where domain expert knowledge is required and cannot be fully and truly automated by algorithms. This package assumes that the data provided to the model fit the criteria. However, filtering can be applied in real-time using a scikit-learn pipeline estimator that chains preprocessing steps (that can filter rows and select columns) with a causal model at the end.

causallib's Issues

ModuleNotFoundError: No module named 'causallib.contrib.hemm'

Having this error while running this line in hemm_demo

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from causallib.contrib.hemm.gen_synthetic_data import gen_montecarlo
2
3 syn_data = gen_montecarlo(5000, 2, 100)

ModuleNotFoundError: No module named 'causallib.contrib.hemm'

Any help is appreciated.

Installation for Python version 3.11

Hi. I am trying to install this library in a Python 3.11 environment. I get the following error -

ERROR: Could not find a version that satisfies the requirement causallib (from versions: none)
ERROR: No matching distribution found for causallib

However, it works for Python 3.9. Are there plans to update the library to be available for Python 3.11?

Thanks.

The parameter 'std' keeps decreasing

Thanks for your amazing work!
I tested some data with the HEMM method and the result of the subgroup prediction is abnormal. On further evaluation, I found that the parameter 'std' of the Gaussian distribution keeps decreasing and falls below zero, while it is supposed to converge to a positive value.
What's the cause of this phenomenon and how can I fix it? Is this about parameter initialization?

Issues with Categorical Data

So I'm working on a survey data where I am trying to figure out cause and effect relationship between a person's responses to the survey questions and his/her final preference towards that product. The data is categorical entirely. On using Causal Inference 360's evaluation plots, I got the following error:

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric precision could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric recall could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric f1 could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric roc_auc could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multi_class must be in ('ovo', 'ovr')
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric avg_precision could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric hinge could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: The shape of pred_decision cannot be 1d arraywith a multiclass target. pred_decision shape must be (n_samples, n_classes), that is (1977, 3). Got: (1977,)
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric brier could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: Only binary classification is supported. The type of the target is multiclass.
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric roc_curve could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
warnings.warn(str(v))
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:106: UserWarning: metric pr_curve could not be evaluated
warnings.warn(f"metric {metric_name} could not be evaluated")
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/metrics.py:107: UserWarning: multiclass format is not supported
warnings.warn(str(v))

KeyError Traceback (most recent call last)
in
3
4 eval_results = evaluate(ipw, X, a, y)
----> 5 eval_results.plot_all()
6 eval_results.plot_covariate_balance(kind="love");

8 frames
/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in plot_all(self, phase)
343 """
344 phases_to_plot = self.predictions.keys() if phase is None else [phase]
--> 345 multipanel_plot = {
346 plotted_phase: self._make_multipanel_evaluation_plot(
347 plot_names=self.all_plot_names, phase=plotted_phase

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in (.0)
344 phases_to_plot = self.predictions.keys() if phase is None else [phase]
345 multipanel_plot = {
--> 346 plotted_phase: self._make_multipanel_evaluation_plot(
347 plot_names=self.all_plot_names, phase=plotted_phase
348 )

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in _make_multipanel_evaluation_plot(self, plot_names, phase)
353 def _make_multipanel_evaluation_plot(self, plot_names, phase):
354 phase_fig, phase_axes = plots.get_subplots(len(plot_names))
--> 355 named_axes = {
356 name: self._make_single_panel_evaluation_plot(name, phase, ax)
357 for name, ax in zip(plot_names, phase_axes.ravel())

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in (.0)
354 phase_fig, phase_axes = plots.get_subplots(len(plot_names))
355 named_axes = {
--> 356 name: self._make_single_panel_evaluation_plot(name, phase, ax)
357 for name, ax in zip(plot_names, phase_axes.ravel())
358 }

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/mixins.py in _make_single_panel_evaluation_plot(self, plot_name, phase, ax, **kwargs)
379 plot_func = plots.lookup_name(plot_name)
380 plot_data = self.get_data_for_plot(plot_name, phase=phase)
--> 381 return plot_func(*plot_data, ax=ax, **kwargs)

/usr/local/lib/python3.8/dist-packages/causallib/evaluation/plots/plots.py in plot_mean_features_imbalance_love_folds(table1_folds, cv, aggregate_folds, thresh, plot_semi_grid, ax)
813 aggregated_table1 = aggregated_table1.groupby(aggregated_table1.index)
814
--> 815 order = aggregated_table1.mean().sort_values(by="unweighted", ascending=True).index
816
817 if aggregate_folds:

/usr/local/lib/python3.8/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
312
313 return wrapper

/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py in sort_values(self, by, axis, ascending, inplace, kind, na_position, ignore_index, key)
6257
6258 by = by[0]
-> 6259 k = self._get_label_or_level_values(by, axis=axis)
6260
6261 # need to rewrap column in Series to apply key function

/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py in _get_label_or_level_values(self, key, axis)
1777 values = self.axes[axis].get_level_values(key)._values
1778 else:
-> 1779 raise KeyError(key)
1780
1781 # Check for duplicates

KeyError: 'unweighted'

Can you help me out please?

Matching more neighbors than the number of examples in the treatment group

Hello,

If I have 10 examples in my "treated" group and 1000 in the "control" group, is it possible to do one-sided matching (match control to treatment) with more than 10 neighbors (e.g. 50)?

I tried using the "matching_mode" argument in causallib.estimation.Matching for one-directional matching, but still got the error "Expected n_neighbors <= n_samples" when using matcher.match.

Thank you!

'module' object is not callable

I try to apply several examples of causallib

=============================
%matplotlib inline
from causallib.evaluation import evaluate
import matplotlib.pyplot as plt

evaluation_results = evaluate(ipw, X, a, y)

Whenever I import evaluate from causallib.evaluation and call it,
I always get the same error.
How can I solve this problem?

TypeError Traceback (most recent call last)
Input In [20], in <cell line: 5>()
2 from causallib.evaluation import evaluate
3 import matplotlib.pyplot as plt
----> 5 evaluation_results = evaluate(ipw, X, a, y)
6 fig, ax = plt.subplots(figsize=(6, 6))
7 evaluation_results.plot_covariate_balance(kind="love", ax=ax)

TypeError: 'module' object is not callable

Misstatement in Standardization example

In the text after cell 7, the example describes the model as a Logistic Regression model, whereas I believe the model used is the sklearn LinearRegression() model.

Source:
https://github.com/IBM/causallib/blob/master/examples/standardization.ipynb

Use case for categorical datasets

I am trying to use the library for a survey dataset where no entry is numerical and all the responses are categorical in nature. On using Causal Inference 360's evaluation plots, the results were not very encouraging, i.e. wide chasms between weighted and unweighted variables in propensity plots.

Also, Boolean Rules via Column Generation (BRCG) method didn't return any rule. Presumably because no entry was numerical. The result was this

Learning DNF rule with complexity parameters lambda0=0.001, lambda1=0.001

Initial LP solved
Iteration: 1, Objective: 0.2203
Accuracy: 0.7797356828193832
AUC: 0.5
['']

Can this library be used to find out causal relationships between categorical variables? If yes, can you share any notebook or example for the same?

weight matrix in IPW calculation can have inf weights due to division by zero

weight_matrix = probabilities.rdiv(1.0) statement in this file can return inf weights if some entries in "probabilities" series are zero. Maybe there could be some way to ignore the corresponding inf weights while applying weight_matrix later on.
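One common mitigation is to clip the probabilities away from 0 and 1 before taking the reciprocal. The sketch below is a generic plain-Python illustration of that idea, not the library's code; the function name and the clip_min/clip_max parameters are hypothetical names chosen for this example.

```python
import math


def inverse_probability_weights(probabilities, clip_min=None, clip_max=None):
    """Return 1/p for each probability, optionally clipping p first."""
    weights = []
    for p in probabilities:
        if clip_min is not None:
            p = max(p, clip_min)
        if clip_max is not None:
            p = min(p, clip_max)
        # Without clipping, p == 0 yields an infinite weight.
        weights.append(1.0 / p if p > 0 else math.inf)
    return weights


probabilities = [0.0, 0.5, 1.0]
unclipped = inverse_probability_weights(probabilities)
clipped = inverse_probability_weights(probabilities, clip_min=0.05, clip_max=0.95)
```

Clipping trades a small bias for bounded weights. The alternative suggested above, ignoring rows with inf weights, would instead drop those individuals from the weighted average.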

ipw.estimate_population_outcome(Data_X, Data_a, Data_y) has some error.

Hello.
I want to check the causal effect.

My code is

from causallib.estimation import IPW
from sklearn.linear_model import LogisticRegression

learner = LogisticRegression(penalty="l1", C= 1, max_iter=500, solver='liblinear')
ipw = IPW(learner)
ipw.fit(Data_X, Data_a)
potential_outcomes = ipw.estimate_population_outcome(Data_X, Data_a, Data_y) 
causal_effect = ipw.estimate_effect(potential_outcomes[1], potential_outcomes[0])
causal_effect

until ipw.fit(Data_X, Data_a), there is no problem.
But when I apply
potential_outcomes = ipw.estimate_population_outcome(Data_X, Data_a, Data_y)
there is an error "### /root/miniforge3/envs/python_3_9/lib/python3.9/site-packages/numpy/lib/function_base.py:550: RuntimeWarning: invalid value encountered in double_scalars
avg = np.multiply(a, wgt,"

and I check the ipw.compute_weights.
w = ipw.compute_weights(Data_X, Data_a)
This code doesn't have an error, but the values of w are all NaN.
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
..

What is the problem?

IPW computation

Briefly read the codes around IPW, in particular this line:

# weight_matrix = 1.0 / probabilities
weight_matrix = probabilities.rdiv(1.0)

My understanding of IPW is that outcomes are weighted by $w_i = \frac{a_i}{e_i} + \frac{(1 - a_i)}{(1 - e_i)}$. Am I missing something, or is this a bug?
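One possible reading, offered as an assumption rather than a statement about the library's internals: if probabilities holds one column per treatment value, taking the reciprocal of the whole matrix and then selecting, for each row, the column of the treatment actually received gives the same number as the formula above. A plain-Python sketch for a binary treatment, with hypothetical function names:

```python
def weight_by_lookup(a, e):
    """Take the reciprocal of every treatment probability, then pick the
    entry that matches the treatment actually received."""
    probabilities = {0: 1.0 - e, 1: e}  # one "column" per treatment value
    weight_row = {t: 1.0 / p for t, p in probabilities.items()}
    return weight_row[a]


def weight_by_formula(a, e):
    """The textbook expression a/e + (1 - a)/(1 - e)."""
    return a / e + (1 - a) / (1 - e)


# (treatment received, propensity score) pairs chosen for illustration.
samples = [(1, 0.8), (0, 0.8), (1, 0.3), (0, 0.3)]
pairs = [(weight_by_lookup(a, e), weight_by_formula(a, e)) for a, e in samples]
```

Under that assumption the two expressions agree for every sample, so the quoted line alone would not indicate a bug; whether the per-row selection actually happens later in the code is something to confirm in the source.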

Description for each variable

Hello, thank you for providing such valuable materials.

I am writing to ask you a question on ACIC2016 dataset.
I am searching for the description for each variable of covariate data, such as mother's age, baby's head circumference, etc.
Could you let me know where I can find it?

Many thanks in advance.

Error in Example "causal_simulator.ipynb"

Hello and thank you for your amazing work!
I stumbled upon a tiny mistake in one of the examples in causallib/examples/causal_simulator.ipynb

You'll find the following first code cell:

import pandas as pd
from causallib.datasets import load_nhefs
from causallib.simulation import CausalSimulator
from causallib.simulation import generate_random_topology

which does not work, since the simulator is now part of the datasets module.

To restore it, just change the imports accordingly:

import pandas as pd
from causallib.datasets import load_nhefs
from causallib.datasets import CausalSimulator
from causallib.datasets import generate_random_topology

Expected behavior:
import library and run cell

Observed behavior:
ImportError: cannot import name 'CausalSimulator' from 'causallib.simulation' (/opt/conda/lib/python3.10/site-packages/causallib/simulation/__init__.py)

Thanks for your time and efforts.

Fix support for scikit-learn>=1.2.0 and Numpy=1.24.0

Scikit-learn version 1.2.0 enforces two API changes that currently break tests.

1. LinearRegression no longer supports the normalize keyword argument, which some of the tests use.
Theoretically, the fix should be rather simple: replace LinearRegression with a Pipeline object that has a StandardScaler preprocessing step.
2. Scikit-learn now enforces strict column-name restrictions.
First, all column names must be of the same type, and second, column names must match between fit and predict.
This might require a solution of larger breadth.
The first part will require a "safe join" that is aware of column-name types and replaces all the instances where we join the covariates X with the treatment assignment a.
The second part requires validating that column names are consistent/preserved when new data are passed in. This mostly concerns the time-pooled survival models, where a time range is artificially created and placed as a predictor.

A more minor exception was also raised with Numpy v1.24.0: generating calibration plots calls matplotlib's fill_between, which fails with TypeError: ufunc 'isfinite' not supported for the input types.
Need to dig deeper into that and whether that's a causallib problem (providing bad fill values) or some external matplotlib-numpy mismatch.

In the meantime, PR #50 limited the allowed dependency versions.

ImportError: cannot import name 'PropensityEvaluator' from 'causallib.estimation'

On Google Colab, after installing causallib with pip, I try to import the PropensityEvaluator, but it fails.

!pip install causallib
from causallib.estimation import IPW, PropensityEvaluator

Collecting causallib
Downloading causallib-0.7.1-py3-none-any.whl (2.1 MB)
|████████████████████████████████| 2.1 MB 5.1 MB/s
Requirement already satisfied: numpy<2,>=1.13 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.19.5)
Requirement already satisfied: matplotlib<4,>=2.2 in /usr/local/lib/python3.7/dist-packages (from causallib) (3.2.2)
Requirement already satisfied: scipy<2,>=0.19 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.4.1)
Requirement already satisfied: networkx<3,>=1.1 in /usr/local/lib/python3.7/dist-packages (from causallib) (2.6.3)
Requirement already satisfied: statsmodels<1,>=0.8 in /usr/local/lib/python3.7/dist-packages (from causallib) (0.10.2)
Requirement already satisfied: pandas<2,>=0.25.2 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.1.5)
Requirement already satisfied: scikit-learn<2,>=0.20 in /usr/local/lib/python3.7/dist-packages (from causallib) (1.0.1)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (3.0.6)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (2.8.2)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib<4,>=2.2->causallib) (0.11.0)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas<2,>=0.25.2->causallib) (2018.9)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib<4,>=2.2->causallib) (1.15.0)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn<2,>=0.20->causallib) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn<2,>=0.20->causallib) (3.0.0)
Requirement already satisfied: patsy>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from statsmodels<1,>=0.8->causallib) (0.5.2)
Installing collected packages: causallib
Successfully installed causallib-0.7.1

ImportError: cannot import name 'PropensityEvaluator' from 'causallib.estimation' (/usr/local/lib/python3.7/dist-packages/causallib/estimation/__init__.py)

I can't understand why this happens. In my local machine, I can run this snippet without problems.

Google Colab uses Python 3.7.12

Weights_distribution plot sometimes fails due to color specification

In some cases, when running the "weight_distribution" evaluation plot this invokes a matplotlib issue matplotlib/matplotlib#19544.

Following the discussion in that thread it looks like we are passing a single color to the plot and this needs to be encapsulated in a list. Since this issue is closed and not about to be fixed (it is an unsupported undocumented feature that used to work and no longer works) we need to change this here.

PyPi package fails to install and run

Creating a python 3.6.0 virtual environment and running pip install causallib succeeds, but then trying to import any modules fails. E.g. from causallib.estimation import IPW:

...

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/core/frame.py in <module>
     86 from pandas.core.arrays.datetimelike import DatetimeLikeArrayMixin as DatetimeLikeArray
     87 from pandas.core.arrays.sparse import SparseFrameAccessor
---> 88 from pandas.core.generic import NDFrame, _shared_docs
     89 from pandas.core.index import (
     90     Index,

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/core/generic.py in <module>
     69 from pandas.core.ops import _align_method_FRAME
     70 
---> 71 from pandas.io.formats.format import DataFrameFormatter, format_percentiles
     72 from pandas.io.formats.printing import pprint_thing
     73 from pandas.tseries.frequencies import to_offset

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/io/formats/format.py in <module>
     45 from pandas.core.indexes.datetimes import DatetimeIndex
     46 
---> 47 from pandas.io.common import _expand_user, _stringify_path
     48 from pandas.io.formats.printing import adjoin, justify, pprint_thing
     49 

~/.pyenv/versions/3.6.0/envs/venv360/lib/python3.6/site-packages/pandas/io/common.py in <module>
      7 from http.client import HTTPException  # noqa
      8 from io import BytesIO
----> 9 import lzma
     10 import mmap
     11 import os

~/.pyenv/versions/3.6.0/lib/python3.6/lzma.py in <module>
     25 import io
     26 import os
---> 27 from _lzma import *
     28 from _lzma import _encode_filter_properties, _decode_filter_properties
     29 import _compression

ModuleNotFoundError: No module named '_lzma'

It looks like there are versioning issues with pandas.