bcebere / elastic-surv

Survival analysis for Big Data

License: BSD 3-Clause "New" or "Revised" License

elastic-surv

Survival analysis on Big Data

elastic-surv is a library for training risk estimation models on ElasticSearch backends. Potential use cases include user churn prediction and survival probability estimation.

  • 🔑 Survival models include CoxPH, DeepHit or LogisticHazard (pycox).
  • 🔥 ElasticSearch support using eland.
  • 🌀 Automatic model selection using HyperBand.
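
To give a feel for what HyperBand-style model selection involves, here is a minimal sketch of the standard HyperBand bracket schedule: how many candidate configurations are tried at each stage, and how much training budget each one gets. It is not taken from elastic-surv; the function name `hyperband_schedule` and the choice of "resource" (for example, training epochs) are assumptions made for illustration.

```python
def hyperband_schedule(max_resource, eta=3):
    """Return the HyperBand brackets as lists of (n_configs, resource) rungs.

    Hypothetical helper: assumes max_resource is an integer budget
    (e.g. epochs) and eta is an integer reduction factor.
    """
    # Largest s with eta**s <= max_resource (integers avoid float log issues).
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta**s / (s + 1)).
        numerator = (s_max + 1) * eta ** s
        n = -(-numerator // (s + 1))
        rungs = []
        for i in range(s + 1):
            n_i = n // eta ** i                        # configurations kept at rung i
            r_i = max_resource * eta ** i // eta ** s  # budget per configuration
            rungs.append((n_i, r_i))
        brackets.append(rungs)
    return brackets


schedule = hyperband_schedule(max_resource=81, eta=3)
for rungs in schedule:
    print(rungs)
```

Each bracket starts with many configurations on a small budget and keeps roughly the best 1/eta of them at every rung, so most of the compute goes to the candidates that look promising early.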

Problem formulation

Risk estimation tasks require:

  • A set of covariates/features (X).
  • An outcome/event column (Y): 0 means right censoring, 1 means that the event occurred.
  • A time-to-event column (T): the duration until the event or the censoring occurred.

The output of a risk estimation task is a survival function: for N time horizons, it gives the probability of "survival" (the event not occurring) at each horizon.
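
As an illustration of this formulation, the sketch below estimates a survival function from toy (T, Y) data with the Kaplan-Meier product-limit estimator. This is a classical non-parametric baseline that ignores the covariates X, not one of the models shipped with elastic-surv; the function name `kaplan_meier` and the toy values are assumptions chosen for the example.

```python
def kaplan_meier(durations, events, horizons):
    """Estimate the probability of surviving past each horizon.

    durations: time to event or censoring (T)
    events:    1 if the event occurred, 0 if right-censored (Y)
    """
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in zip(durations, events) if e == 1})

    survival = []
    for horizon in horizons:
        prob = 1.0
        for t in event_times:
            if t > horizon:
                break
            # Subjects still under observation just before time t.
            at_risk = sum(1 for d in durations if d >= t)
            # Subjects whose event happened exactly at time t.
            observed = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
            prob *= 1.0 - observed / at_risk
        survival.append(prob)
    return survival


# Toy data: six subjects, two of them right-censored (Y = 0).
T = [2, 3, 3, 5, 8, 8]
Y = [1, 0, 1, 1, 0, 1]

curve = kaplan_meier(T, Y, horizons=[1, 3, 6])
print(curve)
```

For this toy data the estimated survival probabilities are roughly 1.0, 0.67 and 0.44 at horizons 1, 3 and 6: the curve can only stay flat or decrease as the horizon grows.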

Installation

For configuring the ELK stack, please follow the instructions here.

The library can be installed using

$ pip install .

Sample Usage

For each ElasticSearch data backend, we need to specify:

  • the es_index_pattern and the es_client for the ES connection.
  • which keys in the ES index stand for the time-to-event and outcome data.
  • optional: which features to include from the index.

from elastic_surv.dataset import ESDataset
from elastic_surv.models import CoxPHModel

# Describe the ES index and which fields hold the duration and the outcome
dataset = ESDataset(
    es_index_pattern='churn-prediction',
    time_column='months_active',
    event_column='churned',
    es_client='localhost',
)

# Size the model input from the dataset's features
model = CoxPHModel(in_features=dataset.features())

model.train(dataset)
model.score(dataset)

For this example, we use a local ES index, churn-prediction. It can be generated using the following snippet:

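from pysurvival.datasets import Dataset
import eland as ed

# Load the churn dataset bundled with pysurvival as a pandas DataFrame
raw_dataset = Dataset('churn').load()

# Upload the DataFrame to the local ES index, replacing any existing copy
ed.pandas_to_eland(
    raw_dataset,
    es_client='localhost',
    es_dest_index='churn-prediction',
    es_if_exists='replace',
    es_dropna=True,
    es_refresh=True,
)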

Tutorials

Tests

Install the testing dependencies using

pip install .[testing]

The tests can be executed using

pytest -vsx
