WellFactor

WellFactor is a Python implementation of Non-negative Matrix Factorization (NMF) algorithms, specifically designed for handling incomplete data. It's a core component used within the kpd-gatech-collaboration repository and offers a flexible interface for applying NMF to large datasets.

Installation

You can install WellFactor using pip:

pip install git+https://github.com/skywalker5/wellfactor.git

Uninstallation

If you wish to uninstall WellFactor, you can do so with pip:

pip uninstall wellfactor

Note that pip uninstall removes WellFactor's own files but leaves any dependencies that were installed alongside it in your environment.

Key Features

WellFactor provides two key classes: NMF and PartialObservationNMF.

  • NMF: Runs Non-negative Matrix Factorization using the Alternating Nonnegative Least Squares (ANLS) approach, under the assumption that the data matrix is fully observed.

  • PartialObservationNMF: Runs NMF when parts of the data matrix are not observed. This is particularly useful for handling incomplete data or missing values.

Depending on the observability of your data matrix, you can choose the appropriate method that suits your needs.
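
For readers unfamiliar with ANLS, the alternating scheme can be sketched as follows. This is a minimal illustration built on scipy.optimize.nnls, not WellFactor's implementation: the function name anls_nmf_sketch and its parameters are hypothetical, and the sketch assumes the X ≈ WH^T convention in which W is features by factors and H is users by factors.

```python
import numpy as np
from scipy.optimize import nnls


def anls_nmf_sketch(X, k, n_iter=20, seed=0):
    """Illustrative ANLS loop (hypothetical helper, not WellFactor's code)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))  # random nonnegative starting point
    H = np.zeros((n, k))
    for _ in range(n_iter):
        # Fix W: each column X[:, j] gives one NNLS problem for row H[j].
        for j in range(n):
            H[j], _ = nnls(W, X[:, j])
        # Fix H: each row X[i, :] gives one NNLS problem for row W[i].
        for i in range(m):
            W[i], _ = nnls(H, X[i, :])
    return W, H


X = np.random.default_rng(1).random((6, 8))
W, H = anls_nmf_sketch(X, k=2)
print(W.shape, H.shape)  # (6, 2) (8, 2)
```

Solving one small NNLS problem per row or column keeps the sketch short; a practical implementation would typically solve these subproblems in batches.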

Data Input and Output

WellFactor is designed to work with data in the form of a matrix X, where each column represents a user and each row represents a feature of those users. For instance, X could be a TF-IDF matrix of users' activity.
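
As a rough illustration of how such a matrix might be built, the sketch below turns raw activity counts into a features-by-users TF-IDF matrix. The counts and the particular TF-IDF variant (term frequency normalized per user, idf = log(N / df)) are assumptions made for this example; WellFactor does not prescribe how X is constructed.

```python
import numpy as np

# Hypothetical raw counts: rows are features, columns are users.
counts = np.array([
    [3, 0, 1, 2],
    [0, 4, 1, 0],
    [1, 1, 0, 5],
], dtype=float)

n_users = counts.shape[1]

# Term frequency: normalize each user's column so it sums to 1.
tf = counts / counts.sum(axis=0, keepdims=True)

# Inverse document frequency: features seen for fewer users get more weight.
df = (counts > 0).sum(axis=1)
idf = np.log(n_users / df)

# Features x users, nonnegative, in the layout described above.
X = tf * idf[:, None]
print(X.shape)  # (3, 4)
```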

Imagine a scenario where we have 5 users and their activity is represented in terms of 3 features:

            User 1   User 2   User 3   User 4   User 5
Feature 1   0.1      0.8      0.3      0.5      0.2
Feature 2   0.6      0.0      0.7      0.1      0.6
Feature 3   0.4      0.2      0.0      0.4      0.2

This matrix X is then used as input to the NMF algorithms, which produce two matrices W and H that satisfy the matrix factorization equation X ≈ WH^T.

Matrix W represents cluster centers in the feature space. Its columns can be viewed as the 'basis vectors' that generate the feature representation of each user, and together they give a data-driven clustering of the features. For example, if we factorize X into 2 clusters, we might have:

            Cluster 1   Cluster 2
Feature 1   0.7         0.3
Feature 2   0.2         0.6
Feature 3   0.1         0.1

Matrix H represents how strongly each user belongs to each of the clusters discovered in W. Each row holds the weights of the 'basis vectors' for one user, which gives a latent factor model of the users. It might look like:

         Cluster 1   Cluster 2
User 1   0.2         0.8
User 2   0.5         0.5
User 3   0.3         0.7
User 4   0.7         0.3
User 5   0.1         0.9

The primary outputs of WellFactor are these matrices, W and H. Each row of H can serve as a profile of the corresponding user, such as a patient profile, for use in downstream models. This use is discussed more extensively in the kpd-gatech-collaboration repository.
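
To make the shapes concrete, the sketch below enters the example X, W and H from the tables above as NumPy arrays, forms the product WH^T, and reads a dominant cluster for each user off H. The table values are illustrative rather than the output of a fitted model, so the printed reconstruction error is not expected to be small. Using argmax as a profile summary is just one assumed downstream use.

```python
import numpy as np

# Example matrices copied from the tables above.
X = np.array([[0.1, 0.8, 0.3, 0.5, 0.2],
              [0.6, 0.0, 0.7, 0.1, 0.6],
              [0.4, 0.2, 0.0, 0.4, 0.2]])   # features x users
W = np.array([[0.7, 0.3],
              [0.2, 0.6],
              [0.1, 0.1]])                  # features x clusters
H = np.array([[0.2, 0.8],
              [0.5, 0.5],
              [0.3, 0.7],
              [0.7, 0.3],
              [0.1, 0.9]])                  # users x clusters

X_hat = W @ H.T                             # same shape as X
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(X_hat.shape, round(rel_err, 3))

# Row i of H is user i's profile; argmax picks the heaviest cluster
# (ties resolve to the lower index, as for User 2 here).
dominant = H.argmax(axis=1)
print(dominant)  # [1 0 1 0 1]
```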

Usage Examples

Here are some simple examples of how to use the NMF and PartialObservationNMF classes:

Using the NMF Class

from wellfactor.nmf import NMF
import numpy as np

# Initialize a random matrix
X = np.random.random((100,200))

# Set the number of factors
num_factors = 10

# Run the algorithm
model = NMF()
W, H = model.run(X, num_factors, verbose=2)

In this example, we create a random 100 by 200 matrix X, set the number of factors to 10, and run the NMF class on X to obtain W and H.

Using the PartialObservationNMF Class

from wellfactor.nmf_partial_observation import PartialObservationNMF
import numpy as np

# Initialize a random matrix
X = np.random.random((100,200))

# Zero out columns 10-49 in every row from index 40 onward
# to simulate partial observability
fully_observed_feature_num = 40
X[fully_observed_feature_num:, 10:50] = 0

# Run the algorithm
model = PartialObservationNMF()
W, H, _ = model.run(X, 30, fully_observed_feature_num=fully_observed_feature_num, observed_idx=[0,3], verbose=2)

In this example, we simulate partial observability by zeroing columns 10 through 49 in every row from index 40 onward, so that only the first 40 features are observed for those columns. We then run the PartialObservationNMF class on X with 30 factors.
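
The idea of fitting only the observed entries can also be sketched with an explicit boolean mask. The example below uses masked multiplicative updates, a standard weighted-NMF technique chosen here for brevity. It is not WellFactor's algorithm, and masked_nmf_sketch, masked_error, their arguments, and the mask M are hypothetical names introduced for illustration.

```python
import numpy as np


def masked_nmf_sketch(X, M, k, n_iter=200, seed=0, eps=1e-9):
    """Fit X ~ W @ H.T on entries where M is True (illustrative only)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((n, k))
    M = M.astype(float)
    MX = M * X  # unobserved entries contribute nothing
    for _ in range(n_iter):
        # Multiplicative update for W using only observed entries.
        W *= (MX @ H) / ((M * (W @ H.T)) @ H + eps)
        # Same update for H, with the roles of rows and columns swapped.
        H *= (MX.T @ W) / ((M * (W @ H.T)).T @ W + eps)
    return W, H


def masked_error(X, M, W, H):
    """Frobenius error restricted to observed entries."""
    return np.linalg.norm(M * (X - W @ H.T))


rng = np.random.default_rng(1)
X = rng.random((20, 30))
M = np.ones_like(X, dtype=bool)
M[8:, 5:15] = False  # rows 8 onward are unobserved for columns 5-14

W, H = masked_nmf_sketch(X, M, k=3)
print(masked_error(X, M, W, H))
```

Because the mask multiplies both the data and the reconstruction, the values stored in unobserved cells of X never influence W or H, which is the difference from treating zeros as real observations.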

More detailed examples can be found in the examples directory.
