ymoch / apyori

A simple implementation of the Apriori algorithm in Python.

License: MIT License

apyori's Introduction

Apyori

Apyori is a simple implementation of the Apriori algorithm for Python 2.7 and 3.3 - 3.5, provided as an API and as a command-line interface.

Module Features

Consists of only one file and depends on no other libraries, so you can use it portably.
Can be used as an API.

Application Features

Supports a JSON output format.
Supports a TSV output format for 2-item relations.

Installation

Choose one of the following.

Install with pip: pip install apyori.
Put apyori.py into your project.
Run python setup.py install.

API Usage

Here is a basic example:

from apyori import apriori

transactions = [
    ['beer', 'nuts'],
    ['beer', 'cheese'],
]
results = list(apriori(transactions))

For more details, see apyori.apriori pydoc.

CLI Usage

First, prepare input data as tab-separated transactions.

Each item is separated with a tab.
Each transaction is separated by a line feed.
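
A minimal sketch of producing such input from Python, assuming the transactions are already in memory as lists of strings. The helper name to_tsv is made up for illustration and is not part of apyori.

```python
import csv
import io


def to_tsv(transactions):
    """Serialize transactions: one per line, items separated by tabs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
    for items in transactions:
        writer.writerow(items)
    return buffer.getvalue()


transactions = [
    ['beer', 'nuts'],
    ['beer', 'cheese'],
]
tsv_text = to_tsv(transactions)
print(tsv_text, end='')
```

Writing tsv_text to a file gives input that can be fed to the application on standard input or by path.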

Second, run the application. Input data is read from standard input or from file paths.

Run with the python apyori.py command.
If installed, you can also run it with the apyori-run command.

For more details, use the '-h' option.

Samples

Basic usage

apyori-run < data/integration_test_input_1.tsv

Use TSV output

apyori-run -f tsv < data/integration_test_input_1.tsv

The output fields are, in order:

Base item.
Appended item.
Support.
Confidence.
Lift.
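
Assuming each TSV output line carries exactly these five tab-separated fields in this order, a line can be read back into typed values as sketched below. The Rule tuple, parse_rule_line, and the sample line are hand-written for illustration; they are not taken from apyori or its test data.

```python
from collections import namedtuple

# Field order follows the list above: base item, appended item,
# support, confidence, lift.
Rule = namedtuple('Rule', ['base', 'added', 'support', 'confidence', 'lift'])


def parse_rule_line(line):
    """Split one TSV output line into typed fields."""
    base, added, support, confidence, lift = line.rstrip('\n').split('\t')
    return Rule(base, added, float(support), float(confidence), float(lift))


# Hand-written sample line, not real program output.
rule = parse_rule_line('beer\tnuts\t0.5\t0.5\t1.0\n')
print(rule.base, '->', rule.added, rule.confidence)
```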

Specify the minimum support

apyori-run -s 0.5 < data/integration_test_input_1.tsv

Specify the minimum confidence

apyori-run -c 0.5 < data/integration_test_input_1.tsv

apyori's Issues

Question about the data

Hello, if the same item appears more than once within a single order in my data, will this package count it only once?
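
One way to make the result independent of how the package treats repeats is to deduplicate each order before passing it in. A minimal sketch; dedupe is a hypothetical helper, not something apyori provides.

```python
def dedupe(transactions):
    """Keep only the first occurrence of each item within every transaction."""
    result = []
    for items in transactions:
        seen = set()
        unique = []
        for item in items:
            if item not in seen:
                seen.add(item)
                unique.append(item)
        result.append(unique)
    return result


orders = [
    ['beer', 'beer', 'nuts'],
    ['cheese', 'beer', 'cheese'],
]
unique_orders = dedupe(orders)
print(unique_orders)
```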

Rules that appear more than once

Hey there,

I'm trying to understand the output of the Apyori algorithm. When I'm running the algorithm, it appears that the same rules are appearing more than once, with different Support-Confidence & Lift.

Is it normal, and if it is, what does it mean exactly?

Rules with the same products in inverse order

Hi!

Thank you for the code, it looks very good!

My question is:
Is it normal for rules with the same products to appear in inverse order?

For example:

Base_Item | Append_Item | Support | Confidence | Lift
Rice | Egg | 0.059533 | 0.911369 | 7.44109
Egg | Rice | 0.059533 | 0.486073 | 7.44109

And if it is normal, why does this happen, and why is the confidence different?

Thanks!!!!
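
The two rows describe the same item pair read in opposite directions. Under the standard definitions, support and lift are symmetric in the two items, while confidence divides by the support of the base item and therefore depends on direction. A small self-contained check on made-up transactions (the data and helper names are illustrative, not apyori's code):

```python
# Toy data: 'rice' is rare, 'egg' is common.
transactions = [
    {'rice', 'egg'},
    {'egg'},
    {'egg', 'milk'},
    {'milk'},
]


def support(items):
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(base, added):
    """support(base and added together) divided by support(base)."""
    return support(base | added) / support(base)


def lift(base, added):
    """confidence(base -> added) divided by support(added)."""
    return confidence(base, added) / support(added)


rice, egg = {'rice'}, {'egg'}
print(confidence(rice, egg), confidence(egg, rice))  # differ by direction
print(lift(rice, egg), lift(egg, rice))              # same either way
```

Every rice basket here also has egg, but most egg baskets have no rice, so the confidence is high in one direction and low in the other.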

Tuning parameters (min_support, min_conf, min_lift) in other datasets

Thanks for the great library! When I was trying to adopt this method on other datasets, such as the census_income from UCI (https://archive.ics.uci.edu/ml/machine-learning-databases/adult/), I found it difficult to set the parameters to get feasible rules. Do you have any advice on this?

Thanks!

Relevance of the rules

Hi,

Can I please know if the rules returned from the algorithm are sorted by any measure? Or if there's a method to retrieve the top-n most relevant rules?

Rules with Multiple-item Consequents

It seems like items_add contains only one item, e.g. (1,2,5) --> (3,4) will be ignored even if this rule satisfies both the minsup and minconf requirements.
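
For reference, the candidate rules from one frequent itemset with consequents of any size can be enumerated by taking every non-empty proper subset as the antecedent. A minimal sketch, independent of apyori's internals; all_splits is a hypothetical helper.

```python
from itertools import combinations


def all_splits(itemset):
    """Yield (antecedent, consequent) for every non-empty proper split."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            consequent = tuple(i for i in items if i not in antecedent)
            yield antecedent, consequent


splits = list(all_splits({1, 2, 3, 4, 5}))
print(len(splits))
print(((1, 2, 5), (3, 4)) in splits)
```

Each candidate would still have to be checked against the support and confidence thresholds.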

What is the RelationRecord output format?

Hello, this repository is great, but I cannot understand the output format at all. ordered_statistics....... what does it mean?
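
Going by the records printed elsewhere in these issues, each RelationRecord holds an itemset (items), that itemset's support, and a list ordered_statistics with one entry per rule derived from it (items_base -> items_add, with confidence and lift). The sketch below uses namedtuple stand-ins with those field names so that it runs without apyori; describe is a hypothetical helper and the sample record is hand-made.

```python
from collections import namedtuple

# Stand-ins mirroring the field names seen in apyori's printed output.
RelationRecord = namedtuple(
    'RelationRecord', ['items', 'support', 'ordered_statistics'])
OrderedStatistic = namedtuple(
    'OrderedStatistic', ['items_base', 'items_add', 'confidence', 'lift'])


def describe(record):
    """Return one readable line per rule contained in a record."""
    lines = []
    for stat in record.ordered_statistics:
        lines.append(
            '{} -> {} (support={:.3f}, confidence={:.3f}, lift={:.3f})'.format(
                sorted(stat.items_base), sorted(stat.items_add),
                record.support, stat.confidence, stat.lift))
    return lines


# Hand-made sample record, not real apyori output.
record = RelationRecord(
    items=frozenset({'beer', 'nuts'}),
    support=0.5,
    ordered_statistics=[
        OrderedStatistic(frozenset({'nuts'}), frozenset({'beer'}), 1.0, 1.0),
    ],
)
described = describe(record)
print(described[0])
```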

Rules for minimum length of items

From earlier tutorials of apyori that are available online, it looks like there was a rule for a minimum length of items that should be considered. It would be great to have this feature back in place.

My understanding is that would also reduce the computational burden when the ultimate goal is to find the transaction patterns for a minimum number of items.

antecedent and antecedant

If you run

    from mlxtend.frequent_patterns import apriori
    from mlxtend.frequent_patterns import association_rules
    dataframe = apriori(dataframe, min_support=min_support, use_colnames=True)
    dataframe = association_rules(dataframe, metric="lift", min_threshold=min_threshold)
    dataframe['antecedents_length'] = dataframe['antecedents'].str.len()
    dataframe['consequents_length'] = dataframe['consequents'].str.len()

In PyCharm SciView the column name is antecedEnt, while in Jupyter Notebook it is antecedAnt. Is the column name platform-specific? Obviously this breaks code, since the E and A vary.
I ran exactly the same code on the same dataframe and got different column names.

Same Antecedent but different support result

Hi everyone,

I have a strange output where the same antecedent with different consequent has different support values. Isn't support only calculated with antecedent frequency on the dataset?

This is my output:

`RelationRecord(items=frozenset({'1 seg_actrab', '0 seg_auto'}), support=0.02740837419309323, ordered_statistics=[OrderedStatistic(items_base=frozenset({'1 seg_actrab'}), items_add=frozenset({'0 seg_auto'}), confidence=0.5803921568627451, lift=3.41041294536262)])

RelationRecord(items=frozenset({'1 seg_actrab', '1 seg_outros'}), support=0.012584218138205934, ordered_statistics=[OrderedStatistic(items_base=frozenset({'1 seg_actrab'}), items_add=frozenset({'1 seg_outros'}), confidence=0.2664799253034547, lift=3.2735085700043927)])`
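
In the records above, support sits on the RelationRecord and belongs to the full itemset in items (antecedent and consequent together), which is why it differs between the two rules. Under the usual definition confidence = support(items) / support(items_base), the antecedent's own support can be recovered by division, and it comes out the same for both records:

```python
# (support, confidence) pairs copied from the two records above.
rule_auto = (0.02740837419309323, 0.5803921568627451)
rule_outros = (0.012584218138205934, 0.2664799253034547)


def antecedent_support(itemset_support, confidence):
    """Invert confidence = support(items) / support(items_base)."""
    return itemset_support / confidence


base_from_auto = antecedent_support(*rule_auto)
base_from_outros = antecedent_support(*rule_outros)
print(base_from_auto, base_from_outros)
```

Both values land at roughly 0.0472, the share of transactions containing '1 seg_actrab'.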

Upload to anaconda

Please upload the package to anaconda as well.

Sort the rules found by confidence

Hi
After finding the rules with a minimum support and a minimum confidence, I would like to sort them by confidence. Is that possible?

This is the code that I wrote:
rules = list(apriori(dataset, min_support=0.003, min_confidence=0.3, min_lift=1.01))
Ty
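
A sketch of one way to do this: flatten each record's ordered_statistics into individual rules, then sort. The namedtuples below are stand-ins (field names taken from apyori's printed output) so that the snippet runs on its own; with the real package, the rules list from the code above would take the place of records. The sample numbers are illustrative only.

```python
from collections import namedtuple

# Stand-ins for apyori's result types.
RelationRecord = namedtuple(
    'RelationRecord', ['items', 'support', 'ordered_statistics'])
OrderedStatistic = namedtuple(
    'OrderedStatistic', ['items_base', 'items_add', 'confidence', 'lift'])

# Hand-made sample records.
records = [
    RelationRecord(frozenset({'a', 'b'}), 0.4, [
        OrderedStatistic(frozenset({'a'}), frozenset({'b'}), 0.5, 1.0),
        OrderedStatistic(frozenset({'b'}), frozenset({'a'}), 0.8, 1.0),
    ]),
    RelationRecord(frozenset({'c', 'd'}), 0.3, [
        OrderedStatistic(frozenset({'c'}), frozenset({'d'}), 0.6, 1.5),
    ]),
]

# One (support, statistic) pair per rule, highest confidence first.
flat = [(rec.support, stat)
        for rec in records
        for stat in rec.ordered_statistics]
by_confidence = sorted(flat, key=lambda pair: pair[1].confidence, reverse=True)

for rule_support, stat in by_confidence:
    print(sorted(stat.items_base), '->', sorted(stat.items_add), stat.confidence)
```

Slicing by_confidence[:n] then gives the n rules with the highest confidence; swapping the key to pair[1].lift sorts by lift instead.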

How to use apyori package on external data on ipython notebook?

Hi,
I am looking to use the apyori package to do some association rule mining on the attached data set. Can you please tell me how to use this package to load the data and work on an ipython notebook? The instructions provided are for command line interface and I am unable to load the data itself.

Can you tell me where I am going wrong?

Regards,
Jash
data_demo.zip
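
The API takes an in-memory list of transactions, so the CLI's TSV format is not needed in a notebook. Assuming the data is a CSV file with one transaction per row (the actual layout of data_demo.zip is not shown here, so this is a guess), a loader might look like the sketch below; load_transactions is a hypothetical helper.

```python
import csv
import io


def load_transactions(lines):
    """Read one transaction per CSV row, dropping blank cells and rows."""
    transactions = []
    for row in csv.reader(lines):
        items = [cell.strip() for cell in row if cell.strip()]
        if items:
            transactions.append(items)
    return transactions


# Stand-in for an opened file, e.g. open('your_file.csv', newline='').
sample_file = io.StringIO('beer,nuts\nbeer,cheese,\n\n')
transactions = load_transactions(sample_file)
print(transactions)
```

The resulting list can then be passed to apriori as in the basic example: results = list(apriori(transactions)).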