
biolab / orange3-associate


๐ŸŠ :package: Frequent itemsets and association rules mining for Orange 3.

License: Other

Python 97.00% Cython 3.00%
frequent-itemsets frequent-pattern-mining association-rules market-basket-analysis support confidence orange apriori fpgrowth

orange3-associate's People

Contributors

ajdapretnar, astaric, jakakokosar, janezd, jerneju, kernc, markotoplak, primozgodec, rokgomiscek

orange3-associate's Issues

Widgets need updating

Both widgets need updating due to changes in Orange.
Convert Error and Warning to classes, check QVariant usage, and set up AnyQt. All the warnings appear in the console when using the widgets.

Non-ASCII character

I installed orange3-associate and when I try out the very first example in the tutorial document I get the following error:

File "", line 1, in
File "orangecontrib/associate/__init__.py", line 1, in
from .fpgrowth import
File "orangecontrib/associate/fpgrowth.py", line 26
SyntaxError: Non-ASCII character '\xc2' in file orangecontrib/associate/fpgrowth.py on line 27, but no encoding declared;

Any ideas?

frequent_itemsets error in Oracle Linux 7

Associate version

Orange3-Associate_1.1.5

Orange version

Orange.version.full_version 3.7.0

Expected behavior

The command "x = frequent_itemsets(list)" is supposed to return a generator object. Running list(x) on that generator should return a list of frequent itemsets.

Actual behavior

Instead, running list(x) raises the error shown in the following lines of code:

" if n_items <= _BUCKETING_FEW_ITEMS:
     return None, ((frozenset(itemset), support) for itemset, support in _bucketing_count(db, frequent_items, min_support))

/miniconda3/lib/python3.5/site-packages/orangecontrib/associate/_fpgrowth.cpython-35m-x86_64-linux-gnu.so in set.from_py.__pyx_convert_set_from_py_int (orangecontrib/associate/_fpgrowth.cpp:1399)()
TypeError: an integer is required "

Steps to reproduce the behavior

Running list(frequent_itemsets(list_of_transaction)) raises this error on Oracle Linux 7. However, it works with Python on Windows.
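The thread does not establish the cause, but the traceback fails while converting a Python set to a set of C integers. One thing worth ruling out is that the transactions contain items that are not plain Python integers. A minimal sketch, assuming that is the problem, which maps arbitrary hashable items to consecutive integer ids before mining (`encode_transactions` is a hypothetical helper, not part of the add-on):

```python
def encode_transactions(transactions):
    """Map arbitrary hashable items to consecutive integer ids.

    Hypothetical helper for illustration; returns the encoded
    transactions and a dict for translating ids back to items.
    """
    item_to_id = {}
    encoded = []
    for transaction in transactions:
        row = []
        for item in transaction:
            # setdefault assigns the next free id the first time an item is seen
            row.append(item_to_id.setdefault(item, len(item_to_id)))
        encoded.append(row)
    id_to_item = {i: item for item, i in item_to_id.items()}
    return encoded, id_to_item


encoded, id_to_item = encode_transactions(
    [["beer", "milk"], ["bread", "butter"], ["beer", "bread"]]
)
print(encoded)
print(id_to_item)
```

The encoded list contains only built-in int values, so it can be passed to the mining function and the resulting itemsets translated back through `id_to_item`.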

Additional info (worksheets, data, screenshots, ...)

Widgets do not show on Python 3.10

Associate version

version : 1.1.9

Orange version

version : 3.32.0

Expected behavior

After installing and restarting Orange, the add-on should show up among the widgets.

Actual behavior

Error
As you can see, I have already installed Associate, but it does not show up among the widgets. It also does not show up in the right-click menu.

Basket file is loaded as numeric. How can I make it load as categorical?

I am loading a basket file into orange. This is the content of the file:

beer, milk
bread, butter
bread, butter, jelly
bread=2, butter
beer, bread

If I load this data, Orange loads it as categorical, as it should; this is the data type that allows the use of association rules.

categorical Data

If, however, I change milk to bread, the data is no longer loaded as categorical.

beer, bread
bread, butter
bread, butter, jelly
bread=2, butter
beer, bread

numeric Data

Why is this? How can I load the data as categorical? Bear in mind that my real dataset has more than 100 columns.

Have Associate consider numeric attributes

Association Rules and Frequent Itemsets require discrete attributes to work. But often, transaction files come in continuous format (basket or pivot). Have the two widgets handle continuous data, either explicitly or by transforming them internally and raising a Warning.

Widget association-rules does not appear, Orange 3.7

Associate version

Version 3.7
git revision: e730793

Expected behavior

I use Orange3 Associate in my master's work. In Orange3 version 3.7, the Association Rules widget does not appear. Thank you.

Actual behavior

The association rules widget does not appear.

Steps to reproduce the behavior

Install widget from Orange3.

Additional info (worksheets, data, screenshots, ...)

associate rule in Orange GUI only supports basket file?

Associate version

1.15

Orange version

3.11

Expected behavior

After reading the Orange association rules scripting documentation at http://orange3-associate.readthedocs.io/en/latest/scripting.html , I understand that a 0-1 file is not fully supported, because 0 is counted as an 'existing' item. However, the onehot function transforms 0-1 into False-True, so a Python script gets the correct result.

I tried to apply association rules in the Orange GUI, which is better at displaying results. I saved a CSV file, for example test.csv:
milk, bread, tomato, apple
True, False, True, False
False,True,False, False
...........
but it counts False as an 'existing' item, which differs from the Python script and is confusing. I think this could be improved.

Reading Orange help document

Orange supports four data types (Continuous, Discrete, String, Date), so it cannot recognize a boolean type such as False-True. To get the correct result in the Orange GUI, I have to use a basket file, which is not very convenient.
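For scripting, the intent described above (only True cells count as items) can be expressed directly. A minimal sketch using only the standard library; `rows_to_transactions` and `TRUE_VALUES` are hypothetical names, and the accepted spellings of "present" are an assumption about the file:

```python
import csv
import io

# Assumption: these spellings mark an item as present in a row.
TRUE_VALUES = {"true", "1"}


def rows_to_transactions(csv_text):
    """Turn a True/False table into transactions of present item names."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    transactions = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Keep only the column names whose cell marks the item as present
        transactions.append(
            [name for name, cell in zip(header, row)
             if cell.strip().lower() in TRUE_VALUES]
        )
    return transactions


sample = (
    "milk, bread, tomato, apple\n"
    "True, False, True, False\n"
    "False,True,False, False\n"
)
transactions = rows_to_transactions(sample)
print(transactions)
```

Each transaction now lists only the items that were actually bought, so False values can never be counted as items.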

Can't put slider on 60% on Frequent Itemsets

Associate version

Expected behavior

For my research, I wanted to put the slider for Minimal Support in the Frequent Itemsets widget on 60%.

Actual behavior

It jumps straight from 59% to 61%. I cannot select 60%

Steps to reproduce the behavior

Open the Frequent Itemsets widget and adjust the Minimal Support slider.

Additional info (worksheets, data, screenshots, ...)

image

Association Rules widget: Cannot select Max items=1 and cannot export all rules

Hi Orange team,

I'm using the Association Rules widget and have the following two issues:

  • What's wrong?
  1. I want to find a very simple association rule with 1 antecedent and 1 consequent (i.e. rule length=2). However, the widget does not allow me to set the Min and Max items to 1. Instead, I have to set Min items to 1 and Max items to 2, but this setup creates a lot more unnecessary rules for my simple case.
    image

  2. After running the widget, I want to export and analyze all the association rules. The only way to do it is to export the results as HTML then copy it to Excel. However, the HTML report does not contain all the association rules in it (the "+1279 more" is not clickable).
    image

  • How can we reproduce the problem?
  • My data: market_basket_analysis_retailer.zip

  • My workflow:
    image

  • File widget setup:
    image

  • Association Rules widget setup: see the first screenshot in section "What's wrong?"

  • Which version of Orange are you using? How did you install Orange?

  • Version: 3.27.1

  • Installed by Anaconda: conda install -c conda-forge orange3=3.27.1

  • What is your operating system?

  • Windows 10 Education 64-bit (10.0, Build 18363)

Thank you very much and best regards,
Minh PHAN
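As a possible workaround outside the widget, rules with exactly one antecedent and one consequent can be computed by counting item pairs directly and written to CSV in full. This is a brute-force sketch under that assumption, not the algorithm the widget uses; `pair_rules` and `rules_to_csv` are hypothetical helpers:

```python
import csv
import io
from collections import Counter
from itertools import permutations


def pair_rules(transactions, min_support=0.0, min_confidence=0.0):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = set(transaction)  # ignore duplicates within one transaction
        item_counts.update(items)
        # Ordered pairs: (a, b) stands for the rule a -> b
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


def rules_to_csv(rules):
    """Serialize all rules to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["antecedent", "consequent", "support", "confidence"])
    writer.writerows(rules)
    return buffer.getvalue()


baskets = [["bread", "butter"], ["bread", "butter", "jelly"], ["beer", "bread"]]
rules = pair_rules(baskets, min_support=0.5)
print(rules_to_csv(rules))
```

Counting pairs is quadratic in the number of items per transaction, so this only suits the simple length-2 case described above.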

No help/documentation on Associate widgets

Associate version

1.1.9

Orange version

3.32.0

Expected behavior

When clicking on the "?" (Show widget help) in the bottom left corner of a widget belonging to the Associate add-on, a "Help" window appears with an explanation on how to use the widget

Actual behavior

When clicking on the "?" (Show widget help) in the bottom left corner of any widget belonging to the Associate add-on, a message "No help found - There is no documentation for this widget." appears. Considering the version number of the add-on being >= 1, this is strange.

Steps to reproduce the behavior

Click on the "?" (Show widget help) in the bottom left corner of one of the two widgets belonging to the Associate add-on.

Worthless Erroneous Results from uploaded CSV file

Associate version

1.3.0

Orange version

3.36.2

Expected behavior

Frequent Itemsets and Association Rules Analyses

Actual behavior

worthless erroneous results

Steps to reproduce the behavior

  1. Load data into CSV File Import
  2. Connect to Frequent Itemsets and try analysis
  3. Connect to Association Rules and try analysis

Additional info (worksheets, data, screenshots, ...)

Orange Schema
image
Frequent Itemsets
image
Association Rules
image
Data Table
image
CSV Data Pivot like Market Basket dataset
pivot.csv
CSV Data like Foodmart 2000 dataset
dataset.csv

OWAssociate: refactor

Associate widget needs refactoring.

  • split the widget part from the model and the selection (see Orange widgets for reference)
  • check if scatter plot is used anywhere, if not, remove references
  • organize code in logical order (set_data, then commit, then send_report)
  • write tests

can't find associate module

I have installed the orange3-associate.

/Python/Orange/orange3env/lib/python3.4/site-packages/orangecontrib/associate$ ls
_fpgrowth.cpython-34m.so fpgrowth.py _fpgrowth.pyx __init__.py __pycache__ widgets

import Orange
data = Orange.data.Table("basket.tab")
rules = Orange.associate.AssociationRulesSparseInducer(data, support=0.3)
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'module' object has no attribute 'associate'
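The directory listing above shows the add-on installed under orangecontrib/associate, not under the Orange package, which would explain why Orange.associate is not found. A minimal sketch of importing from that location instead, assuming the fpgrowth module exposes frequent_itemsets as the scripting documentation describes (`load_associate` is a hypothetical helper):

```python
import importlib


def load_associate():
    """Return the add-on's fpgrowth module, or None if it cannot be imported."""
    try:
        return importlib.import_module("orangecontrib.associate.fpgrowth")
    except ImportError:
        return None


fpgrowth = load_associate()
if fpgrowth is not None:
    # Transactions are lists of integer item ids; 2 is an absolute minimum support
    transactions = [[1, 2], [1, 2, 3], [2, 3]]
    itemsets = dict(fpgrowth.frequent_itemsets(transactions, 2))
    print(itemsets)
else:
    print("orange3-associate is not importable in this environment")
```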

Feature Request

Hi.

In the orange3-associate python module, can we have a max-length parameter for frequent_itemsets function which will limit the length of itemsets generated?

Example
{1,2} is an itemset of length 2
{1,2,3} is an itemset of length 3

If I can limit the length of an itemset so that itemsets above a threshold aren't even generated, it will be really awesome. Currently, I am working on big data and from what I see, itemsets of all lengths are generated, which takes a large chunk of memory and time if support is less.
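Until such a parameter exists, the output can at least be filtered lazily by length. A minimal sketch, assuming the generator yields (itemset, support) pairs; `limit_itemset_length` is a hypothetical helper. It only discards long itemsets after they are generated, so it reduces what is kept in memory but not the mining time the request is really about:

```python
def limit_itemset_length(itemsets, max_length):
    """Lazily yield only the (itemset, support) pairs within max_length."""
    for itemset, support in itemsets:
        if len(itemset) <= max_length:
            yield itemset, support


# Stand-in for the output of a frequent itemset generator
found = [
    (frozenset({1}), 5),
    (frozenset({1, 2}), 3),
    (frozenset({1, 2, 3}), 2),
]
short = list(limit_itemset_length(found, 2))
print(short)
```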

[FR] Save the tree

A function which returns the fp_tree would be very nice (to print it out).

owassociate: filtering crashes

Associate version

master

Orange version

master

Expected behavior

I get no exceptions if I filter rules by string.

Actual behavior

When I try to filter by a string not present in the dataset, I get

Traceback (most recent call last):
  File "/home/marko/dev/orange-widget-base/orangewidget/gui.py", line 1709, in do_commit
    commit()
  File "/home/marko/dev/orange3-associate/orangecontrib/associate/widgets/owassociate.py", line 395, in find_rules
    self.table_rules = proxy_model.get_data()
  File "/home/marko/dev/orange3-associate/orangecontrib/associate/widgets/owassociate.py", line 270, in get_data
    table = Table.from_numpy(domain, X=data[:, :len(numeric)].astype(float),
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Steps to reproduce the behavior

Run the widget with zoo and try to filter by "marko".

Additional info (worksheets, data, screenshots, ...)

Error when initializing the application (non blocking)

What's wrong?
Upon starting the application with python -m Orange.canvas, the following error is shown:

2023-02-10 10:09:41,193:ERROR:orangecanvas.registry.discovery: An ImportError occurred while loading entry point 'Associate = orangecontrib.associate.widgets'
Traceback (most recent call last):
  File "/home/jorge/.local/lib/python3.10/site-packages/orangecanvas/registry/discovery.py", line 122, in run
    point = entry_point.resolve()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2471, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/jorge/.local/lib/python3.10/site-packages/orangecontrib/associate/__init__.py", line 1, in <module>
    from .fpgrowth import \
  File "/home/jorge/.local/lib/python3.10/site-packages/orangecontrib/associate/fpgrowth.py", line 152, in <module>
    from collections import defaultdict, Iterator
ImportError: cannot import name 'Iterator' from 'collections' (/usr/lib/python3.10/collections/__init__.py)

It happens every time the application starts. Note that the application still runs fine and works; this error is only logged in the console.

How can we reproduce the problem?
Simply opening the application with the python command triggers this error.

What's your environment?

  • Operating system: Ubuntu 22.04.1 LTS, jammy
  • Orange version: 3.34.1
  • How you installed Orange: using pip (pip install orange3)
  • I followed the instructions in the download section of your website. Also installed the packages PyQt5 PyQtWebEngine
  • Python version: Python 3.10.6
  • pip version: pip 23.0
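The traceback points at `from collections import defaultdict, Iterator`. The abstract base classes have lived in collections.abc since Python 3.3, and the old aliases in collections were removed in Python 3.10, which matches the environment listed above. A minimal sketch of the compatible import; whether a later release of the add-on already does this is not stated in this thread:

```python
from collections import defaultdict
from collections.abc import Iterator  # available since Python 3.3

# defaultdict is unaffected by the change and still imports from collections
counts = defaultdict(int)
counts["bread"] += 1

# Iterator behaves the same as the removed alias did
print(isinstance(iter([1, 2, 3]), Iterator))
print(isinstance([1, 2, 3], Iterator))
```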
