neuroimaginador / fcar

Tools for Formal Concept Analysis

Home Page: https://neuroimaginador.github.io/fcaR

License: GNU General Public License v3.0

R 73.23% C++ 26.77%
implications formal-concept-analysis concept-extraction logic recommendation-engine formal-contexts concept-lattice fuzzy-sets arules

fcaR: Tools for Formal Concept Analysis

The aim of this package is to provide tools to perform fuzzy formal concept analysis (FCA) from within R. It provides functions to load and save a Formal Context, extract its concept lattice and implications. In addition, one can use the implications to compute semantic closures of fuzzy sets and, thus, build recommendation systems.

Details

The fcaR package provides data structures which allow the user to work seamlessly with formal contexts and sets of implications. More explicitly, three main classes are implemented, using the object-oriented-programming paradigm in R:

  • FormalContext encapsulates the definition of a formal context (G, M, I), where G is the set of objects, M the set of attributes and I the (fuzzy) relationship matrix, and provides methods to operate on the context using FCA tools.
  • ImplicationSet represents a set of implications over a specific formal context.
  • ConceptLattice represents the set of concepts and their relationships, including methods to operate on the lattice.

Two additional helper classes are implemented:

  • Set is a class solely used for visualization purposes, since it encapsulates in sparse format a (fuzzy) set.
  • Concept encapsulates internally both extent and intent of a formal concept as Set.

Since fcaR is an extension of the data model in the arules package, most of the implemented methods and classes interoperate with the main S4 classes in arules (transactions and rules).

Installation

This package is available on CRAN, so its stable version can be installed using:

install.packages("fcaR")

The development version of this package can be installed with

remotes::install_github("neuroimaginador/fcaR", build_vignettes = TRUE)

or

remotes::install_github("Malaga-FCA-group/fcaR", build_vignettes = TRUE)

Example of Use

Let us start with a fuzzy dataset (stored in a matrix I) as follows:

    P1  P2  P3  P4  P5 P6
O1 0.0 1.0 0.5 0.5 1.0  0
O2 1.0 1.0 1.0 0.0 0.0  0
O3 0.5 0.5 0.0 0.0 0.0  1
O4 0.0 0.0 0.0 1.0 0.5  0
O5 0.0 0.0 1.0 0.5 0.0  0
O6 0.5 0.0 0.0 0.0 0.0  0

Here, a value $x$ at the intersection of a row and a column indicates that the object in that row possesses the attribute in that column to a degree of at least $x$ (if $x = 0$, the attribute is absent from the object, and if $x = 1$, the attribute is fully present in the object).
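
The matrix can be built in base R before it is handed to the package. A minimal sketch, assuming the object and attribute labels are O1 to O6 and P1 to P6 as in the table:

```r
# Fuzzy incidence matrix: rows are objects, columns are attributes.
I <- matrix(
  c(0.0, 1.0, 0.5, 0.5, 1.0, 0,
    1.0, 1.0, 1.0, 0.0, 0.0, 0,
    0.5, 0.5, 0.0, 0.0, 0.0, 1,
    0.0, 0.0, 0.0, 1.0, 0.5, 0,
    0.0, 0.0, 1.0, 0.5, 0.0, 0,
    0.5, 0.0, 0.0, 0.0, 0.0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("O", 1:6), paste0("P", 1:6))
)

# Every entry is a membership degree, so it must lie in [0, 1].
stopifnot(all(I >= 0 & I <= 1))
```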

We can build a FormalContext object:

fc <- FormalContext$new(I)

print(fc)
#> FormalContext with 6 objects and 6 attributes.
#>      P1   P2   P3   P4   P5  P6  
#>  O1  0    1   0.5  0.5   1    0  
#>  O2  1    1    1    0    0    0  
#>  O3 0.5  0.5   0    0    0    1  
#>  O4  0    0    0    1   0.5   0  
#>  O5  0    0    1   0.5   0    0  
#>  O6 0.5   0    0    0    0    0

With a single function, we can compute the set of concepts:

# Compute all concepts
fc$find_concepts()

# The first concept
fc$concepts$sub(1)
#> ({O1, O2, O3, O4, O5, O6}, {})

# And plot the concept lattice
fc$concepts$plot()
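
The first concept can be checked by hand with the two derivation operators of FCA. The following base-R sketch does not call fcaR. It assumes the Gödel residuum as the fuzzy implication, which need not be the logic the package is configured with, and the helper names residuum, intent and extent are illustrative only.

```r
# Same fuzzy context as in the example (rows: objects, columns: attributes).
I <- matrix(
  c(0.0, 1.0, 0.5, 0.5, 1.0, 0,
    1.0, 1.0, 1.0, 0.0, 0.0, 0,
    0.5, 0.5, 0.0, 0.0, 0.0, 1,
    0.0, 0.0, 0.0, 1.0, 0.5, 0,
    0.0, 0.0, 1.0, 0.5, 0.0, 0,
    0.5, 0.0, 0.0, 0.0, 0.0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("O", 1:6), paste0("P", 1:6))
)

# Goedel residuum (an assumption): x -> y is 1 when x <= y, and y otherwise.
residuum <- function(x, y) ifelse(x <= y, 1, y)

# Intent: degree to which each attribute is shared by the given objects.
intent <- function(objects, I) {
  sapply(colnames(I), function(m) min(residuum(objects, I[, m])))
}

# Extent: degree to which each object has all the given attributes.
extent <- function(attributes, I) {
  sapply(rownames(I), function(g) min(residuum(attributes, I[g, ])))
}

# Start from the set of all objects, each with membership degree 1.
all_objects <- setNames(rep(1, nrow(I)), rownames(I))
top_intent <- intent(all_objects, I)
top_extent <- extent(top_intent, I)
```

Here top_intent is 0 for every attribute and top_extent is 1 for every object, which agrees with the concept ({O1, O2, O3, O4, O5, O6}, {}) printed above.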

We can also extract implications from the formal context:

# Extract implications
fc$find_implications()

# Which implications have been extracted
fc$implications
#> Implication set with 12 implications.
#> Rule 1: {P6 [0.5]} -> {P1 [0.5], P2 [0.5], P6}
#> Rule 2: {P5 [0.5]} -> {P4 [0.5]}
#> Rule 3: {P3 [0.5], P4 [0.5], P5 [0.5]} -> {P2, P5}
#> Rule 4: {P3 [0.5], P4} -> {P3}
#> Rule 5: {P2 [0.5], P4 [0.5]} -> {P2, P3 [0.5], P5}
#> Rule 6: {P2 [0.5], P3 [0.5]} -> {P2}
#> Rule 7: {P2, P3, P4 [0.5], P5} -> {P4}
#> Rule 8: {P1 [0.5], P4 [0.5]} -> {P1, P2, P3, P4, P5, P6}
#> Rule 9: {P1 [0.5], P3 [0.5]} -> {P1, P2, P3}
#> Rule 10: {P1 [0.5], P2} -> {P1}
#> Rule 11: {P1, P2 [0.5]} -> {P2}
#> Rule 12: {P1, P2, P3, P6} -> {P4, P5}

Some fundamental operations on the concept lattice associated with the formal context have been implemented:

  • Computing a sublattice.
  • Calculating the subconcepts and superconcepts of a given concept.
  • Finding the join- and meet-irreducible elements, which makes it possible to reduce the context and find the standard context.

Also, one can compute the support of both implications and concepts:

fc$implications$support()
#>  [1] 0.1666667 0.3333333 0.1666667 0.0000000 0.1666667 0.3333333 0.0000000
#>  [8] 0.0000000 0.1666667 0.1666667 0.1666667 0.0000000
fc$concepts$support()
#>  [1] 1.0000000 0.5000000 0.3333333 0.1666667 0.1666667 0.1666667 0.0000000
#>  [8] 0.5000000 0.3333333 0.3333333 0.1666667 0.0000000 0.5000000 0.3333333
#> [15] 0.3333333 0.1666667 0.1666667 0.0000000 0.5000000 0.3333333 0.1666667
#> [22] 0.1666667 0.1666667 0.0000000 0.1666667 0.0000000
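
The implication supports printed above are consistent with the usual definition: the fraction of objects whose row contains the premise with at least the required degrees. A base-R sketch under that assumption follows; the helpers premise and support are illustrative and are not part of fcaR, whose internal computation may differ.

```r
# Same fuzzy context as in the example (rows: objects, columns: attributes).
I <- matrix(
  c(0.0, 1.0, 0.5, 0.5, 1.0, 0,
    1.0, 1.0, 1.0, 0.0, 0.0, 0,
    0.5, 0.5, 0.0, 0.0, 0.0, 1,
    0.0, 0.0, 0.0, 1.0, 0.5, 0,
    0.0, 0.0, 1.0, 0.5, 0.0, 0,
    0.5, 0.0, 0.0, 0.0, 0.0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("O", 1:6), paste0("P", 1:6))
)

# Build a premise as a full vector of degrees; unlisted attributes get 0.
premise <- function(...) {
  v <- setNames(rep(0, ncol(I)), colnames(I))
  degrees <- c(...)
  v[names(degrees)] <- degrees
  v
}

# Support: proportion of objects whose degrees dominate the premise.
support <- function(lhs, I) {
  mean(apply(I, 1, function(row) all(row >= lhs)))
}

s1 <- support(premise(P6 = 0.5), I)          # premise of Rule 1
s2 <- support(premise(P5 = 0.5), I)          # premise of Rule 2
s4 <- support(premise(P3 = 0.5, P4 = 1), I)  # premise of Rule 4
```

These evaluate to 1/6, 1/3 and 0, matching the first, second and fourth entries of the implication support vector above.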

In this package, we have implemented a logic to manage implications. This so-called Simplification Logic allows us to simplify the extracted rules by removing redundancies, as well as computing the closure of a given fuzzy attribute set.

# Reduce the number of implications using two simple
# rules. The algorithm applies the specified rules
# in batches, if the number of rules is high.
fc$implications$apply_rules(rules = c("composition",
                                      "generalization"))
#> Processing batch
#> --> Composition: from 12 to 12 in 0.001 secs.
#> --> Generalization: from 12 to 12 in 0.001 secs.
#> Batch took 0.004 secs.

# Reduced set of implications
fc$implications
#> Implication set with 12 implications.
#> Rule 1: {P6 [0.5]} -> {P1 [0.5], P2 [0.5], P6}
#> Rule 2: {P5 [0.5]} -> {P4 [0.5]}
#> Rule 3: {P3 [0.5], P4 [0.5], P5 [0.5]} -> {P2, P5}
#> Rule 4: {P3 [0.5], P4} -> {P3}
#> Rule 5: {P2 [0.5], P4 [0.5]} -> {P2, P3 [0.5], P5}
#> Rule 6: {P2 [0.5], P3 [0.5]} -> {P2}
#> Rule 7: {P2, P3, P4 [0.5], P5} -> {P4}
#> Rule 8: {P1 [0.5], P4 [0.5]} -> {P1, P2, P3, P4, P5, P6}
#> Rule 9: {P1 [0.5], P3 [0.5]} -> {P1, P2, P3}
#> Rule 10: {P1 [0.5], P2} -> {P1}
#> Rule 11: {P1, P2 [0.5]} -> {P2}
#> Rule 12: {P1, P2, P3, P6} -> {P4, P5}
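
Computing the closure of a fuzzy attribute set under a set of implications can be illustrated without the package. The sketch below is a naive fixed-point iteration, not the Simplification Logic algorithm that fcaR implements, and it encodes only Rules 6, 9, 10 and 11 from the listing above. The names fuzzy_set, rule and naive_closure are hypothetical helpers.

```r
attrs <- paste0("P", 1:6)

# Full vector of degrees over all attributes; unlisted attributes get 0.
fuzzy_set <- function(...) {
  v <- setNames(rep(0, length(attrs)), attrs)
  degrees <- c(...)
  v[names(degrees)] <- degrees
  v
}

rule <- function(lhs, rhs) list(lhs = lhs, rhs = rhs)

# A hand-picked subset of the implications printed above.
rules <- list(
  rule(fuzzy_set(P2 = 0.5, P3 = 0.5), fuzzy_set(P2 = 1)),                  # Rule 6
  rule(fuzzy_set(P1 = 0.5, P3 = 0.5), fuzzy_set(P1 = 1, P2 = 1, P3 = 1)),  # Rule 9
  rule(fuzzy_set(P1 = 0.5, P2 = 1), fuzzy_set(P1 = 1)),                    # Rule 10
  rule(fuzzy_set(P1 = 1, P2 = 0.5), fuzzy_set(P2 = 1))                     # Rule 11
)

# Fire every rule whose premise is contained in S until nothing changes.
naive_closure <- function(S, rules) {
  repeat {
    before <- S
    for (r in rules) {
      if (all(r$lhs <= S)) S <- pmax(S, r$rhs)
    }
    if (identical(S, before)) return(S)
  }
}

closed <- naive_closure(fuzzy_set(P1 = 0.5, P3 = 0.5), rules)
```

For the starting set {P1 [0.5], P3 [0.5]} the result is {P1, P2, P3}. None of the other eight rules in the listing has its premise contained in that set, so adding them would not change the outcome for this input.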

All these functions work natively with fuzzy and with binary datasets.

For more details on the methods implemented and further examples, see the vignettes in this package.

Changelog

With respect to the CRAN version, the development version has the following changes.

fcaR 1.2.2

Enhancements:

  • Added more unit tests.
  • Minor changes to the plotting of formal contexts.
  • Now the fc$scale() function admits a new argument bg (default: FALSE) which, if set to TRUE, avoids computing the background knowledge of the scales.

fcaR 1.2.1

Enhancements:

  • Other logics have been implemented. Now, we can use fc$use_logic() to select one of the available_logics().
  • Improved export to LaTeX.

Bugfixes:

  • Some rounding errors might induce errors in the computations. These have been fixed.

fcaR 1.2.0

Bugfixes:

  • Fixes required by the new version of Matrix and the new use of HTML Tidy in R 4.2.

fcaR 1.1.1

Enhancements:

  • The user can control the number of decimal digits when exporting to LaTeX or when printing formal contexts, concept lattices and implications. Just use fcaR_options(decimal_places = n), where n is the number of desired decimal digits.

New functionality:

  • Now the package uses the settings package to manage several options. Currently, the only option is the number of decimal digits to use when printing or exporting to LaTeX.

Bugfixes:

  • Fixed exporting to LaTeX with special characters such as $, _, etc.

References

Guigues J, Duquenne V (1986). “Familles minimales d’implications informatives résultant d’un tableau de données binaires.” Mathématiques et Sciences humaines, 95, 5-18.

Ganter B, Wille R (1999). Formal concept analysis : mathematical foundations. Springer. ISBN 3540627715.

Cordero P, Enciso M, Mora Á, Pérez de Guzman I (2002). “SLFD Logic: Elimination of Data Redundancy in Knowledge Representation.” Advances in Artificial Intelligence - IBERAMIA 2002, 2527, 141-150. doi: 10.1007/3-540-36131-6_15 (URL: https://doi.org/10.1007/3-540-36131-6_15).

Belohlavek R (2002). “Algorithms for fuzzy concept lattices.” In Proc. Fourth Int. Conf. on Recent Advances in Soft Computing. Nottingham, United Kingdom, 200-205.

Hahsler M, Grun B, Hornik K (2005). “arules - a computational environment for mining association rules and frequent item sets.” J Stat Softw, 14, 1-25.

Mora A, Cordero P, Enciso M, Fortes I, Aguilera G (2012). “Closure via functional dependence simplification.” International Journal of Computer Mathematics, 89(4), 510-526.

Belohlavek R, Cordero P, Enciso M, Mora Á, Vychodil V (2016). “Automated prover for attribute dependencies in data with grades.” International Journal of Approximate Reasoning, 70, 51-67.

fcar's People

Contributors

amorabonilla, neuroimaginador

fcar's Issues

Warnings for tests with upcoming arules_1.6-6

Testing with the new version of arules (arules_1.6-6) produces the following warning:

 test-implication_set.R:97: warning: fcaR exports implications to arules
minimum support or confidence not available in info(x). Using uncorrected std_lift instead.

The issue is that the new measure stdLift needs information on the used thresholds for mining the rules. These are stored in info(x)$support and info(x)$confidence, but are missing in your code. More measures in the future might need such information as well.

The options are to

  • add the information to your rules object or to
  • specify only a select set of interestMeasures when you use fc$implications$to_arules(quality = TRUE).

Regards,
-Michael

Export of useful functions?

Some functions that are not exported currently, such as the computation of intents and extents of a given SparseSet inside a FormalContext, may be useful for the end-user.

Thus, maybe adding FormalContext$intent(S), $extent(S), $closure(S) would help others to extend the functionality of fcaR.

closure() creates ImplicationSet which cannot be converted using to_arules()

Please briefly describe your problem and what output you expect. If you have a question, please don't use this form. Instead, ask on https://stackoverflow.com/ or https://community.rstudio.com/.

Please include a minimal reproducible example (AKA a reprex). If you've never heard of a reprex before, start by reading https://www.tidyverse.org/help/#reprex.


Brief description of the problem
In fcaR v1.0.3, the ImplicationSet created by closure() sets private$I to NULL, so that the set cannot be converted using to_arules(). I am using RStudio Version 1.2.5033.

library("fcaR")
library("arules")
fc_planets <- FormalContext$new(planets)
fc_planets$find_implications()
S <- SparseSet$new(attributes = fc_planets$attributes)
fc_planets$implications$closure(S,reduce=TRUE,verbose=TRUE)$implications$to_arules(quality=TRUE)

Error with implications of very small cardinality

The error is the following:

________________ Recommendation is: _______

Objects in the recommendation global:
98
Closure in this iteration:
att24 att29 att35 att43

Next attributes to explore in next iteration :
att6 att10 att34
Rules in next iteration:
[1] 1

Error in storage.mode(from) <- "double" :
(list) object cannot be coerced to type 'double'

Error when working in parallel

----------------------------------------Experiment: 13
Objects: 30
Attributes: 90
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport
          1    0.1    1 none FALSE            TRUE
 maxtime support minlen maxlen target   ext
       5    0.11      2     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 3

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[90 item(s), 30 transaction(s)] done [0.00s].
sorting and recoding items ... [90 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 7 8 9 10 done [5.77s].
writing ... [18725864 rule(s)] done [18.80s].
creating S4 object ... done [10.10s].
Using parallel execution
Called from: parallel::mclapply(x, FUN, mc.cores = parallel::detectCores())

source('~/Cooperative-Filtering/Ejemplos-pruebas/prueba_random_n_experiments.R')
Error in parallel::mclapply(x, FUN, mc.cores = parallel::detectCores()) :
'mc.cores' must be >= 1

In the traceback, the last call was to
paralello: mclapply(x, FUNC, mc.cores=paralell(detectcores....
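
The message 'mc.cores' must be >= 1 is what parallel::mclapply() reports when the value passed as mc.cores is NA or below 1, and parallel::detectCores() is documented as possibly returning NA. One possible guard is sketched below. The helper safe_cores is hypothetical and this is not necessarily how the package handles the case.

```r
# Number of workers: detectCores() may return NA, so fall back to one core.
safe_cores <- function(n = parallel::detectCores()) {
  if (length(n) != 1L || is.na(n) || n < 1L) 1L else as.integer(n)
}

# mclapply() relies on forking, so on Windows mc.cores has to be 1.
cores <- if (.Platform$OS.type == "windows") 1L else safe_cores()

# Toy workload standing in for the real computation.
res <- parallel::mclapply(1:4, function(i) i^2, mc.cores = cores)
```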

Warning message: multiple methods tables found for ‘plot’


Brief description of the problem
fc$concepts$plot() opens up Quartz window and then crashes R.

R Under development (unstable) (2020-03-08 r77917) -- "Unsuffered Consequences"
Platform: x86_64-apple-darwin15.6.0 (64-bit)
[R.app GUI 1.70 (7782) x86_64-apple-darwin15.6.0]

> library("fcaR")
Warning message:
multiple methods tables found for ‘plot’
> I <- read.csv(file = 'I.csv', header = FALSE)
> rowNames <- read.csv(file = 'rowNames.csv', header = FALSE)
> colNames <- read.csv(file = 'colNames.csv', header = FALSE)

> colnames(I) <- t(colNames)
> rownames(I) <- t(rowNames)

> library("fcaR")
> fc <- FormalContext$new(I)
> fc$clarify()
> fc$reduce()
> fc$find_concepts()
> fc$concepts$plot()

Bug when printing implication sets with no implications

When trying to print a recently created ImplicationSet, such as in:

imps <- implication_set$new(attributes = paste0("P", 1:5))
imps

it produces the following error:

Implication set with implications.
Error in if (n_implications > 0) ...

Maybe a check before printing? Or may it be another problem?

[Enhancement] Provide proper names for classes

It seems that good coding practice in R6 classes includes naming the class constructor with the class name.

In this case, it should be "FormalContext", not "formal_context", and so on...

new error

Here is the environment in which the error occurs:


---------------------------------------Experiment: 31
Objects: 90
Attributes: 60
Apriori

Parameter specification:
 confidence minval smax arem  aval
          1    0.1    1 none FALSE
 originalSupport maxtime support minlen
            TRUE       5     0.1      2
 maxlen target   ext
     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort
    0.1 TRUE TRUE  FALSE TRUE    2
 verbose
    TRUE

Absolute minimum support count: 9

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[60 item(s), 90 transaction(s)] done [0.00s].
sorting and recoding items ... [60 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 7 done [0.10s].
writing ... [6482 rule(s)] done [0.01s].
creating S4 object ... done [0.01s].
Using parallel execution
Processing batch
--> composition : from 4553 to 4406 in 0.27 secs.
Batch took 0.272 secs.
Using parallel execution
Processing batch
--> composition : from 3679 to 3521 in 0.212 secs.
Batch took 0.214 secs.
_________________ Recommendation is: _______

Objects in the recommendation global:
5 7 15 17 27 41 56 59 65
Closure in this iteration:
att14 att27 att44 att49 att53

Next attributes to explore in next iteration :
att1 att2 att3 att4 att5 att6 att7 att8 att9 att10 att11 att12 att13 att15 att16 att17 att18 att19 att20 att21 att22 att23 att24 att25 att26 att28 att29 att30 att31 att32 att33 att34 att35 att36 att37 att38 att39 att40 att41 att42 att43 att45 att46 att47 att48 att50 att51 att52 att54 att55 att56 att57 att58 att59 att60
Rules in next iteration:
[1] 3521

Using parallel execution
Processing batch
--> composition : from 3282 to 3101 in 0.24 secs.
Batch took 0.241 secs.
_________________ Recommendation is: _______

Objects in the recommendation global:
5 7 65
Closure in this iteration:
att10 att12 att31

Next attributes to explore in next iteration :
att1 att2 att3 att4 att5 att6 att7 att8 att9 att11 att13 att15 att16 att17 att18 att19 att20 att21 att22 att23 att24 att25 att26 att28 att29 att30 att32 att33 att34 att35 att36 att37 att38 att39 att40 att41 att42 att43 att45 att46 att47 att48 att50 att51 att52 att54 att55 att56 att57 att58 att59 att60
Rules in next iteration:
[1] 3101

Using parallel execution
Processing batch
--> composition : from 1251 to 400 in 0.226 secs.
Batch took 0.227 secs.
_________________ Recommendation is: _______

Objects in the recommendation global:
5
Closure in this iteration:
att7 att9 att13 att16 att18 att19 att23 att24 att25 att26 att29 att30 att35 att36 att37 att38 att40 att46 att48 att50 att51 att52 att54 att56

Next attributes to explore in next iteration :
att1 att3 att4 att5 att6 att8 att15 att17 att20 att21 att22 att28 att32 att33 att34 att39 att41 att43 att45 att47 att55 att57 att58 att59 att60
Rules in next iteration:
[1] 400

Using parallel execution
Error in if (isSym) "s" else "g" : argument is of length zero
Called from: paste0(if (is.numeric(data)) "d" else if (is.logical(data)) "l" else stop("invalid 'data'"),
if (isSym) "s" else "g", "CMatrix")

Missing dependencies on CRAN

I tried to install fcaR from CRAN:

> install.packages("fcaR")
Installing package into ‘/home/tobias/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
Warning in install.packages :
  dependencies ‘Rgraphviz’, ‘graph’ are not available
also installing the dependency ‘hasseDiagram’

trying URL 'https://cloud.r-project.org/src/contrib/hasseDiagram_0.1.3.tar.gz'
Content type 'application/x-gzip' length 5856 bytes
==================================================
downloaded 5856 bytes

trying URL 'https://cloud.r-project.org/src/contrib/fcaR_1.0.3.tar.gz'
Content type 'application/x-gzip' length 533920 bytes (521 KB)
==================================================
downloaded 521 KB

ERROR: dependencies ‘Rgraphviz’, ‘graph’ are not available for package ‘hasseDiagram’
* removing ‘/home/tobias/R/x86_64-pc-linux-gnu-library/4.0/hasseDiagram’
Warning in install.packages :
  installation of package ‘hasseDiagram’ had non-zero exit status
ERROR: dependency ‘hasseDiagram’ is not available for package ‘fcaR’
* removing ‘/home/tobias/R/x86_64-pc-linux-gnu-library/4.0/fcaR’
Warning in install.packages :
  installation of package ‘fcaR’ had non-zero exit status

The downloaded source packages are in
  ‘/tmp/RtmppTUXyB/downloaded_packages’

So some required packages do not seem to be available.

Slow saving in RDS format

Saving an object of type FormalContext in RDS format is too slow. This seems to be caused by the format of the list of concepts.

Maybe it can be stored in another format? Or provide a specific save / load function?

Error in dimnamesGets(x, value) :

Processing batch

--> composition: from 5 to 5 in 0.002 secs.

Batch took 0.006 secs.

Error in dimnamesGets(x, value) :
invalid dimnames given for “dgCMatrix” object

I have shared with you the repository Cooperative Filtering.

Execute the file prueba_random_n_experiments.R

It is a recursive algorithm: the number of implications decreases at each step, and I add the new implications to the formal context. The error appears at this point.

error in composition?

When I ran it, I got this error:

Using parallel execution
Error in if (isSym) "s" else "g" : argument is of length zero
Called from: paste0(if (is.numeric(data)) "d" else if (is.logical(data)) "l" else stop("invalid 'data'"),
if (isSym) "s" else "g", "CMatrix")

In the traceback, I see that it comes from

fiar_fc$implications$apply_rules(rules=c("composition"))

I think it has to do with the case where the implications are nullified.
