Comments (6)

chihming commented on September 2, 2024

Please remove the first line (the header row) of ratings_small.csv and run the same command. You will get:

2.5 0:1 1:1
4.0 2:1 3:1
5.0 2:1 4:1
5.0 2:1 5:1
4.0 2:1 6:1
4.0 2:1 7:1
3.0 2:1 8:1
3.0 2:1 9:1
4.0 2:1 10:1
3.0 2:1 11:1
5.0 2:1 12:1
3.0 13:1 14:1
4.0 13:1 10:1
3.5 13:1 15:1
3.0 13:1 16:1
2.5 13:1 17:1

In this case,
the feature index 0 represents userId 1,
the feature index 1 represents movieId 31,
the feature index 2 represents userId 2,
the feature index 3 represents movieId 10,
and so on.
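
The index assignment above can be reproduced with a short Python sketch. This is not the Perl script itself; it assumes each distinct (column, value) pair receives the next unused index in order of first appearance, which is consistent with the output shown. The function name `triples_to_libfm` and its parameters are made up for illustration.

```python
import csv
import io

# The 16 data rows of ratings_small.csv, header row already removed.
CSV_TEXT = """\
1,31,2.5,1260759144
2,10,4.0,835355493
2,17,5.0,835355681
2,39,5.0,835355604
2,47,4.0,835355552
2,50,4.0,835355586
2,52,3.0,835356031
2,62,3.0,835355749
2,110,4.0,835355532
2,144,3.0,835356016
2,150,5.0,835355395
3,60,3.0,1298861675
3,110,4.0,1298922049
3,247,3.5,1298861637
3,267,3.0,1298861761
3,7153,2.5,1298921787
"""


def triples_to_libfm(rows, target_col, drop_cols=()):
    """One-hot encode every column except the target and the dropped ones.

    Assumption: indices are keyed on (column, value), so the same number
    appearing in two different columns would get two different indices.
    """
    index_of = {}  # (column, value) -> feature index
    lines = []
    for row in rows:
        feats = []
        for col, value in enumerate(row):
            if col == target_col or col in drop_cols:
                continue
            key = (col, value)
            if key not in index_of:
                index_of[key] = len(index_of)  # next unused index
            feats.append(f"{index_of[key]}:1")
        lines.append(" ".join([row[target_col]] + feats))
    return lines, index_of


rows = list(csv.reader(io.StringIO(CSV_TEXT)))
# Mirrors "-target 2 -delete_column 3" from the command in this thread.
lines, index_of = triples_to_libfm(rows, target_col=2, drop_cols={3})
for line in lines:
    print(line)
```

If the header row is left in the file, it is treated like any other row, so "userId" and "movieId" take indices 0 and 1 and every real index is shifted by two.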

from libfm.

chihming commented on September 2, 2024

You can find the data format in Section 2 of the libFM 1.4.2 manual.
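
As a rough illustration of that format (based on the files shown in this thread, not quoted from the manual): each line holds the target value followed by space-separated `index:value` pairs for the non-zero features. The helper `parse_libfm_line` below is a hypothetical name, not part of libFM.

```python
def parse_libfm_line(line):
    """Split one libFM-format line into (target, {feature_index: value})."""
    parts = line.split()
    target = float(parts[0])  # first token is the target (here, the rating)
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return target, features


target, features = parse_libfm_line("4.0 2:1 3:1")
print(target, features)
```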

ChandraLingam commented on September 2, 2024

Thank you. Yes, I did review the manual and was attempting to use the Perl script for CSV-to-libFM conversion.

I created a small CSV file using 16 rows from the MovieLens ratings dataset, and the script produced ratings_small.csv.libfm. The output does not seem to match the input (or at least I am not able to interpret what the script did).

triple_format_to_libfm.pl -in ratings_small.csv -target 2 -delete_column 3 -separator ","

transforming file ratings_small.csv to ratings_small.csv.libfm...
userId,movieId,rating,timestamp
1,31,2.5,1260759144
2,10,4.0,835355493
2,17,5.0,835355681
2,39,5.0,835355604
2,47,4.0,835355552
2,50,4.0,835355586
2,52,3.0,835356031
2,62,3.0,835355749
2,110,4.0,835355532
2,144,3.0,835356016
2,150,5.0,835355395
3,60,3.0,1298861675
3,110,4.0,1298922049
3,247,3.5,1298861637
3,267,3.0,1298861761
3,7153,2.5,1298921787
rating 0:1 1:1
2.5 2:1 3:1
4.0 4:1 5:1
5.0 4:1 6:1
5.0 4:1 7:1
4.0 4:1 8:1
4.0 4:1 9:1
3.0 4:1 10:1
3.0 4:1 11:1
4.0 4:1 12:1
3.0 4:1 13:1
5.0 4:1 14:1
3.0 15:1 16:1
4.0 15:1 12:1
3.5 15:1 17:1
3.0 15:1 18:1
2.5 15:1 19:1

ChandraLingam commented on September 2, 2024

Thank you very much. One more follow-up question: does this script also handle real-valued features?
I added another feature at the end with random values. It appears that the script is one-hot encoding this column as well. Is there a way to preserve the real-valued features as-is?

1,31,2.5,1260759144,0.074345836
2,31,4,835355493,0.428518244
2,10,4,835355493,0.144215787
2,17,5,835355681,0.018740053
2,39,5,835355604,0.793609723
2,47,4,835355552,0.62908026
2,50,4,835355586,0.923838115
2,52,3,835356031,0.920521599
2,62,3,835355749,0.549236466
2,110,4,835355532,0.648895353
2,144,3,835356016,0.697152954
2,150,5,835355395,0.752723242
3,60,3,1298861675,0.803889224
3,110,4,1298922049,0.815850633
3,150,4,835355493,0.08505613
3,247,3.5,1298861637,0.268696775
3,267,3,1298861761,0.235652997
3,7153,2.5,1298921787,0.433312402

Output

2.5 0:1 1:1 2:1
4 3:1 1:1 4:1
4 3:1 5:1 6:1
5 3:1 7:1 8:1
5 3:1 9:1 10:1
4 3:1 11:1 12:1
4 3:1 13:1 14:1
3 3:1 15:1 16:1
3 3:1 17:1 18:1
4 3:1 19:1 20:1
3 3:1 21:1 22:1
5 3:1 23:1 24:1
3 25:1 26:1 27:1
4 25:1 19:1 28:1
4 25:1 23:1 29:1
3.5 25:1 30:1 31:1
3 25:1 32:1 33:1
2.5 25:1 34:1 35:1

chihming commented on September 2, 2024

I don't think it supports real-valued features, so it would be better to write your own transformation tool.

If you are not sure how to handle it, you can try this Python code:
https://github.com/chihming/DataTransformer
and the instructions about how to convert the data to your required format:
https://github.com/chihming/DataTransformer/wiki/data2sparse
Note that this project has been abandoned, but it can still meet your requirement.
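
If you would rather write the transformation yourself, a minimal sketch could look like the following. It is neither the Perl script nor DataTransformer: the function `csv_to_libfm` and its parameters are hypothetical. It one-hot encodes the categorical columns and gives each real-valued column a single feature index whose value is the number itself.

```python
import csv
import io

# First three rows of the extended file posted above
# (userId, movieId, rating, timestamp, real-valued feature).
CSV_TEXT = """\
1,31,2.5,1260759144,0.074345836
2,31,4,835355493,0.428518244
2,10,4,835355493,0.144215787
"""


def csv_to_libfm(rows, target_col, categorical_cols, numeric_cols):
    """Build libFM-format lines with mixed categorical and real-valued columns."""
    index_of = {}  # feature key -> feature index
    lines = []
    for row in rows:
        feats = []
        for col in categorical_cols:
            # One index per distinct (column, value) pair, value fixed at 1.
            idx = index_of.setdefault((col, row[col]), len(index_of))
            feats.append(f"{idx}:1")
        for col in numeric_cols:
            # One index per numeric column; the cell content is kept as-is.
            idx = index_of.setdefault((col, None), len(index_of))
            if float(row[col]) != 0.0:  # the sparse format omits zeros
                feats.append(f"{idx}:{row[col]}")
        lines.append(" ".join([row[target_col]] + feats))
    return lines, index_of


rows = list(csv.reader(io.StringIO(CSV_TEXT)))
# The timestamp (column 3) is skipped simply by not listing it.
lines, index_of = csv_to_libfm(
    rows, target_col=2, categorical_cols=[0, 1], numeric_cols=[4]
)
for line in lines:
    print(line)
```

With this layout the random-valued column always maps to the same index (2 here) instead of producing a new one-hot index per row.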

ChandraLingam commented on September 2, 2024

Thank you for the prompt response and clarification. Appreciate it. I will close this issue for now.
