knewton / edm2016
Code for replicating results in our EDM2016 paper
License: Apache License 2.0
Hi, I read this paper a few weeks ago and I want to implement IRT. I have read several papers but still don't know how to implement it, so I downloaded this project to see how the code is organized. First I wanted to reproduce the results, but when I run
rnn_prof irt assistments skill_builder_data_big.txt --onepo \
    --drop-duplicates --no-remove-skill-nans --num-folds 5 \
    --item-id-col problem_id --concept-id-col single
the system warns that "rnn_prof is not an internal or external command". I want to know what mistake I have made, and how I can reproduce this result.
Hi there,
I'm very interested in implementing IRT; this repo seems well suited to what I aim to achieve. However, I find it very difficult to play around with the code when I have to re-run the tox command after every change. What would be the best approach to achieve the same results WITHOUT using tox? Which files should I run individually?
I also hope to implement this in a web application in the future, so knowing which files to run (without tox) would help as well.
Thanks.
Hi, I read your paper "Back to the Basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation" and really want to try TIRT on my dataset, but I can't find the code in this repo. Would it be possible to add it to this repo, or is it confidential? Thanks!
Hi!
I'm trying to reproduce the results of the EDM2016 paper to see whether HIRT is fast enough for real-time computation, and I seem to have run into an issue with the code (which I didn't modify). When I run HIRT on the Bridge to Algebra dataset, I get the following error:
...
File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/cli.py", line 163, in irt
data, _, _, _, _ = load_data(data_file, source, data_opts)
File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/wrapper.py", line 77, in load_data
min_interactions_per_user=data_opts.min_interactions_per_user)
File "/path/edm2016/.tox/py27/lib/python2.7/site-packages/rnn_prof/data/kddcup.py", line 99, in load_data
data = data.sort(sort_keys)
...
KeyError: u'problem_id'
I don't know why this happens; any help would be appreciated!
Thanks
I am just wondering what the format requirements are for the Problem Id and Step Name columns. I have tried to reuse the code for research I am doing, but there is always a key error when I run the code on my own data, unless I use exactly the same problem ids as in the KDD Cup dataset.
Any help will be appreciated.
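A generic diagnostic that may help here (the column names below are illustrative of the KDD-Cup-style layout, not taken from this repo's loader): a KeyError on a column name usually means the input file's header doesn't match what the loader expects, so compare your header against the failing key and rename if needed.

```python
import io
import pandas as pd

# Hypothetical sample mimicking a tab-separated KDD-Cup-style file.
sample = io.StringIO(
    "Anon Student Id\tProblem Name\tStep Name\tCorrect First Attempt\n"
    "s1\tp1\tstep1\t1\n"
)
df = pd.read_csv(sample, sep="\t")
print(df.columns.tolist())  # compare this list against the key in the KeyError

# If the expected key (e.g. 'problem_id') is missing, rename your column to match:
df = df.rename(columns={"Problem Name": "problem_id"})
```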
Hi, I'm trying to get the parameters of each item in the HIRT model. I found these two functions in OnePOLearner. They are nearly identical (one just negates the other's return value), and I'm wondering what the meaning of the offset coefficient is.
def get_difficulty(self, item_ids):
    """ Get the difficulties (in standard 1PO units) of an item or a set of items.

    :param item_ids: ids of the requested items
    :return: the difficulties (-offset_coeff)
    :rtype: np.ndarray
    """
    return -self.nodes[OFFSET_COEFFS_KEY].get_data_by_id(item_ids)

def get_offset_coeff(self, item_ids):
    """ Get the offset coefficient of an item or a set of items.

    :param item_ids: ids of the requested items
    :return: the offset coefficient(s)
    :rtype: np.ndarray
    """
    return self.nodes[OFFSET_COEFFS_KEY].get_data_by_id(item_ids)
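For what it's worth, the two functions differ only by a sign: get_difficulty negates the value that get_offset_coeff returns. In a 1PO/1PL-style model where p(correct) = sigmoid(theta + offset), negating the offset gives the conventional difficulty. A minimal sketch of that relationship (my own notation, not the repo's internals):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical offset coefficients for three items; difficulty is the negation,
# mirroring the "-offset_coeff" in get_difficulty's docstring.
offset_coeffs = np.array([1.0, 0.0, -1.0])
difficulties = -offset_coeffs

theta = 0.0  # an average student
p_correct = sigmoid(theta + offset_coeffs)  # same as sigmoid(theta - difficulties)
```

So a larger offset means an easier item, while a larger difficulty means a harder one; the two accessors expose the same underlying parameter in those two conventions.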
We're working to build a learner model in our research project, and we'd love to use your code base as a starting point. I'm looking to obtain prediction results for each individual student rather than an aggregate AUC. Can you point me to where to look in the code?
Thanks in advance!
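In case it helps while you wait for a pointer: if you can extract per-interaction predicted probabilities from the pipeline, per-student results are a groupby away. A sketch with made-up column names (not this repo's API):

```python
import pandas as pd

# Hypothetical per-interaction predictions; column names are illustrative.
preds = pd.DataFrame({
    "user_id":   ["a", "a", "b", "b"],
    "correct":   [1, 0, 1, 1],
    "p_correct": [0.8, 0.4, 0.6, 0.9],
})

# Per-student accuracy at a 0.5 threshold, instead of one aggregate number.
preds["hit"] = (preds["p_correct"] > 0.5) == preds["correct"].astype(bool)
per_user_acc = preds.groupby("user_id")["hit"].mean()
```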
Regarding the paper titled “Back to the basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation”, I am interested in the online prediction accuracy metric of evaluation.
A couple of questions (in relation to the 1PL IRT model):
In such a situation, is an IRT model unsuitable? Must the IRT model have initial data to work with before making predictions, or can the model be trained continuously from the start? If so, what default parameters should it start with?
Thanks so much for your time, looking forward to your response.
I must be reading the code wrong somehow, but I'm running into an issue where the max_inter variable drops all values in the data frame. Looking at the common options, I see the default is set to 0, which seems to be the origin of the issue.
common_options.add('--max-inter', '-m', type=int, default=0,
                   help="Maximum interactions per user",
                   extra_callback=lambda ctx, param, value: value or None)
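One thing worth noting about the option definition above: the extra_callback maps every falsy value, including the default 0, to None, which suggests the intent is "0 means no cap". If rows are being dropped anyway, the place to look may be how downstream code treats None. The truthiness behavior itself:

```python
# The callback from the option definition: falsy values (0, None) become None.
to_cap = lambda ctx, param, value: value or None

assert to_cap(None, None, 0) is None    # default 0 -> None, i.e. "no maximum"
assert to_cap(None, None, None) is None
assert to_cap(None, None, 50) == 50     # explicit caps pass through unchanged
```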
What are your thoughts on training a model with pre-existing data? For instance, DKT/logistic models do not require any student/item specific parameters, hence, the model could be trained with data collected elsewhere, and then implemented in an application to make immediate, accurate predictions.
For a model that does not require item or student parameters, would this be appropriate? What are the benefits of using data from the same students/items to train general weights? A model trained on dataset A could then be tested on datasets B and C, just like a real flashcard scenario; what are your thoughts on this?
When trying to evaluate an IRT model through online prediction accuracy, after all item parameters have been determined, is the ability parameter updated by retraining the model on ALL the data collected so far (all the students + training data), or on just each student's INDIVIDUAL data? In other words, what data is used to train the student-level parameters?
Thanks again, looking forward to your response.
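For context on the last question: one common pattern (an assumption on my part, not necessarily what this repo does) is to fix the item difficulties after offline training and re-estimate each student's ability from that student's own responses only, e.g. a 1PL MAP update via Newton's method:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def estimate_ability(correct, difficulties, prior_var=1.0, n_iter=25):
    """MAP estimate of one student's 1PL ability theta with a N(0, prior_var)
    prior, holding item difficulties fixed. Newton's method on the log-posterior.
    """
    correct = np.asarray(correct, dtype=float)
    difficulties = np.asarray(difficulties, dtype=float)
    theta = 0.0
    for _ in range(n_iter):
        p = sigmoid(theta - difficulties)              # predicted P(correct)
        grad = np.sum(correct - p) - theta / prior_var # d log-posterior / d theta
        hess = -np.sum(p * (1.0 - p)) - 1.0 / prior_var
        theta -= grad / hess                           # Newton step (hess is always < 0)
    return theta

# Hypothetical usage: one student's responses to five items of difficulty 0.
theta_strong = estimate_ability([1, 1, 1, 1, 1], np.zeros(5))
theta_weak = estimate_ability([0, 0, 0, 0, 0], np.zeros(5))
```

Under this scheme only the student-level parameter is retrained, and only from that student's data, which keeps the online update cheap; retraining everything on all data collected so far is the more expensive alternative.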