recsyschallenge / 2016
RecSys Challenge 2016: job recommendations
Home Page: http://2016.recsyschallenge.com/
Hi, we found a conflict. The rules page https://recsys.xing.com/rules says "the submission limits: you can upload at maximum 5 solutions per day".
The top of the submission page https://recsys.xing.com/submission also says "New Submission 5 submissions still possible for you today". However, the bottom of the same page https://recsys.xing.com/submission says "Additional remarks: No.4 Notice that you can only submit 3 solutions per calendar day".
So how many solutions can I actually submit per calendar day?
And which time zone is the standard? GMT or another?
Hi,
I split the interaction data into training and test sets by leaving out the last week's interactions.
But how can I know the relevantItem of each user in the test data? How can I evaluate my algorithm offline using the method described in EvaluationMeasure.md?
Can anyone help me? Thanks.
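One possible way to set this up, as a minimal sketch: treat every interaction of type 1, 2 or 3 that falls into the held-out week as a relevant item for that user. The tuple layout (user_id, item_id, interaction_type, created_at) and the function name are assumptions made for illustration; the official evaluation may differ.

```python
from collections import defaultdict

# Interaction types counted as relevant (click, bookmark, reply),
# following the challenge description quoted elsewhere in these issues.
RELEVANT_TYPES = {1, 2, 3}


def split_and_collect(interactions, split_ts):
    """Split interactions at split_ts and collect relevant items per user.

    interactions: iterable of (user_id, item_id, interaction_type, created_at)
                  tuples; this layout is an assumption, not the official schema.
    split_ts:     Unix timestamp marking the start of the held-out week.
    """
    train = []
    relevant = defaultdict(set)
    for user_id, item_id, itype, ts in interactions:
        if ts < split_ts:
            train.append((user_id, item_id, itype, ts))
        elif itype in RELEVANT_TYPES:
            relevant[user_id].add(item_id)
    return train, dict(relevant)
```

The resulting dictionary maps each test user to a set of item ids, which can be passed as the relevant items to a local implementation of the measure in EvaluationMeasure.md.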
Hi friend, we submitted a result at 23:50 (Hawaii time), but the score only showed up at 00:01, as can be seen in the submission records.
Unfortunately, this led to our result showing up in the unofficial leaderboard rather than the official one.
Will this result be considered for the official leaderboard?
Looking forward to your reply, and thank you very much.
This is team "Cine" again; the submission label is "final1".
function userSuccess(recommendedItems, relevantItems) = {
  if (intersect(recommendedItems, relevantItems).size > 0)
    1.0
  else
    0.0
}
should be changed to
function userSuccess(recommendedItems, relevantItems) = {
  if (intersect(recommendedItems.take(30), relevantItems).size > 0)
    1.0
  else
    0.0
}
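For local testing, the proposed variant could be written roughly as follows. This is only a sketch of the suggestion above, not the organizers' implementation; the default cutoff of 30 comes from the proposed change.

```python
def user_success(recommended_items, relevant_items, k=30):
    """Return 1.0 if any of the first k recommendations is relevant, else 0.0."""
    top_k = recommended_items[:k]  # equivalent of recommendedItems.take(k)
    return 1.0 if set(top_k) & set(relevant_items) else 0.0
```

With this version, a relevant item that only appears at position 31 or later no longer counts as a success.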
I'd like to have some clarifications about the private leaderboard evaluation.
Is it the best private score ever reached, or the private score of the submission that holds the best public leaderboard score?
I'll provide an example to be clearer:
submission_x has:
public_leaderboard: 100
private_leaderboard: 90
while submission_y has:
public_leaderboard: 95
private_leaderboard: 95
Clearly, on the public leaderboard my team would score 100 (submission_x), but what about the private one?
Will it be 90 or 95?
In the document, the pseudocode returns
topK = recommendedItems.take(k)
return intersect(topK, relevantItems).size / k
My question is, what if L = len(recommendedItems) < k?
Does the metric return intersect(topK, relevantItems).size / min(k, L)?
In other words, does the evaluation encourage submitting fewer but more accurate items, or more items in order to get good recall?
Thanks!
Kuan
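The two readings of the denominator can be compared with a small sketch. The `shrink_denominator` flag is a hypothetical name introduced here for illustration; which variant the official evaluation uses is not stated in this thread.

```python
def precision_at_k(recommended, relevant, k, shrink_denominator=False):
    """Precision at k with a fixed or a shrinking denominator.

    shrink_denominator=False divides by k (as in the quoted pseudocode);
    shrink_denominator=True divides by min(k, len(recommended)).
    """
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    denom = min(k, len(recommended)) if shrink_denominator else k
    return hits / denom if denom else 0.0


# Two recommendations, both relevant, evaluated at k = 4.
fixed = precision_at_k([1, 2], {1, 2}, k=4)
shrunk = precision_at_k([1, 2], {1, 2}, k=4, shrink_denominator=True)
```

With a fixed denominator, a short list is penalized for the unused slots, so submitting fewer items never helps; with the shrinking denominator, a short but accurate list scores higher.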
Hi,
Is it possible for multiple teammates to use the submission system (to download the data and/or submit results) without having to share the same XING account?
Hi,
When will the dataset be available to use?
The description says that edu_degree values range from 0 to 3, but analyzing the dataset we found that the actual range is 0-6. Is it an error in the description? What's the actual meaning of those values?
What time span does "created_at" in the interactions.csv file cover? A year?
Hi data people,
I don't understand what "next week" means here:
"the recommender should predict those job postings (items) that the user will interact with in the next week."
Thanks :)
Hi organizers, @fabianabel
May I ask about the relation between impressions and interactions: did these interactions come from impressions, or were they collected from across the whole site? Many thanks.
It seems they are not. Just wondering if that is the expected behaviour : )
How long will the evaluation system be available? I know it no longer counts for the challenge, but we would like to keep working on the problem for fun.
I created the team about a week ago. When can it be approved?
Thanks
I would like to know whether the value NULL in career level means 0 (unknown). I have found several NULLs, at least in users.csv.
Thanks
There is a bug in the submission system: if you use a label that is too verbose, the system returns a generic error saying that something went wrong.
I simply shortened the label text and the same submission was accepted.
Please fix this or set a character limit on the text field to prevent it :) .
I have a question: will the accepted papers be published, or just the slides?
How long should I wait?
We will post the link to the proceedings here and close this issue as soon as the workshop proceedings are published by ACM.
Do week numbers within a year start at 0 or 1? And does a week begin on Monday or Sunday in this competition?
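To see why the convention matters, here is a small sketch comparing three common week-numbering schemes from Python's standard library. Which one the dataset uses is not stated here; the function name and dictionary keys are only illustrative.

```python
from datetime import date


def week_numbers(d):
    """Return the week number of date d under three common conventions."""
    return {
        "iso": d.isocalendar()[1],                 # ISO 8601: Monday first, weeks start at 1
        "sunday_first": int(d.strftime("%U")),     # Sunday first, days before first Sunday are week 0
        "monday_first": int(d.strftime("%W")),     # Monday first, days before first Monday are week 0
    }


# 4 January 2015 was a Sunday; the three schemes disagree on its week number.
sample = week_numbers(date(2015, 1, 4))
```

The same calendar day can land in different weeks depending on the scheme, so a local train/test split by week should use whatever convention the organizers confirm.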
There are user_ids in impressions that don't appear in users.
For example, the first user_id in impressions is 1842650, which does not appear in users.
Is this intentional? According to a script I wrote, around 33% of user_ids in impressions are invalid.
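A check along these lines could look like the following sketch. It works on plain lists of ids that have already been read from the two files; the function name is hypothetical and the file parsing is left out.

```python
def missing_user_share(impression_user_ids, known_user_ids):
    """Return the impression user_ids absent from users and their share.

    The share is computed over distinct user_ids seen in impressions.
    """
    known = set(known_user_ids)
    seen = set(impression_user_ids)
    missing = seen - known
    share = len(missing) / len(seen) if seen else 0.0
    return missing, share
```

Whether the share is computed over distinct user_ids (as here) or over impression rows changes the percentage, so it is worth stating which one a reported figure refers to.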
Relevant items are those items on which a user clicked, bookmarked or replied (interaction_type= 1, 2 or 3).
I want to build a local evaluation, but I have a question: how is relevance measured? Just by the value of interaction_type, or does it also depend on the interaction's creation time, or on the number of times the user interacted with the item?
Our team has been waiting for approval for several days. Why is the process so slow? -_-
Dear organizer:
Our team is eager to participate in the workshop in Boston if our accompanying paper is accepted.
We therefore have a request: could you send us a more formal invitation letter if our accompanying paper is accepted? To be honest, it is really hard to apply for a visa on the strength of the letter you sent on July 14th alone.
Our last question: how many participants from a single team are allowed to register for the workshop? Could all of the authors attend the workshop in September?
Thanks.
Hi all,
we found some problems with the training data:
The files are not in a real CSV format, are they? We are not aware of CSV formats that support fields which are lists.
The number of values in the users.csv rows varies. Sometimes 12 values are present in a row and sometimes only 11 (counting lists as a single value, clearly). How should this be interpreted? Which field value is actually missing (perhaps the last one)?
There are many duplicated users in the file users.csv. In particular, there are 1367057 unique users out of 1500000!
Thanks for your answer!
Excuse me, we have created a team named "Cine".
We hope it can be approved when convenient.
I would like to learn some details about this rule. While we accept that we must not try to reconstruct your flipped data, we would like to make some assumptions and change some of the unknown values. Would this be a violation of the rules, or would it just count as using different default values for different users?
Example: say the experience in years value is 5 but the careerlevel is null; I would like to assume a careerlevel of 3 or 4.
Thanks
What is the exact date for submitting solutions?
Since the score depends on the order of items in the test set, can we know how that order is determined? Without knowing that, I think offline evaluation is unreliable. Thanks.
I wanted to get the dataset and created a team as the instructions describe, but no one has approved my team yet.
rt
Many records in "Impressions.csv" look like the following:
user_id year week items
1842650 2015 41 1386585,1386585,2139076,766293,977414,1133414,1163212
Could you tell me the meaning of the item order and repetition in "Impressions.csv"?
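To inspect order and repetition locally, a row like the one above can be parsed as in this sketch. It assumes whitespace-separated columns and a comma-separated item list, which matches the sample shown but is not a confirmed file specification; `parse_impression` is a hypothetical helper.

```python
from collections import Counter


def parse_impression(line):
    """Parse one impressions row into (user_id, year, week, item_ids).

    Assumes four whitespace-separated columns, the last one being a
    comma-separated list of item ids in their original order.
    """
    user_id, year, week, items = line.split()
    item_ids = [int(x) for x in items.split(",")]
    return int(user_id), int(year), int(week), item_ids


row = "1842650 2015 41 1386585,1386585,2139076,766293,977414,1133414,1163212"
user_id, year, week, item_ids = parse_impression(row)
counts = Counter(item_ids)  # how often each item was shown in that week
```

In this row the list has seven entries but only six distinct items, because item 1386585 appears twice.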
(If the rules allow it)
thank you ^o^
We are confused by the attribute "created_at" in items.csv. What does it mean? Why is its description the same as that of interactions.csv? Is that a mistake?
Hello, I would like to clarify whether there are restrictions on the software used to develop a solution, for example only software under an Open Source Initiative-approved license, or only software for non-commercial use?
create a file called CNAME (in gh-pages) containing the line:
2016.recsyschallenge.com
Hi,
Is there something wrong with the server holding the dataset? The download speed is very slow (3~5kb/s), and the 1GB training dataset is almost undownloadable (the download always stops after several minutes). Moreover, since downloading the dataset requires signing in, I cannot use a download tool to work around the problem. I am in Canada and using the university's internet, and I have never had a similar problem before. Can you look into it? Thank you!
When will the dataset be available?
Hi friends, we have submitted several results so far.
Unfortunately, we overwrote some results by mistake, including the one that got our best score so far. It would take too much time to regenerate that result in order to improve on it.
Is there any way to get previously submitted results back from your server?
Our team is "Cine" and the label of the result we want to get is "lda cosine similarity".
Thanks all the same if it's impossible.
Hi,
In the dataset description of the interactions we can read:
timestamp (Unix timestamp, rounded to 5 min)
Are you planning to do this rounding in a way that still lets us recover the actual click sequence within a session? I think this is important information.
Regards,
B.Twardowski
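The concern can be illustrated with a short sketch. It assumes the rounding is a floor to the start of each 5-minute bucket, which the dataset description does not specify; the timestamps are made-up values.

```python
BUCKET_SECONDS = 5 * 60


def round_down_to_bucket(ts, bucket=BUCKET_SECONDS):
    """Floor a Unix timestamp to the start of its bucket (assumed rounding mode)."""
    return ts - ts % bucket


# Three hypothetical clicks within the same five minutes of one session.
clicks = [1446336010, 1446336125, 1446336290]
rounded = [round_down_to_bucket(t) for t in clicks]
```

All three clicks end up with the same rounded timestamp, so their relative order can only be recovered if the rows keep their original ordering or carry an extra sequence field.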