
2016's People

Contributors

discoverychallenge, fabianabel

2016's Issues

Problem about the rule & submission

Hi, we found a conflict. The page https://recsys.xing.com/rules says "the submission limits: you can upload at maximum 5 solutions per day".
The top of the submission page https://recsys.xing.com/submission also says "New Submission 5 submissions still possible for you today". However, the bottom of the same page shows "Additional remarks: No.4 Notice that you can only submit 3 solutions per calendar day".

So how many solutions can I actually submit per calendar day?
And which time zone applies: GMT or another?

How can we evaluate rankings offline?

Hi,
I split the interaction data into training and test sets by holding out the last week's interactions.
But how do I determine the relevant items of each user in the test data, so that I can evaluate my algorithm offline with the method described in EvaluationMeasure.md?
Can anyone help? Thanks.
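
One way to set this up is sketched below. It is a minimal sketch, not the organizers' procedure: it assumes interactions are available as (user_id, item_id, interaction_type, created_at) tuples with Unix timestamps, and it treats interaction types 1, 2 and 3 as relevant, as stated elsewhere in these issues. The function names are hypothetical.

```python
from collections import defaultdict

WEEK_SECONDS = 7 * 24 * 60 * 60


def split_last_week(interactions):
    """Split rows into (train, test), holding out the final 7 days.

    Each row is assumed to be (user_id, item_id, interaction_type, created_at).
    """
    interactions = list(interactions)
    cutoff = max(ts for _, _, _, ts in interactions) - WEEK_SECONDS
    train = [row for row in interactions if row[3] <= cutoff]
    test = [row for row in interactions if row[3] > cutoff]
    return train, test


def relevant_items_by_user(test_rows, positive_types=(1, 2, 3)):
    """Map each user to the set of items they positively interacted with."""
    relevant = defaultdict(set)
    for user_id, item_id, interaction_type, _ in test_rows:
        if interaction_type in positive_types:
            relevant[user_id].add(item_id)
    return dict(relevant)


# Toy data; type 4 stands in for any interaction type outside 1-3.
rows = [
    (1, 10, 1, 0),
    (1, 11, 1, 700_000),
    (2, 12, 4, 700_000),
    (2, 13, 2, 650_000),
]
train, test = split_last_week(rows)
print(relevant_items_by_user(test))
```

The resulting per-user sets can then play the role of relevantItems in the measures from EvaluationMeasure.md, with the model trained on the train rows only.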

Something about submission time

Hi friend, we submitted a result at 23:50 (Hawaii time), but its score appeared at 00:01, as can be seen in the submission records.
Unfortunately, this caused our result to appear in the unofficial leaderboard rather than the official one.
Will this result be considered for the official leaderboard?
We look forward to your reply. Thank you very much.
This is team "Cine" again, and the submission label is "final1".
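
Which calendar day a submission falls on depends on the time zone used to read its timestamp. The sketch below illustrates this with a hypothetical date; the time zone the submission system actually uses is not stated in this thread. Hawaii is modeled as a fixed UTC-10 offset, since Hawaii does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hawaii Standard Time as a fixed offset (no daylight saving time).
HAWAII = timezone(timedelta(hours=-10), "HST")


def calendar_day(moment, tz):
    """Return the calendar date of an aware datetime as seen in time zone tz."""
    return moment.astimezone(tz).date()


# Hypothetical submission time: 23:50 local time in Hawaii.
submitted = datetime(2016, 6, 26, 23, 50, tzinfo=HAWAII)

print(calendar_day(submitted, HAWAII))        # 2016-06-26
print(calendar_day(submitted, timezone.utc))  # 2016-06-27
```

A deadline stated as a calendar day is therefore only unambiguous once its reference time zone is known.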

Problem about Evaluation Measure

function userSuccess(recommendedItems, relevantItems) = {
  if (intersect(recommendedItems, relevantItems).size > 0)
    1.0
  else
    0.0
}
should be changed to the following, so that only the top 30 recommended items count:
function userSuccess(recommendedItems, relevantItems) = {
  if (intersect(recommendedItems.take(30), relevantItems).size > 0)
    1.0
  else
    0.0
}
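
For local experiments, the proposed variant can be written in Python roughly as follows. This is a sketch of the change suggested in this issue, not a copy of the official evaluator; the function name and the k parameter are hypothetical.

```python
def user_success(recommended_items, relevant_items, k=30):
    """Return 1.0 if any of the first k recommendations is relevant, else 0.0."""
    top_k = recommended_items[:k]
    return 1.0 if set(top_k) & set(relevant_items) else 0.0


ranking = list(range(100))               # items 0..99 in ranked order
print(user_success(ranking, {29}))       # 1.0: item 29 is at position 30
print(user_success(ranking, {50}))       # 0.0: item 50 is outside the top 30
```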

Private leaderboard evaluation

I'd like to have some clarifications about the private leaderboard evaluation.
Is it the best score reached ever, or the corresponding value of the public leaderboard score?

I'll provide an example to be more clear

submission_x has:
public_leaderboard: 100
private_leaderboard: 90

while submission_y has:
public_leaderboard: 95
private_leaderboard: 95

Clearly, in the public leaderboard my team would score 100 (submission_x), but what about the private one?
Will it be 90 or 95?
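
The two readings of the question can be made concrete as below. This is only an illustration of the alternatives; which rule the organizers apply is not stated in this issue, and the rule names are hypothetical.

```python
def private_score(submissions, rule):
    """Pick a team's private score under one of two candidate rules."""
    if rule == "best_private":
        # Best private score over all submissions.
        return max(s["private"] for s in submissions)
    if rule == "follow_best_public":
        # Private score of the submission that is best on the public board.
        return max(submissions, key=lambda s: s["public"])["private"]
    raise ValueError(f"unknown rule: {rule}")


submissions = [
    {"name": "submission_x", "public": 100, "private": 90},
    {"name": "submission_y", "public": 95, "private": 95},
]
print(private_score(submissions, "best_private"))        # 95
print(private_score(submissions, "follow_best_public"))  # 90
```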

Question on metric: precision at k. What if len(recommendation) < k?

In the document, the pseudocode returns

topK = recommendedItems.take(k)
return intersect(topK, relevantItems).size / k

My question is, what if L = len(recommendedItems) < k?
Does the metric return intersect(topK, relevantItems).size / min(k, L)?

In other words, does the evaluation encourage submitting fewer but more accurate items, or submitting more items in pursuit of good recall?

Thanks!

Kuan
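
The difference between the two denominators can be checked with a small sketch. The quoted pseudocode divides by k; the min(k, L) variant is the asker's hypothetical alternative, exposed here through an assumed shrink_denominator flag.

```python
def precision_at_k(recommended_items, relevant_items, k, shrink_denominator=False):
    """Precision at k with either a fixed or a length-capped denominator."""
    top_k = recommended_items[:k]
    hits = len(set(top_k) & set(relevant_items))
    # Fixed denominator k (as in the quoted pseudocode) or min(k, L).
    denominator = min(k, len(top_k)) if shrink_denominator else k
    return hits / denominator if denominator else 0.0


short_list = [1, 2]        # L = 2 recommendations, both relevant
relevant = {1, 2, 3}
print(precision_at_k(short_list, relevant, k=4))                           # 0.5
print(precision_at_k(short_list, relevant, k=4, shrink_denominator=True))  # 1.0
```

With the fixed denominator, a short list can never reach precision 1.0, so leaving slots empty is penalized; with min(k, L), a short but accurate list scores as well as a full one.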

Multiple Teammates

Hi,
is it possible for a team to have multiple members in the submission system, each able to download the data and/or submit results, without having to share the same XING account?

edu_degree values not conforming to description

The description says that edu_degree values range from 0 to 3, but analyzing the dataset we found that the actual range is 0-6. Is it an error in the description? What's the actual meaning of those values?

about "creat_at"

What time span do the "creat_at" values in the interactions.csv file cover? A year?

"interact with in the next week"

Hi data people,

I don't understand what "next week" means here:

"the recommender should predict those job postings (items) that the user will interact with in the next week."

Thanks :)

Evaluation system

How long will the evaluation system be available? I know it no longer counts for the challenge, but we would like to keep working on the problem for fun.

NULL in career level

I would like to know whether the value NULL in career level means 0 (unknown). I have found several NULL values, at least in users.csv.

Thanks

Submission systems bug

There is a bug in the submission system: if the label is too long, the system returns a generic error saying that something went wrong.

I simply shortened the label text and the same submission was accepted.

Please fix this, or set a character limit on the text field to prevent it :) .

Workshop proceedings

We will post the link to the proceedings here and close this issue as soon as the workshop proceedings are published by ACM.

Invalid user_ids

There are user_ids in impressions that don't appear in users.

For example, the first user_id in impressions is 1842650, which does not appear in users.
Is this intentional? According to a script I wrote, around 33% of user_ids in impressions are invalid.
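
A check of this kind can be sketched as follows. It assumes the two id columns have already been read into Python sequences; the function name is hypothetical. Ids are normalized to stripped strings first, because comparing integers from one file with strings from the other would make every id look unknown.

```python
def unknown_id_share(impression_user_ids, known_user_ids):
    """Share of distinct impression user_ids that are missing from users."""
    # Normalize both sides so that 7, "7" and " 7" compare equal.
    seen = {str(u).strip() for u in impression_user_ids}
    known = {str(u).strip() for u in known_user_ids}
    if not seen:
        return 0.0
    return len(seen - known) / len(seen)


# Toy data: one of three distinct impression ids is not among the users.
share = unknown_id_share(["1842650", "7", " 8", "7"], [7, 8])
print(f"{share:.0%} of distinct user_ids in impressions are not in users")
```

Note that this counts distinct ids; a share computed over impression rows instead can differ when some users have many more impressions than others.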

How to measure the relevance between user and item in the next week?

Relevant items are those items on which a user clicked, bookmarked or replied (interaction_type = 1, 2 or 3).

I want to build a local evaluation, but I have a question: how should relevance be measured? Just by the value of interaction_type, or also by the interaction's creation time, or by the number of times the user interacted with the item?

Issue about participating in the workshop

Dear organizer:
Our team is eager to participate in the workshop in Boston, and will certainly do so if our accompanying paper is accepted.
Could you therefore send us a more formal invitation letter if the paper is accepted? To be honest, it is really hard to apply for a visa on the basis of the letter you sent on July 14th alone.
Our last question: how many participants from a single team may register for the workshop? Could all of the authors attend the workshop in September?

Thanks.

problems with data?

Hi all,

we found some problems with the training data:

  1. The files are not in a real CSV format, are they? We are not aware of CSV formats that support fields which are lists.

  2. The number of values in the users.csv rows varies. Sometimes 12 values are present in a row and sometimes only 11 (counting a list as a single value, of course). How should this be interpreted? Which field value is actually missing (perhaps the last one)?

  3. There are many duplicated users in the file users.csv. In particular, there are only 1367057 unique users out of 1500000!

Thanks for your answer!
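
Points 2 and 3 can be verified with a short profiling pass like the one below. It is a sketch under stated assumptions: rows are taken to be tab-delimited with the user id in the first column, list-valued fields are taken to be comma-separated inside a single field, and any header line is assumed to have been skipped. The function name is hypothetical.

```python
from collections import Counter


def profile_rows(lines, delimiter="\t", id_column=0):
    """Return (field-count histogram, total rows, number of distinct ids)."""
    field_counts = Counter()
    ids = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        field_counts[len(fields)] += 1
        ids.append(fields[id_column])
    return field_counts, len(ids), len(set(ids))


# Toy rows: the third row repeats user 1 and lacks its last field.
sample = ["1\t1,2,3\t4\n", "2\t\t5\n", "1\t1,2,3\n"]
field_counts, total, unique = profile_rows(sample)
print(dict(field_counts), total, unique)
```

For the real file, the lines of an opened users.csv can be passed in directly, so the file is streamed rather than loaded at once.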

We have created a team

Excuse me, we have created a team named "Cine".
We hope it can be approved at your convenience.

About the rule "Your algorithm should not attempt to identify artificial users, or reconstruct flipped values."

I would like to learn some details about this rule. We accept that we must not try to reconstruct your flipped data, but we would like to make some assumptions and fill in some of the unknown values. Would this violate the rules, or would it count as simply using different default values for different users?

Example: let's say the experience-in-years value is 5, but the careerlevel is null. I would like to assume a careerlevel of 3 or 4.

Thanks

Question on the order of the test set

Since the score depends on the order of items in the test set, can you tell us how that order is determined? Without knowing that, I think the offline evaluation is ugly. Thanks.

Are there software license restrictions?

Hello, I would like to clarify whether there are restrictions on the software used to develop a solution, for example only software under an Open Source Initiative-approved license, or only software licensed for non-commercial use?

CNAME

create a file called CNAME (in gh-pages) containing the line:
2016.recsyschallenge.com

Dataset undownloadable

Hi,

Is there something wrong with the server holding the data set? The download speed is very slow (3~5 kB/s), and the 1 GB training data set is almost undownloadable (the download always stops after several minutes). Moreover, since downloading the dataset requires signing in, I cannot use a download tool to work around the problem. I am in Canada, using the university's internet connection, and have never had a similar problem before. Can you look into it? Thank you!

Could we get one of the previous results submitted?

Hi friends, we have submitted several results so far.
Unfortunately, we overwrote some of them by mistake, including the one with our best score ever. Regenerating that result in order to improve on it would take too much time.
Is there any way to retrieve previously submitted results from your server?
Our team is "Cine" and the label of the result we want is "lda cosine similarity".
Thanks all the same if it's impossible.

Question about rounding and click sequence in dataset

Hi,

in dataset description of the interactions we can read:

timestamp (Unix timestamp, rounded to 5 min)

Are you planning to do this rounding in a way that still lets us recover the actual click sequence within a session? I think that is important information.

Regards,
B.Twardowski
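
The concern can be illustrated with a small sketch. It assumes that "rounded to 5 min" means rounding down to a 300-second bucket, which the description does not specify; the function names are hypothetical. Events that land in the same bucket get identical timestamps, so their relative order can no longer be recovered.

```python
from collections import Counter

BUCKET_SECONDS = 300  # 5 minutes


def round_down(unix_ts, bucket=BUCKET_SECONDS):
    """Round a Unix timestamp down to the start of its bucket (an assumption)."""
    return unix_ts - unix_ts % bucket


def ambiguous_events(timestamps):
    """Count events that share a rounded timestamp with at least one other event."""
    buckets = Counter(round_down(ts) for ts in timestamps)
    return sum(n for n in buckets.values() if n > 1)


clicks = [1000, 1100, 1300]              # three clicks in one session
print([round_down(ts) for ts in clicks])  # [900, 900, 1200]
print(ambiguous_events(clicks))           # 2: the first two clicks now tie
```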
