Deprecated labels matching
by ryazanoff - Monday, May 8, 2017, 13:13:01


I tried to enlarge the training set by adding the deprecated test data. I assumed that I need to concatenate chunks 5, 6, and 7 in this order and fill the decision column with the "deprecated test labels".
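The matching step described above can be sketched roughly as follows. This is only an illustration under assumptions: the rows and labels are toy in-memory lists, and `attach_labels` is a hypothetical helper, not part of any competition tooling. It pairs labels with rows purely by position and fails loudly if the counts differ.

```python
from itertools import chain

def attach_labels(chunks, labels):
    """Concatenate chunks in the given order and pair each row with a label by position."""
    rows = list(chain.from_iterable(chunks))
    if len(rows) != len(labels):
        # A length mismatch means the positional matching cannot be right.
        raise ValueError(f"{len(rows)} rows but {len(labels)} labels")
    return list(zip(rows, labels))

# Toy stand-ins for chunks 5, 6 and 7 (hypothetical row ids).
chunk5 = ["r5_0", "r5_1"]
chunk6 = ["r6_0"]
chunk7 = ["r7_0", "r7_1"]
labels = [1, 0, 0, 1, 1]

labelled = attach_labels([chunk5, chunk6, chunk7], labels)
print(labelled)
```

A matching row count does not prove the order is right, but a mismatch proves it is wrong, so the check is cheap to keep.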

Then I ran cross-validation separately on the old training set and on the new (deprecated) set.

In the first case I got ~0.79 ROC AUC; in the second, ~0.59, which to me means the labels are completely random.

So my question is: in which order should one match the deprecated labels? The best solution would be a file with gamestate_id - decision pairs for the deprecated data.


Thanks in advance!

Re: Deprecated labels matching
by janek - Tuesday, May 09, 2017, 13:42:17

Dear Vasily,

the way in which you tried to add the labels is correct, so it is difficult for me to say what went wrong. Perhaps there is a bug in your code?


Re: Deprecated labels matching
by ryazanoff - Tuesday, May 09, 2017, 13:55:38

Thank you for the answer!

I doubt it: the only thing I changed during this experiment is the data on which I perform CV (from test to deprecated). The model and its parameters remained the same.

A value of ~0.59 ROC AUC means the labels are completely random (for example, I tried loading the deprecated chunks in the order 6, 5, 7 and got the same score, ~0.59).
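One way to test the "random labels" hypothesis directly is to compare the score against deliberately shuffled labels. The sketch below is an assumption-laden toy, not the competition pipeline: the data are synthetic, and `roc_auc` is a hand-written rank-based (Mann-Whitney) estimate that assumes no tied scores.

```python
import random

def roc_auc(labels, scores):
    """Rank-based (Mann-Whitney) ROC AUC; assumes binary 0/1 labels and no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sum of the 1-based ranks of the positive examples.
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = random.Random(0)
labels = [i % 2 for i in range(4000)]                # balanced synthetic labels
scores = [y + rng.gauss(0.0, 1.0) for y in labels]   # scores that carry real signal

shuffled = labels[:]
rng.shuffle(shuffled)                                 # destroys the row-label matching

auc_true = roc_auc(labels, scores)
auc_shuffled = roc_auc(shuffled, scores)
print(f"matched labels:  {auc_true:.3f}")
print(f"shuffled labels: {auc_shuffled:.3f}")
```

If the score on the real deprecated labels sits clearly above what shuffled labels give, the labels are carrying signal and a wrong matching is unlikely to be the whole explanation.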

So I think that the labeling of the deprecated data isn't correct. I may be mistaken, though, so I'd like to hear opinions from other participants who have tried to use the deprecated data.


Thank you for your answer!

Re: Deprecated labels matching
by hieuvq - Wednesday, May 10, 2017, 04:12:39

Hi Vasily,

Before the new test set was released, I ran an evaluation on the deprecated test set and got a score similar to the one I had on the previous public leaderboard. This means that the released test set labels are correct. I don't see anything wrong in your processing steps. As Andrzej suggested, you may want to double-check the code.



Re: Deprecated labels matching
by janek - Wednesday, May 10, 2017, 13:30:42

Dear Vasily,

the labeling for the deprecated test data set is correct.

The score that you received (~0.59 AUC) is far from a random result. For a data set with 1,250,000 records and a balanced distribution of labels, the probability of getting an AUC that high purely by chance is extremely low. In fact, for such a set, the probability of getting an AUC higher than 0.501 by assigning random scores is lower than 5%. Thus, I would recommend double-checking your code :-) It is also possible that such a large difference in results is related to some properties of the model that you are using...
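The 5% figure can be checked with a short calculation. The sketch below is an approximation under stated assumptions: exactly 625,000 records per class, continuous random scores with no ties, and the usual normal approximation to the Mann-Whitney statistic, under which a random AUC has mean 0.5 and variance (n_pos + n_neg + 1) / (12 * n_pos * n_neg). The helper name `p_auc_above` is made up for this illustration.

```python
import math

def p_auc_above(threshold, n_pos, n_neg):
    """One-sided probability that random scores give an AUC above `threshold`
    (normal approximation to the Mann-Whitney statistic)."""
    sd = math.sqrt((n_pos + n_neg + 1) / (12 * n_pos * n_neg))
    z = (threshold - 0.5) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

n_pos = n_neg = 625_000   # balanced split of 1,250,000 records (assumed exact)

p_501 = p_auc_above(0.501, n_pos, n_neg)
p_590 = p_auc_above(0.59, n_pos, n_neg)
print(f"P(AUC > 0.501) ~ {p_501:.4f}")
print(f"P(AUC > 0.59)  ~ {p_590:.3g}")
```

Under these assumptions the first probability comes out below 5%, and the second is so small that it underflows to zero in double precision, which is why 0.59 on a set of this size cannot be explained by random labels alone.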