metric code
by Karo56 - Monday, March 11, 2024, 21:41:10

Good morning,

I have a feeling that the results on the leaderboard are behaving rather strangely. Would it be possible to publish the code that calculates the leaderboard metric (as is done in other data science competitions)? It's not a complicated metric and it's easy to implement, but it's still not a classic function you'll find in sklearn (like AUC).

I suspect that some subtleties of the implementation may affect the final value of the metric. That's probably not the case, but a look at the implementation would still dispel those doubts.


Second question: how large is the data sample on which the current leaderboard metric is calculated?

Thanks and have a nice day!

RE: metric code
by andrzej - Tuesday, March 12, 2024, 12:57:05

Hello,

The R function used for the evaluation is:

tot_cost <- function(preds, gt, costs = cost_matrix){
  # Confusion table: rows are predictions, columns are ground-truth labels
  conf_tab <- table(factor(preds, levels = c("-1", "0", "1")), gt)
  # Total misclassification cost, averaged over the test examples
  sum(conf_tab * costs) / length(gt)
}
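
For illustration, here is a minimal usage sketch; the cost_matrix below is a hypothetical placeholder (the actual matrix is defined by the organizers), with rows indexed by predictions and columns by ground-truth labels:

# Hypothetical cost matrix for illustration only -- the real competition
# matrix may differ. Rows: predicted -1/0/1, columns: true -1/0/1.
cost_matrix <- matrix(c(0, 1, 2,
                        1, 0, 1,
                        2, 1, 0),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(c("-1", "0", "1"), c("-1", "0", "1")))

preds <- c(-1, 0, 1, 1, 0)
gt <- factor(c(-1, 0, 0, 1, -1), levels = c("-1", "0", "1"))
tot_cost(preds, gt)  # average per-example cost, here 0.4

Note that gt should be a factor over all three levels, otherwise the confusion table may not be 3x3 and the element-wise multiplication with the cost matrix would fail.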

The leaderboard score is calculated on only 10% of the test data, so you should expect some instability in the leaderboard evaluation results.

Best of luck,
Andrzej Janusz

RE: metric code
by dymitrruta - Wednesday, March 13, 2024, 20:25:33

If one wants to compute it lightning fast, here is a much simpler way: cost <- mean(abs(preds - gt)) ;)
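
This shortcut implicitly assumes that the cost of each error equals the absolute difference between the predicted and true labels, and that preds and gt are numeric vectors over {-1, 0, 1}. A quick sanity check under that assumption (reusing the hypothetical cost_matrix from the sketch above):

# Sanity check assuming cost_matrix[i, j] = |label_i - label_j|,
# as in the hypothetical matrix above; gt_num is the numeric ground truth.
preds <- sample(c(-1, 0, 1), 1000, replace = TRUE)
gt_num <- sample(c(-1, 0, 1), 1000, replace = TRUE)
gt <- factor(gt_num, levels = c("-1", "0", "1"))
all.equal(tot_cost(preds, gt), mean(abs(preds - gt_num)))  # TRUE

If the actual competition cost matrix is not of this absolute-difference form, the shortcut and tot_cost will of course diverge.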