
Recruitment Challenge @ QED Software

This is an internal data mining challenge at QED Software, aimed at checking the ML skills of new employees.

Overview

At QED Software, we use the KnowledgePit platform to challenge members of the data science community to solve real business problems and to test their skills, knowledge, and, most importantly, creative thinking. This challenge is not a competition but, first and foremost, a place for self-evaluation. If you have been invited to take part in it, you are likely a motivated data scientist. But what can you do with what we have prepared? Enter the challenge and check your skills!

Want to know more about what is ahead of you? The task is to detect truly suspicious events and false alarms within network traffic alert data that the Security Operations Center (SOC) Team members have to analyse daily. An efficient classification model should help the SOC Team to optimise their operations significantly. Technical details are in the Task Description section.

We wish you good luck and satisfaction with your solution(s)!

This challenge is based on a data mining competition (Suspicious Network Event Recognition) organized in association with the IEEE BigData 2019 conference. Check the original competition here.


In this challenge, the task is to detect truly suspicious events and false alarms within the set of so-called network traffic alerts that the Security Operations Center (SOC) Team members have to analyze daily. An efficient classification model could help the SOC Team to optimize their operations significantly. This data set comes from IEEE BigData 2019 Cup: Suspicious Network Event Recognition challenge.

The data set available in the challenge consists of alerts investigated by a SOC team at the Security on Demand company (SoD). We call such alerts 'investigated'. Each record is described by various statistics selected based on experts' knowledge and a hierarchy of associated IP addresses (anonymized), called assets. For each alert in the 'investigated alerts' data tables, there is a history of related log events (a detailed set of network operations acquired by SoD, anonymized to ensure the safety of SoD clients).

The data sets cover half a year, from October 1, 2018, to March 31, 2019. You can find the description of columns from the 'investigated alerts' data in a separate file called column_descriptions.txt. We divided the main data into a training set and a test set based on alert timestamps. The training set (the file cybersecurity_training.csv) covers approximately four months, and the remaining part constitutes the test set (the file cybersecurity_test.csv). The format of the two files is the same: columns are separated by the vertical bar character '|'. However, the target column called 'notified' is missing in the test data.
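As a minimal sketch of reading the '|'-separated layout described above, the following uses only the Python standard library. The helper names (read_alerts, split_features_target), the columns alert_id and severity, and the 0/1 encoding of 'notified' are assumptions made for illustration; the real column names are listed in column_descriptions.txt.

```python
import csv
import io


def read_alerts(source):
    """Read a '|'-separated alerts table into a list of dicts (one per row)."""
    reader = csv.DictReader(source, delimiter="|")
    return list(reader)


def split_features_target(rows, target="notified"):
    """Separate the target column from the features.

    Rows without the target column (as in the test file) yield no labels.
    Assumes the target is encoded as 0/1, which the source does not state.
    """
    features, labels = [], []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are not modified
        if target in row:
            labels.append(int(row.pop(target)))
        features.append(row)
    return features, labels


# Tiny synthetic stand-in for cybersecurity_training.csv (hypothetical columns).
sample_train = io.StringIO("alert_id|severity|notified\nA1|3|0\nA2|5|1\n")
train_rows = read_alerts(sample_train)
X_train, y_train = split_features_target(train_rows)

# Synthetic stand-in for cybersecurity_test.csv: same layout, no 'notified'.
sample_test = io.StringIO("alert_id|severity\nA3|2\n")
X_test, y_test = split_features_target(read_alerts(sample_test))
```

For the real files, the same function can be given an open file object, for example open("cybersecurity_training.csv", newline=""). With pandas installed, pandas.read_csv(path, sep="|") reads the same layout into a DataFrame.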

The task and the format of submissions: the job is to predict which of the investigated alerts were considered truly suspicious by the SOC team and led to issuing a notification to SoD's clients. In the training data, this information is indicated by the column 'notified'. A submission should take the form of scores assigned to every record from the test data, with each score on a separate line of a text file. You can find an example of a correctly formatted submission file in the Data files section.
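The submission layout described above (one score per line, in the order of the test records) could be produced along these lines. This is a sketch only: the function name write_submission, the file name submission.txt, and the six-decimal formatting are assumptions, so the example submission file in the Data files section remains the reference for the exact format.

```python
import os
import tempfile


def write_submission(scores, path):
    """Write one score per line, keeping the order of the test records."""
    with open(path, "w", encoding="utf-8") as f:
        for score in scores:
            # Six decimals is an arbitrary choice for this sketch.
            f.write(f"{float(score):.6f}\n")


# Hypothetical scores for three test records, written to a temporary folder.
example_scores = [0.12, 0.87, 0.5]
submission_path = os.path.join(tempfile.mkdtemp(), "submission.txt")
write_submission(example_scores, submission_path)
```

Since evaluation uses AUC, only the ordering of the scores matters, so any monotone scale (probabilities, raw model margins) should rank the records the same way.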

Evaluation: we evaluate the quality of submissions using the AUC measure. The assessment is automatic. The preliminary results are published on the public leaderboard.
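For local validation before submitting, AUC can be computed on a held-out part of the training data. AUC equals the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below implements that rank-based definition in plain Python; the function name auc_score is an assumption, and the platform's own implementation is not described in the source.

```python
def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U statistic) with average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of records sharing the same score.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    positive_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (positive_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical validation labels ('notified') and model scores.
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because the training and test sets are split by alert timestamp, holding out the most recent training alerts for validation is likely to mirror the leaderboard setting more closely than a random split.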

___________

You may find it useful to explore the publications linked to the original competition:

A. Janusz, D. Kałuża, A. Chądzyńska-Krasowska, B. Konarski, J. Holland, D. Ślęzak: IEEE BigData 2019 Cup: Suspicious Network Event Recognition. BigData 2019.

Q. H. Vu, D. Ruta, and L. Cen, “Gradient Boosting Decision Trees for Cyber Security Threats Detection Based on Network Events Logs,” in 2019 IEEE International Conference on Big Data, BigData 2019, Los Angeles, CA, USA, December 9-12, 2019.

C. Dongy, Y. Chen, Y. Zhang, B. Jiang, S. Liu, D. Han, and B. Liu, “An Approach For Scale Suspicious Network Events Detection,” in 2019 IEEE International Conference on Big Data, BigData 2019, Los Angeles, CA, USA, December 9-12, 2019.

T. Wang, C. Zhang, Z. Lu, D. Du, and Y. Han, “Identifying Truly Suspicious Events and False Alarms Based on Alert Graph,” in 2019 IEEE International Conference on Big Data, BigData 2019, Los Angeles, CA, USA, December 9-12, 2019.
 

Rank  Team Name         Score   Submission Date
1     mishka            0.9173  2022-08-11 01:25:15
2     QchallengED       0.9136  2021-11-01 11:33:05
3     haribo            0.9093  2022-03-19 12:01:49
4     neil              0.9032  2022-04-11 16:23:58
5     antag             0.9028  2021-11-11 14:55:26
6     lucy789           0.9027  2022-02-07 14:39:14
7     Work4QEDSoftware  0.9023  2021-10-24 16:12:43
8     ayyoletsgo        0.9015  2022-05-31 23:38:14
9     DeepTeam          0.8988  2022-05-18 17:36:52
10    random123         0.8987  2022-03-31 21:44:14
11    qualify123        0.8947  2022-03-18 10:26:51
12    startoni          0.8940  2021-06-20 23:32:37
13    jentetil          0.8849  2022-08-10 11:58:59
14    n3o2k7i8ch5       0.8833  2021-11-20 15:22:10
15    RD                0.8796  2022-01-29 11:46:36
16    monicatraveler    0.8652  2022-08-10 00:53:33
17    Kolumbryna_23     0.8590  2021-05-20 22:49:51
18    cenic             0.8578  2021-06-20 14:17:59
19    BalboaXx          0.8569  2021-08-09 14:21:27
20    Llynd             0.8453  2022-04-12 22:06:41
21    Lipski            0.8197  2021-06-20 17:35:04
22    PeterM            0.7733  2022-08-08 20:29:06
23    QED_recruitment   0.7717  2021-12-12 22:53:41
24    Pierre            0.7696  2022-08-07 13:22:22
25    Rainbow           0.7591  2022-04-16 13:21:56
26    Vrael             0.7112  2021-07-07 22:11:31
27    Przemysław Potok  0.6899  2022-01-14 16:50:32
28    niekoniecznie     0.6829  2022-04-12 12:50:34
29    SuperLog          0.6418  2022-08-10 14:41:06
30    the_coffee_team   0.6093  2021-11-19 16:23:07
31    Mafii             0.5953  2021-11-07 23:25:34
32    Sam               0.5596  2021-07-07 22:24:31