Data breach classification is a part of BreachDirectory.
It relies on classifiers that examine data from past breaches and make strong assumptions about what is likely to happen next in the data breach world.
Simply put, the classifier learns from data: the more data it is given, the more accurate its calculations become.
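
The article does not describe how the classifier works internally. One simple reading that fits the tables below is a frequency count over breach records, so here is a minimal sketch of that idea. The `email:password` line format, the `tally` function name, and the sample records are all assumptions made for illustration, not BreachDirectory's actual implementation.

```python
from collections import Counter


def tally(lines):
    """Count TLDs and passwords in `email:password` lines (assumed format)."""
    tlds, passwords = Counter(), Counter()
    for line in lines:
        # Split on the first colon only, so passwords may contain colons.
        email, sep, password = line.rstrip("\n").partition(":")
        if not sep or "@" not in email:
            continue  # skip lines that do not look like email:password
        domain = email.rsplit("@", 1)[1].lower()
        tld = "." + domain.rsplit(".", 1)[-1]
        tlds[tld] += 1
        # An empty password is counted as its own entry (the empty string).
        passwords[password] += 1
    return tlds, passwords


# Made-up sample records, for illustration only.
sample = [
    "alice@example.com:123456",
    "bob@example.ru:qwerty",
    "carol@example.com:",
    "dave@example.de:123456",
]
tlds, passwords = tally(sample)
print(tlds.most_common())
print(passwords.most_common(1))
```

With more records, the counts settle toward the real distribution, which is one way to read "the more data, the more accurate the calculations."
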
The following table shows how often each email top-level domain (TLD) appears in the Exploit.in compilation and the probability of it being included in the next data breach:
# | Email Domain | Frequency | Purpose / Country | Probability of the email domain being included in the next data breach |
---|---|---|---|---|
1 | .com | 390,764,619 | Commercial / United States | 52.638% |
2 | .ru | 203,283,285 | Russia | 27.383% |
3 | .de | 47,206,838 | Germany | 6.359% |
4 | .fr | 26,092,163 | France | 3.515% |
5 | .net | 20,099,487 | Network Infrastructure | 2.708% |
6 | .it | 12,471,385 | Italy | 1.680% |
7 | .uk | 12,171,065 | United Kingdom | 1.640% |
8 | .pl | 6,574,020 | Poland | 0.885% |
9 | .cz | 4,367,424 | Czech Republic | 0.588% |
10 | .es | 2,402,534 | Spain | 0.324% |
11 | .ua | 2,230,253 | Ukraine | 0.300% |
12 | .in | 1,984,241 | India | 0.267% |
13 | .ca | 1,323,750 | Canada | 0.178% |
14 | .br | 1,026,632 | Brazil | 0.138% |
15 | .hu | 985,678 | Hungary | 0.133% |
16 | .nl | 919,529 | The Netherlands | 0.124% |
17 | .by | 721,069 | Belarus | 0.097% |
18 | .at | 657,410 | Austria | 0.090% |
19 | .mx | 594,053 | Mexico | 0.080% |
20 | .bg | 587,733 | Bulgaria | 0.079% |
21 | .sk | 531,944 | Slovakia | 0.072% |
22 | .be | 494,773 | Belgium | 0.067% |
23 | .ch | 393,937 | Switzerland | 0.053% |
24 | .jp | 387,647 | Japan | 0.052% |
25 | .gr | 373,066 | Greece | 0.050% |
26 | .pt | 350,828 | Portugal | 0.047% |
27 | .my | 338,588 | Malaysia | 0.046% |
28 | .lv | 314,852 | Latvia | 0.042% |
29 | .se | 294,974 | Sweden | 0.040% |
30 | .au | 256,217 | Australia | 0.035% |
31 | .dk | 231,435 | Denmark | 0.031% |
32 | .cn | 222,233 | China | 0.030% |
33 | .fm | 196,084 | Federated States of Micronesia (popular with radio stations) | 0.026% |
34 | .eu | 183,747 | European Union | 0.025% |
35 | .mil | 139,125 | Military | 0.019% |
36 | .za | 124,452 | South Africa | 0.017% |
37 | .nz | 123,817 | New Zealand | 0.017% |
38 | .no | 123,090 | Norway | 0.017% |
39 | .ie | 97,414 | Ireland | 0.013% |
40 | .co.id | 93,451 | Indonesia | 0.013% |
41 | .co | 91,313 | Colombia | 0.012% |
42 | .hr | 77,714 | Croatia | 0.010% |
43 | .ee | 77,239 | Estonia | 0.010% |
44 | .kr | 73,829 | South Korea | 0.010% |
45 | .lt | 72,849 | Lithuania | 0.010% |
46 | .ry | 53,592 | Unknown | 0.007% |
47 | .il | 52,782 | Israel | 0.007% |
48 | .th | 50,560 | Thailand | 0.007% |
49 | .cl | 46,701 | Chile | 0.006% |
50 | .edu | 29,023 | Education | 0.004% |
We can see that email addresses under TLDs associated with the United States, Russia, Germany and France have the highest chance of being included in the next data breach. If we combine the entries for the top five countries (those four plus Italy), we get 679,818,290 records, which make up 84.76% of the entire Exploit.in user base.
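
To make the arithmetic concrete, here is a minimal sketch of how counts like these can be turned into shares. The probability column appears to be each TLD's count divided by the sum of all 50 listed counts, but that is our inference from the numbers rather than something the article states. The `relative_frequencies` helper is a hypothetical name, and only six rows are included, so the shares it prints are normalized over those six rows and will not match the table exactly.

```python
def relative_frequencies(counts):
    """Return each key's share of the total count."""
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# Six rows copied from the table above; the other 44 are omitted for brevity.
counts = {
    ".com": 390_764_619,
    ".ru": 203_283_285,
    ".de": 47_206_838,
    ".fr": 26_092_163,
    ".net": 20_099_487,
    ".it": 12_471_385,
}

shares = relative_frequencies(counts)
for tld, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{tld}: {share:.3%}")

# Combined records for the five TLDs behind the total quoted above.
five = [".com", ".ru", ".de", ".fr", ".it"]
combined = sum(counts[tld] for tld in five)
print(combined)  # 679818290
```

Passing all 50 rows to the same function would give shares relative to the full table instead of this six-row subset.
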
# | Password | Frequency | Probability of the password being included in the next data breach |
---|---|---|---|
1 | (empty) | 9,394,973 | 16.789% |
2 | 123456 | 5,021,150 | 8.973% |
3 | 123456789 | 1,846,744 | 3.300% |
4 | qwerty | 1,348,258 | 2.409% |
5 | password | 1,013,304 | 1.811% |
6 | (empty) | 823,741 | 1.472% |
7 | 12345678 | 762,590 | 1.363% |
8 | abc123 | 761,558 | 1.361% |
9 | 111111 | 717,537 | 1.282% |
10 | password1 | 689,459 | 1.232% |
11 | 1234567 | 663,952 | 1.186% |
12 | 1234567890 | 635,681 | 1.136% |
13 | 123123 | 577,044 | 1.031% |
14 | 12345 | 571,052 | 1.020% |
15 | 000000 | 512,949 | 0.917% |
16 | 1q2w3e4r5t | 502,213 | 0.897% |
17 | iloveyou | 420,894 | 0.752% |
18 | qwertyuiop | 358,704 | 0.641% |
19 | 1234 | 333,871 | 0.597% |
20 | dragon | 300,340 | 0.537% |
21 | monkey | 298,395 | 0.533% |
22 | 123456a | 257,989 | 0.461% |
23 | 123321 | 255,627 | 0.457% |
24 | 1qaz2wsx | 244,652 | 0.437% |
25 | 654321 | 230,337 | 0.412% |
26 | 666666 | 229,491 | 0.410% |
27 | 123qwe | 227,036 | 0.406% |
28 | myspace1 | 211,332 | 0.378% |
29 | target123 | 205,930 | 0.368% |
30 | tinkle | 205,419 | 0.367% |
31 | 121212 | 205,296 | 0.367% |
32 | 1q2w3e4r | 203,926 | 0.364% |
33 | 7777777 | 203,185 | 0.363% |
34 | 1g2w3e4r | 201,371 | 0.360% |
35 | gwerty | 201,269 | 0.360% |
36 | zag12wsx | 201,062 | 0.359% |
37 | gwerty123 | 200,969 | 0.359% |
38 | qwe123 | 194,053 | 0.347% |
39 | zxcvbnm | 187,142 | 0.334% |
40 | qwerty123 | 175,965 | 0.314% |
41 | 1q2w3e | 172,074 | 0.307% |
42 | qazwsx | 170,280 | 0.304% |
43 | 123 | 169,770 | 0.303% |
44 | 222222 | 167,009 | 0.298% |
45 | 555555 | 166,135 | 0.297% |
46 | 123abc | 162,971 | 0.291% |
47 | asdfghjkl | 159,926 | 0.286% |
48 | 987654321 | 156,994 | 0.281% |
49 | a123456 | 152,732 | 0.273% |
50 | qwerty1 | 151,323 | 0.270% |
We can see that empty passwords have the highest chance of being included in the next data breach. Such passwords (rows 1 and 6 of the table) account for 18.261% of Exploit.in’s entire user base, which would be around 146,462,503 records. We could guess that these passwords got “lost in encoding” or contained some unknown characters.
Counting the first “empty” entry, the top 5 passwords that could appear in the next data breach cover 18,624,429 records, or around 2.32% of the entire Exploit.in user base.
The Exploit.in data breach compilation is one of the largest data breach compilations ever: it combines records from many breaches of information systems. The classifier shows that users from the western part of Europe have the highest chance of having their data stolen in upcoming data breaches. It also shows that users with passwords like “” (though this may be an encoding problem) and “123456” have a fairly high chance of their identities being stolen in upcoming data breaches.