Lets face it, when it comes to passwords humans are predictable. Time and time again we see this when a site gets compromised. After the hashes have been sorted the top password are nearly always the same. It’s usually some variation of 1-9, password, iloveyou, princess, abc123, qwerty, 111111, secret, etc. Many of these passwords are unoriginal, and people use these same passwords across all of their accounts. Then their are the individuals that think they are being unique, and will change one of these common passwords to something slightly different for example [password –> Password123]. A dictionary tool with a rule engine can easily find these types of passwords. Up until now all these dictionary based/ rule engine tools have mostly CPU based (hashcat & JTR for example). Using a dictionary with 14 million passwords and a rule file with only 3000 rules could take as long as 15 – 20 minutes with a CPU based tool. That same 14 million password wordlist can take about 2 hours using my custom 23k rule set(preloaded in the /rules folder). This is where oclHashcat+/cudaHashcat+ come in.
oclHashcat+/cudaHashcat+ which I will call “oclHashcat+” for the sake of simplicity in this article is a completely new tool, separate from regular oclHashcat. oclHashcat+ is a dictionary based tool with a full featured rule engine that all run completely on the GPU. In our above example a 23k rule file combined with a 14 million word dictionary would take about 2 hours running on a modern CPU at about 50M password attempts /sec. With oclHashcat+ this same same rule set and dictionary file takes as little as 3.5 minutes using a dual GTX 480 system running at about 2B password attempts /sec. That’s right folks 50M for the CPU vs 2000M on the GPU. But that’s not all, oclHashcat+ offers us a couple of new features which I’ll be going over here.
The first thing to point out is that there are now four separate .bin/.exe files. This is to cover the various different system combinations that people may be running. The switches and syntax for all of these different versions are the same, however depending on your GPU, OS and OS architecture your running will depend on which one you’ll be using. For example someone with Ubuntu 64 running a Nvidia card would use cudaHashcat+64.bin while someone running a Windows 32 bit system with an ATI card would run oclHashcat+32.exe. I find the easiest way to get rid of the clutter in your folder is to delete those extra .bin/.exe files. Since I’m running Ubuntu with nvidia cards I delete all the exe files, and ocl prefixed bin files. Just in case your complete baffled by this here is a simple cross reference table:
|Windows system||Linux system|
|ATI w/ 32bit OS||oclHashcat+32.exe||oclHashcat+32.bin|
|ATI w/ 64bit OS||oclHashcat+64.exe||oclHashcat+64.bin|
|nVidia w/ 32bit OS||cudaHashcat+32.exe||cudaHashcat+32.bin|
|nVidia w/ 64bit OS||cudaHashcat+64.exe||cudaHashcat+64.bin|
The only exception being that 64bit OS’s can run the 32bit versions (oh though I don’t know why you’d want to). The reason for the forking of the project into separate CUDA and OCL variations comes from a programming standpoint, certain optimizations can only be done using vendor specific code. This gives us a nice performance increase because the code is better optimized to support that architecture.
oclHashcat+ uses a lot of similar switches and syntax as the other Hashcat tools. The attack mode numbers such as 0 for MD5, 1000 for NTLM, 1100 for DCC are all the same. If you run the –help switch it will list all the various switches.
- One thing you may notice is that a few more attack modes are supported now that have only been supported on the CPU version. The attack modes -m 400 covers MD5(wordpress and phpBB3) and -m 500 covers MD5(unix). These algorithms are generally difficult hash types to attack due to the high iteration counts that a plaintext goes through while getting hashed. High iteration algorithms may reduce the overall attacking speed of a hash but it doesn’t do anything to protect those passwords that users always tend to use, even if those passwords are only slightly different from the norm.
- Support for standard in – For those that have already built special scripts or have various other ways to mangle or generate wordlist this is a way to continue to use those. In addition since oclHashcat+ doesn’t support bruteforce style attacks directly you can use the maskprocessor from hashcat-utilities to generate a bruteforce character set and pipe it to stdin.
- GPU based rule engine – oclHashcat+ supports nearly all of the same rules that hashcat supports except for the @ modifier. That means any of the rule list you may have built before to use in Hashcat will also work. You can also generate random rules using the -g switch. -g 1000 would generate 1000 random rules.
Usage & Conclusion:
Here is an example usage and output:
./cudaHashcat+64.bin -r rules/d3ad0ne_23.8K.rule -n160 -o found.out example0.hash /root/dict/rockyou.txt
Platform: NVidia compatible platform found
Device #1: GeForce GTX 480, 1535MB, 1520Mhz, 15MCU
Device #2: GeForce GTX 480, 1535MB, 1520Mhz, 15MCU
Device #1: Kernel ./kernels/4318/m0000.sm_20.64.cubin
Device #2: Kernel ./kernels/4318/m0000.sm_20.64.cubin
Starting attack in wordlist_mode...
p: 553/6494, cs: 1, cr: 1920, cl: 8, rt: 1546.73ms, s:1627030.75k/s
p: 657/6494, cs: 1, cr: 3968, cl: 8, rt: 1200.40ms, s:2096458.25k/s
p: 708/6494, cs: 1, cr: 6016, cl: 8, rt: 1214.36ms, s:2072357.88k/s
p: 740/6494, cs: 1, cr: 8064, cl: 8, rt: 1224.98ms, s:2054383.12k/s
Here we can see that I used the -r switch to specify a rule file, I set the workload to 160 using -n, saved the found hashes in the found.out file using the -o, example0.hash is our hashlist and rockyou.txt is the dictionary file. One thing to note you may have seen a lot of “Skipping invalid or unsupported rule in line ###” This is typical, as oclHashcat+ does not support all of the same rules as regular hashcat. It simply skips the rules but will use all the ones that it does support.
The status output means the following – p:progress, cs:current salt, cr:current rule, cl:current (plain) length, rt:runtime, s:speed.
Something to keep in mind is that for non iterated hashes such as md5, ntlm, md4 you need to keep the GPU constantly busy. That is to say that if you were to try to crack MD5 with no rules using a standard dictionary oclHashcat+ speeds would be very low, this is because the GPU can calculate the MD5 sum of a dictionary word much faster than that word can be sent through the bus to the GPU to be processed. That’s why it’s suggested to use at least several hundred rules. The rules act like iterations, constatly keeping the GPU busy.
Check back soon as I’ll cover some helpful tips and techniques.
Original post: http://ob-security.info/?p=211