An introduction to oclHashcat+ & cudaHashcat+ (by d3ad0ne via ob-security.info)

Introduction:

Lets face it, when it comes to passwords humans are predictable. Time and time again we see this when a site gets compromised. After the hashes have been sorted the top password are nearly always the same. It’s usually some variation of 1-9, password, iloveyou, princess, abc123, qwerty, 111111, secret, etc. Many of these passwords are unoriginal, and people use these same passwords across all of their accounts. Then their are the individuals that think they are being unique, and will change one of these common passwords to something slightly different for example [password –> Password123]. A dictionary tool with a rule engine can easily find these types of passwords. Up until now all these dictionary based/ rule engine tools have mostly CPU based (hashcat & JTR for example). Using a dictionary with 14 million passwords and a rule file with only 3000 rules could take as long as 15 – 20 minutes with a CPU based tool. That same 14 million password wordlist can take about 2 hours using my custom 23k rule set(preloaded in the /rules folder). This is where oclHashcat+/cudaHashcat+ come in.

Application:

oclHashcat+/cudaHashcat+ which I will call “oclHashcat+” for the sake of simplicity in this article is a completely new tool, separate from regular oclHashcat. oclHashcat+ is a dictionary based tool with a full featured rule engine that all run completely on the GPU. In our above example a 23k rule file combined with a 14 million word dictionary would take about 2 hours running on a modern CPU at about 50M password attempts /sec. With oclHashcat+ this same same rule set and dictionary file takes as little as 3.5 minutes using a dual GTX 480 system running at about 2B password attempts /sec. That’s right folks 50M for the CPU vs 2000M on the GPU. But that’s not all, oclHashcat+ offers us a couple of new features which I’ll be going over here.

The first thing to point out is that there are now four separate .bin/.exe files. This is to cover the various different system combinations that people may be running. The switches and syntax for all of these different versions are the same, however depending on your GPU, OS and OS architecture your running will depend on which one you’ll be using. For example someone with Ubuntu 64 running a Nvidia card would use cudaHashcat+64.bin while someone running a Windows 32 bit system with an ATI card would run oclHashcat+32.exe. I find the easiest way to get rid of the clutter in your folder is to delete those extra .bin/.exe files. Since I’m running Ubuntu with nvidia cards I delete all the exe files, and ocl prefixed bin files. Just in case your complete baffled by this here is a simple cross reference table:

  Windows system Linux system
ATI w/ 32bit OS oclHashcat+32.exe oclHashcat+32.bin
ATI w/ 64bit OS oclHashcat+64.exe oclHashcat+64.bin
nVidia w/ 32bit OS cudaHashcat+32.exe cudaHashcat+32.bin
nVidia w/ 64bit OS cudaHashcat+64.exe cudaHashcat+64.bin

The only exception being that 64bit OS’s can run the 32bit versions (oh though I don’t know why you’d want to). The reason for the forking of the project into separate CUDA and OCL variations comes from a programming standpoint, certain optimizations can only be done using vendor specific code. This gives us a nice performance increase because the code is better optimized to support that architecture.

Features:

oclHashcat+ uses a lot of similar switches and syntax as the other Hashcat tools. The attack mode numbers such as 0 for MD5, 1000 for NTLM, 1100 for DCC are all the same. If you run the –help switch it will list all the various switches.

  • One thing you may notice is that a few more attack modes are supported now that have only been supported on the CPU version. The attack modes -m 400 covers MD5(wordpress and phpBB3) and -m 500 covers MD5(unix). These algorithms are generally difficult hash types to attack due to the high iteration counts that a plaintext goes through while getting hashed. High iteration algorithms may reduce the overall attacking speed of a hash but it doesn’t do anything to protect those passwords that users always tend to use, even if those passwords are only slightly different from the norm.
  • Support for standard in – For those that have already built special scripts or have various other ways to mangle or generate wordlist this is a way to continue to use those. In addition since oclHashcat+ doesn’t support bruteforce style attacks directly you can use the maskprocessor from hashcat-utilities to generate a bruteforce character set and pipe it to stdin.
  • GPU based rule engine – oclHashcat+ supports nearly all of the same rules that hashcat supports except for the @ modifier. That means any of the rule list you may have built before to use in Hashcat will also work. You can also generate random rules using the -g switch. -g 1000 would generate 1000 random rules.

Usage & Conclusion:

Here is an example usage and output:
./cudaHashcat+64.bin -r rules/d3ad0ne_23.8K.rule -n160 -o found.out example0.hash /root/dict/rockyou.txt
...
...
Rules: 22656
Platform: NVidia compatible platform found
Device #1: GeForce GTX 480, 1535MB, 1520Mhz, 15MCU
Device #2: GeForce GTX 480, 1535MB, 1520Mhz, 15MCU
Device #1: Kernel ./kernels/4318/m0000.sm_20.64.cubin
Device #2: Kernel ./kernels/4318/m0000.sm_20.64.cubin


Starting attack in wordlist_mode...


p: 553/6494, cs: 1, cr: 1920, cl: 8, rt: 1546.73ms, s:1627030.75k/s
p: 657/6494, cs: 1, cr: 3968, cl: 8, rt: 1200.40ms, s:2096458.25k/s
p: 708/6494, cs: 1, cr: 6016, cl: 8, rt: 1214.36ms, s:2072357.88k/s
p: 740/6494, cs: 1, cr: 8064, cl: 8, rt: 1224.98ms, s:2054383.12k/s

Here we can see that I used the -r switch to specify a rule file, I set the workload to 160 using -n, saved the found hashes in the found.out file using the -o, example0.hash is our hashlist and rockyou.txt is the dictionary file. One thing to note you may have seen a lot of “Skipping invalid or unsupported rule in line ###” This is typical, as oclHashcat+ does not support all of the same rules as regular hashcat. It simply skips the rules but will use all the ones that it does support.

The status output means the following – p:progress, cs:current salt, cr:current rule, cl:current (plain) length, rt:runtime, s:speed.

Something to keep in mind is that for non iterated hashes such as md5, ntlm, md4 you need to keep the GPU constantly busy. That is to say that if you were to try to crack MD5 with no rules using a standard dictionary oclHashcat+ speeds would be very low, this is because the GPU can calculate the MD5 sum of a dictionary word much faster than that word can be sent through the bus to the GPU to be processed. That’s why it’s suggested to use at least several hundred rules. The rules act like iterations, constatly keeping the GPU busy.

oclHashcat+ can be downloaded here
hashcat-utillites can be downloaded here

Check back soon as I’ll cover some helpful tips and techniques.

Original post: http://ob-security.info/?p=211

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s