Tuesday, July 26, 2016

Hex Editors Phoaar

The Hex Editor

OK, so our basic tool on this journey is the humble hex editor. But all is not so simple.  There are a plethora of hex editors available. Basically we want to be able to highlight an area of interest, save.... view...save.. copy...paste.. cut..repeat....

The basic features you will be using a lot of are
  • Search: bytes in hex, locate, count, index, export address
  • Goto: both absolute and relative address.
  • Select: nice if they are right click 'start', right click 'end'
  • Cut, Copy, Insert Paste, Overwrite Paste
  • Hex/Decimal: be able to switch between these easily
You will be doing these functions a lot! So choose a hex editor that can do them easily or with shortcuts.
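The same search and hex/decimal operations are easy to reproduce in a script, which is handy for double checking what an editor reports. This is only a minimal Python 3 sketch: `find_all` is a made-up helper name and the sample bytes are invented for the example.

```python
def find_all(data, pattern):
    """Return the offset of every occurrence of pattern in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        # Step one byte forward so overlapping hits are also found.
        pos = data.find(pattern, pos + 1)
    return offsets


# Invented sample data: the bytes FF D8 appear twice with padding around them.
sample = bytes.fromhex("00ffd8000000ffd800")
hits = find_all(sample, bytes.fromhex("ffd8"))

print(len(hits))                   # count of hits
print([hex(h) for h in hits])      # hit addresses shown in hex
print(int("1f4", 16), hex(500))    # hex to decimal, decimal to hex
```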

My favourite hex editors are (no affiliations or endorsements):


WinHex - Xways

Super fast, simple to use. All you really need for basic hex carving.
The basic personal license is ~$60 and well worth it.
For basic carving I really like the 'right click - beginning of block', 'right click - end of block', Edit - Copy Block into new file - voilà.

WinHex Screenshot

Hex Workshop

Hex Workshop Screenshot
I like the coloured byte window....purrdy..., it is nice to help identify periodic patterns and you can pick up small changes in the data as you scroll through a file etc
License is $89.95
Copying and cutting blocks of data is a little cumbersome as you need to specify start address and either size or end address. Not a show stopper, but it does slow the Hex Ninja down when he has his flow on.

010 Editor
A bit more expensive but I like this one a lot for more complex operations and analysis
$129.95 or $49.95 for personal use
Has scripting capabilities and some nice file templates for parsing file structures

010 Editor Screenshot

Free Editors:

Notepad++ with the Hex Editor Plugin
Good if you like to keep the programming, hex editing all in one place.

Nice interface and has Mac version as well.

Although forensic tools have the ability to show the hex, the features are pretty limited (except for XWAYS -WinHex)

So.... What daily functions does Hex Ninja like to do in a hex editor?

The number one thing I do is check whether a given file is intact, corrupted etc. By opening a file in a hex editor we get to see what it really looks like and not what the file extension is labelling it as.

So open as many files as you can so you get to see the basic structure they have. If you first focus on JPG, PNG, MP4/MOV, AVI, DOC and PDF, you will be across most filetypes you want to recover, rebuild etc.
You will get so used to their structure and tags that you can recognise them in a stream of hex.

...there's way too much information to decode the Matrix. You get used to it, though. Your brain does the translating. I don't even see the code. All I see is blonde, brunette, redhead. Hey uh, you want a drink? -Cypher
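Until your brain does the translating by itself, a few lines of Python can check what a file really is from its first bytes. This is a rough sketch only: the function names `guess_type` and `guess_file` are made up, the list covers just the file types mentioned above, and real files have more variations than this handles (older MOV files, for instance, do not always have the 'ftyp' tag).

```python
# Well-known signatures ("magic bytes") found at the very start of a file.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF", "PDF"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (e.g. old .doc)"),
]


def guess_type(header):
    """Guess a file type from its first bytes, ignoring the extension."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    # RIFF files carry their sub-type at offset 8.
    if header[0:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    # MP4/MOV files normally have an 'ftyp' tag at offset 4.
    if header[4:8] == b"ftyp":
        return "MP4/MOV"
    return "unknown"


def guess_file(path):
    """Read the first 16 bytes of the file at path and guess its type."""
    with open(path, "rb") as f:
        return guess_type(f.read(16))
```

Calling `guess_file` on a file with a .jpg extension that comes back as "PDF" or "unknown" is a quick hint that the extension is lying or the header is damaged.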

For example, the most common file the Hex Ninja sees is the common JPG, or more correctly the JPEG File Interchange Format (JFIF). JPG is the file extension; JFIF is the file container it is stored in. Let's hexinate a typical JPEG.

Hex View of JPEG

To do any basic carving we need to find the start of a file and the end of the file OR an embedded size so we can find the end. Let's take a quick look under the hood.

The basic structure in JFIF is a sequence of marker segments, each starting with FF followed by a byte defining the marker type. Depending on the marker there can be embedded data and nested marker segments.

See https://en.wikipedia.org/wiki/JPEG for a basic overview or https://www.w3.org/Graphics/JPEG/itu-t81.pdf if you want to dig deeper.
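To see the marker segments for yourself, the header of a JPEG can be walked in a few lines of Python 3. This is a simplified sketch, not a full parser: `list_segments` is a made-up name, it assumes every segment before the compressed data has a two byte big-endian length right after the marker, and it stops at the Start Of Scan marker (0xFFDA).

```python
import struct


def list_segments(data):
    """List (offset, marker, length) for the header segments of a JPEG.

    Stops at Start Of Scan (0xFFDA), where the compressed image data begins.
    """
    if data[0:2] != b"\xff\xd8":
        raise ValueError("no SOI marker at offset 0")
    segments = [(0, 0xD8, 0)]      # SOI has no length field
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break                  # not a marker: damaged or not a JPEG
        marker = data[pos + 1]
        # Two byte big-endian length; it counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, marker, length))
        if marker == 0xDA:         # SOS: compressed data follows
            break
        pos += 2 + length
    return segments
```

Printing each tuple with `hex()` on the offset and marker gives a list you can follow along with in the hex editor.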

The first 2 bytes 0xFFD8 indicate a 'Start Of Image' (SOI). 
If we just searched for the two bytes 0xFFD8 on a disk or in 'unallocated space' we would produce too many false hits. Generally, the longer and more specific the search term, the fewer false hits we will get. Two bytes is a little short, so we will see what follows that we could use in a search term.

The next two bytes 0xFFE0 indicate a 'JFIF APP0 marker segment', which has embedded data such as the text 'JFIF'. While 0xFFD8FFE0 is generally common across cameras/phones, I have seen a couple of cameras that put APP1 first instead of APP0, i.e. 0xFFD8FFE1, but that is rare so let's keep it simple.
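A disk image is usually too big to load in one go, so a scripted search for these four byte headers has to read it in chunks. The sketch below is one way to do that in Python 3, keeping a few bytes of overlap so a header split across two chunks is not missed. The name `find_jpeg_headers` and the chunk size are arbitrary choices for the example.

```python
import io

JPEG_HEADERS = (b"\xff\xd8\xff\xe0", b"\xff\xd8\xff\xe1")


def find_jpeg_headers(stream, chunk_size=1024 * 1024):
    """Yield (offset, header) for each JPEG header in a binary stream."""
    overlap = 3                    # header length minus one
    base = 0                       # stream offset of the start of buf
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pos = buf.find(b"\xff\xd8\xff")
        while pos != -1:
            if buf[pos:pos + 4] in JPEG_HEADERS:
                yield base + pos, buf[pos:pos + 4]
            pos = buf.find(b"\xff\xd8\xff", pos + 1)
        # Keep the last few bytes so a header split across chunks is found.
        keep = buf[-overlap:]
        base += len(buf) - len(keep)
        buf = keep


# Invented test data, with a tiny chunk size to force headers across chunks.
blob = (b"\x00" * 5 + b"\xff\xd8\xff\xe0" + b"\x00" * 7
        + b"\xff\xd8\xff\xe1" + b"\x00\x00")
found = list(find_jpeg_headers(io.BytesIO(blob), chunk_size=6))
print([(hex(off), hdr.hex()) for off, hdr in found])
```

For a real image, pass `open(path, "rb")` instead of the `io.BytesIO` object.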

Next we need to look for an embedded size or an end of file marker.

Unfortunately there is no embedded size in the JFIF. We could technically decode the image as we carve to find the end, but that is a bit more intense, so let's start by looking for an end of file marker. In the JFIF specification it is End Of Image (EOI) 0xFFD9.... Really.. a two byte marker! That can lead to a lot of false positives. Why didn't they make it an 8 byte marker? Even 4 or 6 bytes would be better!

There are a couple of issues we should be aware of so we can try and avoid false positives in a search and carve: 

1. There can be embedded thumbnail/s inside the JFIF file that have the same SOI and EOI markers. Yep good thinking JPEG working group! We can generally avoid this by ignoring the EOI if it occurs too soon after the SOI. We can also carve out the thumbnails in a more thorough carve to be done in later blogs. 
2. If the end of the file has been overwritten we may not find the EOI marker until the end of another image. We can avoid this by limiting how far we search for the EOI after the SOI. 
3. The image data may be fragmented. That is, cluster size blocks of the data can be intermingled with  other files. Generally we do not know the location or sequence of the clusters. We will practise these in a later blog post.  
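The first two workarounds above can be sketched as a search window for the EOI: skip anything too close to the SOI, and give up after a maximum distance. In this Python 3 sketch the name `find_eoi` and both default sizes are guesses for illustration, not values from any specification, and fragmentation (issue 3) is not handled at all.

```python
def find_eoi(data, soi_offset, min_size=4096, max_size=20 * 1024 * 1024):
    """Return the offset just past a plausible EOI marker, or None.

    min_size skips EOI markers that are too close to the SOI (likely
    thumbnails); max_size stops the search running on into a later image.
    """
    start = soi_offset + min_size
    limit = min(len(data), soi_offset + max_size)
    pos = data.find(b"\xff\xd9", start, limit)
    if pos == -1:
        return None
    return pos + 2                 # include both bytes of the marker
```

Tune the two sizes to the device: thumbnail and full image sizes vary a lot between cameras and phones.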

The marker 0xFFD9 should not occur in the file unless it is the EOI (of the main image or thumbnails), ie we should not find it in the compressed image data (OK JPEG working group, at least you thought of that).  

Now back to our simple carve. We locate the 0xFFD9 indicating the end of the file.

JFIF EOI Marker 0xFFD9

So if we found what looked to be a JPEG in unallocated space or embedded in another file we can carve it out using the simple technique:
1. Search 0xFFD8FFE0, mark the first byte as the start of the block.
2. Search 0xFFD9, mark the last byte as the end of the block.
3. Copy the block into a new file, save it with a .jpg extension and you will have a carved JPEG.
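The three steps above can be sketched in Python 3 as follows. The function names are made up, and the sketch reads the whole source into memory, so it only suits small files or extracted chunks.

```python
def carve_first_jpeg(data):
    """Return the bytes of the first JPEG found in data, or None."""
    start = data.find(b"\xff\xd8\xff\xe0")     # step 1: start of block
    if start == -1:
        return None
    end = data.find(b"\xff\xd9", start + 4)    # step 2: end of block
    if end == -1:
        return None
    return data[start:end + 2]                 # include both EOI bytes


def carve_to_file(source_path, output_path):
    """Step 3: copy the block into a new file. Returns True on success."""
    with open(source_path, "rb") as f:
        carved = carve_first_jpeg(f.read())
    if carved is None:
        return False
    with open(output_path, "wb") as out:
        out.write(carved)
    return True
```

Like the manual version, this stops at the first 0xFFD9 it finds, so a JPEG with an embedded thumbnail will be cut short at the thumbnail's EOI.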

Until the next post TheHexNinja says:

Bamboo bends in wind
Ninja watches you alone

Wednesday, January 13, 2016


The first post is going to be a quick overview of my normal workflow and what tools I use. 

Firstly, welcome! Thanks for dropping by. Hopefully you will find something useful. If you want something explained in more detail,  add a comment or send me an email. Happy to help.

Now... when I say tools, I don't mean 'point and click'. I am not against them but usually if I am looking at it in hex, it is due to automated tools not extracting the data I need. It is also harder to explain how they work. 

You can choose what tools you like but the main thing is that you are comfortable with them and can use them quickly.


The basic workflow goes like this:

1. What is this? 
     A big blob of data with juicy stuff inside. Excited? Me too. Look at all that HEX! Gigabytes of it! 

2. What are we looking for?
     Pictures, videos, documents, SMS, MMS, chat, SQLite databases, web searches etc
     Knowing what we are looking for will give us information such as headers, footers and tags that we can search for.

3. What am I looking in?
     Is this a file, a copy of a micro SD Card, a Hard Disk Drive DD image, a raw NAND chip dump?
     This will help us know if the data is contiguous, what the sector, page and block sizes are, and if the data needs to be reordered. It will help us to know if the filesystem is FAT32, NTFS, EXT4 etc

4. Is that data active or deleted?
    If the data is active, then we can use a filesystem approach to find it (that is not really a topic for here but more details later).
    If the data has been deleted, how long ago was it deleted? How big is the disk/memory, how full is the disk, how much was it used since the data was deleted?

5. Let's try looking manually
    The reason we are looking manually is usually due to fragmentation, incomplete file finalisation or partial overwriting. Using our hex editor we search for tags/headers/footers to try and identify similar patterns or file structures.
    Can we try and 'carve' out a file that can be viewed? Is the data fragmented, has it been partially overwritten? Do we need to build a new file? Do we have similar files from the same device?


6. Now let's automate this.
    Once we have done this manually we can now write a script to automate this process. Sometimes we are only looking for one file or piece of data but often it will be many or we will get a similar job again so it is worth putting in the few minutes to script a semi-reusable solution.

It would be nice to say this is the last step but there is a continuous loop between step 5 and 6. As we automate it, a new case breaks it, we adjust and automate ....
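As a small taste of step 3, the filesystem can often be guessed from a few fixed bytes near the start of a volume. This Python 3 sketch assumes the bytes passed in are the first 2 KiB of a single volume (not a whole disk with a partition table in front of it); `guess_filesystem` is a made-up name, and the ext magic number cannot tell ext2, ext3 and ext4 apart.

```python
def guess_filesystem(volume_start):
    """Guess the filesystem from the first 2 KiB of a volume image."""
    # NTFS stores its name in the OEM ID field at offset 3.
    if volume_start[3:11] == b"NTFS    ":
        return "NTFS"
    # FAT32 boot sectors normally carry this label at offset 0x52.
    if volume_start[0x52:0x5A] == b"FAT32   ":
        return "FAT32"
    # The ext2/3/4 superblock starts at byte 1024; its magic 0xEF53 is
    # stored little-endian at offset 0x38 inside the superblock.
    if volume_start[1024 + 0x38:1024 + 0x3A] == b"\x53\xef":
        return "ext2/3/4"
    return "unknown"
```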

I am code language agnostic and have programmed in languages such as C64 BASIC, Fortran, Spice, various database 'languages', C, C++, VB, Java, Matlab, assembler and Python.

I currently like to use Python due to its simplicity, readability, support (where would I be without stackoverflow.com), rapid development, price (gratis is good), licensing, cross platform support, easy GUI support and easy deployment. We can package it up as an exe if we need to distribute it stand alone, which saves the 'oh it's missing a module!' or 'how do I run it?' dilemma that turns a lot of people off running code. (Setting this up simply is planned for about post 8, so stay tuned.)

I am also OS agnostic, PC, Mac, Linux, DSP on embedded ARM.. bring it on.

The code in the coming blogs will be using Python but as it is almost pseudo code, you can convert it to your language of choice. I am not up for a debate of which language is best.

The code is written to be understood. I am not here to show off how I reduced 8 lines of code into 1 that no one can understand except Dr Smarty McSmarty, or how using a different instruction or module runs 13.6% faster. We can optimise later if we need to. Let's just get something working quickly so our brains can think about the problem and not be bogged down in syntax issues.

OK so let's get started!


1. FTK Imager (free and simple to get 'forensic' copies of data like SD cards or Hard Disk Drives etc.)

2. Hex Editor (The next post will go over which ones I use and like)

3. Python (usually use 2.7 due to code base and support out there but also am tinkering with 3)

And that's it! The results I have been able to get from these simple tools have surpassed anything commercial I have used and the difference is I get to understand it too. Which makes the next job/project easier... well, I keep telling myself that.

Until the next post TheHexNinja says:

Gentle deer drinks dew
Forest awakens new day