The first post is going to be a quick overview of my normal workflow and what tools I use.
Firstly, welcome! Thanks for dropping by. Hopefully you will find something useful. If you want something explained in more detail, add a comment or send me an email. Happy to help.
Now... when I say tools, I don't mean 'point and click'. I am not against them but usually if I am looking at it in hex, it is due to automated tools not extracting the data I need. It is also harder to explain how they work.
You can choose whatever tools you like; the main thing is that you are comfortable with them and can use them quickly.
Workflow
The basic workflow goes like this:
1. What is this?
A big blob of data with juicy stuff inside. Excited? Me too. Look at all that HEX! Gigabytes of it!
2. What are we looking for?
Pictures, videos, documents, SMS, MMS, chat, SQLite databases, web searches etc
Knowing what we are looking for gives us information such as headers, footers and tags that we can search for.
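As a rough illustration of searching for headers and footers, here is a minimal sketch written for Python 3. The magic values are the commonly documented ones for each format, but the SIGNATURES table, the find_all helper and the sample blob are hypothetical names made up for this example.

```python
# Well-known file signatures. The table and helper names are
# illustrative only, not part of any library.
SIGNATURES = {
    "jpeg_header": b"\xff\xd8\xff",
    "jpeg_footer": b"\xff\xd9",
    "png_header": b"\x89PNG\r\n\x1a\n",
    "sqlite_header": b"SQLite format 3\x00",
    "pdf_header": b"%PDF-",
}


def find_all(data, pattern):
    """Return every offset at which pattern occurs in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        # Step forward one byte so overlapping hits are also reported.
        pos = data.find(pattern, pos + 1)
    return offsets


# A tiny made-up blob standing in for a real dump.
blob = (b"\x00" * 16 + SIGNATURES["sqlite_header"]
        + b"\x00" * 8 + SIGNATURES["jpeg_header"] + b"\xe0")

for name, magic in SIGNATURES.items():
    print(name, [hex(offset) for offset in find_all(blob, magic)])
```

Short signatures such as the two-byte JPEG footer will turn up by chance in any large dump, so a hit is only a hint that something might be there.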
3. What am I looking in?
Is this a file, a copy of a micro SD card, a hard disk drive DD image, or a raw NAND chip dump?
This will help us know whether the data is contiguous, what the sector, page and block sizes are, and whether the data needs to be reordered. It will also help us to know whether the filesystem is FAT32, NTFS, EXT4 etc.
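One quick way to get a first guess at the filesystem is to check a few well-known markers near the start of the volume. This is a minimal Python 3 sketch, assuming the buffer starts at the beginning of the volume itself (not at a partition table). The guess_filesystem name is made up for this example. Note that the ext magic number is shared by ext2, ext3 and ext4, so it cannot tell them apart.

```python
import struct


def guess_filesystem(image):
    """Guess the filesystem from the first couple of KiB of a volume.

    Assumes `image` is a bytes object that starts at the volume's
    first byte. Returns a short label, or "unknown".
    """
    # NTFS stores the OEM ID "NTFS    " at offset 3 of the boot sector.
    if image[3:11] == b"NTFS    ":
        return "NTFS"
    # FAT32 stores the type string "FAT32   " at offset 82.
    if image[82:90] == b"FAT32   ":
        return "FAT32"
    # The ext superblock starts at offset 1024; its 16-bit little-endian
    # magic number 0xEF53 sits 56 bytes in, i.e. at offset 1080.
    if len(image) >= 1082:
        (magic,) = struct.unpack_from("<H", image, 1080)
        if magic == 0xEF53:
            return "ext2/3/4"
    return "unknown"


# Made-up test buffers standing in for real images.
fake_ntfs = bytearray(2048)
fake_ntfs[3:11] = b"NTFS    "
fake_ext = bytearray(2048)
fake_ext[1080:1082] = b"\x53\xef"

print(guess_filesystem(bytes(fake_ntfs)))
print(guess_filesystem(bytes(fake_ext)))
```

A raw NAND dump will usually not match any of these at offset 0, which is itself a useful clue that the data may need reordering first.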
4. Is that data active or deleted?
If the data is active, then we can use a filesystem approach to find it (that is not really a topic for here but more details later).
If the data has been deleted, how long ago was it deleted? How big is the disk/memory, how full is the disk, how much was it used since the data was deleted?
5. Let's try looking manually
The reason we are looking manually is usually fragmentation, incomplete file finalisation or partial overwriting. Using our hex editor, we search for tags, headers and footers to try to identify similar patterns or file structures.
Can we 'carve' out a file that can be viewed? Is the data fragmented? Has it been partially overwritten? Do we need to build a new file? Do we have similar files from the same device?
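To make the idea of carving concrete, here is a minimal Python 3 sketch of the simplest possible case: a contiguous JPEG that starts with FF D8 FF and ends at the first FF D9 after it. The carve_jpegs name and the max_size limit are assumptions made up for this example. Fragmented files, partially overwritten files and JPEGs with embedded thumbnails will all defeat it, which is exactly why the manual look comes first.

```python
def carve_jpegs(data, max_size=20 * 1024 * 1024):
    """Yield (start, end) offsets of candidate JPEGs in data.

    Naive assumptions: each file is contiguous, starts with FF D8 FF
    and ends at the first FF D9 found within max_size bytes.
    """
    header, footer = b"\xff\xd8\xff", b"\xff\xd9"
    start = data.find(header)
    while start != -1:
        # Look for the footer after the header, within the size limit.
        end = data.find(footer, start + len(header), start + max_size)
        if end != -1:
            yield start, end + len(footer)
        start = data.find(header, start + 1)


# A made-up sample: junk, a fake JPEG, more junk.
sample = (b"junk" + b"\xff\xd8\xff\xe0" + b"image-bytes"
          + b"\xff\xd9" + b"more junk")

for start, end in carve_jpegs(sample):
    candidate = sample[start:end]
    print(hex(start), hex(end), len(candidate))
```

Each candidate slice could then be written out with open(name, "wb") and checked in an image viewer.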
6. Now let's automate this.
Once we have done this manually, we can write a script to automate the process. Sometimes we are only looking for one file or piece of data, but often it will be many, or we will get a similar job again, so it is worth putting in the few minutes to script a semi-reusable solution.
It would be nice to say this is the last step, but there is a continuous loop between steps 5 and 6. We automate it, a new case breaks it, we adjust and automate again...
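When the blob is gigabytes in size, a script usually cannot read it all into memory at once. One possible approach, sketched here for Python 3, is to read in chunks and carry over the last few bytes so that a pattern straddling a chunk boundary is not missed. The scan_stream name and the chunk_size default are assumptions made up for this example.

```python
import io


def scan_stream(stream, pattern, chunk_size=1024 * 1024):
    """Yield absolute offsets of pattern in a binary stream.

    The stream is read chunk_size bytes at a time. The last
    len(pattern) - 1 bytes of each window are kept and prepended to
    the next chunk, so matches that span two chunks are still found.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    tail = b""
    base = 0  # absolute offset of the first byte of the current window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(pattern)
        while pos != -1:
            yield base + pos
            pos = window.find(pattern, pos + 1)
        # The tail is too short to hold a whole match, so nothing is
        # ever reported twice.
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)


# A tiny in-memory stream with a deliberately small chunk size, so
# that the first header straddles a chunk boundary.
demo = io.BytesIO(b"ab" + b"\xff\xd8\xff" + b"cdefg" + b"\xff\xd8\xff")
print(list(scan_stream(demo, b"\xff\xd8\xff", chunk_size=4)))
```

With a real image, the stream would be a file opened with open(path, "rb") instead of the in-memory demo.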
I am programming language agnostic and have programmed in languages such as C64 BASIC, Fortran, SPICE, various database 'languages', C, C++, VB, Java, MATLAB, assembler and Python.
I currently like to use Python due to its simplicity, readability, support (where would I be without stackoverflow.com?), rapid development, price (gratis is good), licensing, cross-platform support, easy GUI support and easy deployment. We can package it up as an exe if we need to distribute it standalone, which saves the 'oh, it's missing a module!' or 'how do I run it?' dilemma that turns a lot of people off running code. (Setting this up simply is planned for about post 8, so stay tuned.)
I am also OS agnostic: PC, Mac, Linux, DSP on embedded ARM... bring it on.
The code in the coming blogs will be using Python but as it is almost pseudo code, you can convert it to your language of choice. I am not up for a debate of which language is best.
The code is written to be understood. I am not here to show off how I reduced 8 lines of code into 1 that no one can understand except Dr Smarty McSmarty, or how using a different instruction or module runs 13.6% faster. We can optimise later if we need to. Let's just get something working quickly so our brains can think about the problem and not be bogged down in syntax issues.
OK so let's get started!
Tools:
1. FTK Imager (free and simple to get 'forensic' copies of data like SD cards or Hard Disk Drives etc.)
2. Hex Editor (The next post will go over which ones I use and like)
3. Python (I usually use 2.7 due to the code base and support out there, but I am also tinkering with 3)
And that's it! The results I have been able to get from these simple tools have surpassed anything commercial I have used and the difference is I get to understand it too. Which makes the next job/project easier... well, I keep telling myself that.
Until the next post TheHexNinja says:
Gentle deer drinks dew
Forest awakens new day
NINJA STAR TO NECK TO NECK