Sunday, April 18, 2021

Getting hashes to Virus Total from an Isolated Virtual Machine

Sometimes when I am testing in a Virtual Machine (VM) I really lock down the isolation. 

No shared folders. 

No bidirectional clipboards. 

No network. 

I may be paranoid, but it is 'mildly discomforting' to see malware (ransomware) under test encrypt your shared folder and then watch your host AV or Bitdefender start to lose it with virus detections. 

This usually doesn't happen, but when it does you can have a cold sweat moment, wondering whether the malware has not only jumped to a shared folder but is now doing what it does best. It is normally just a detection of the encrypted file or ransom note, but once I have transferred the files for testing it is a good idea to check and double check the isolation. 

At a first pass, when looking for suspected malware DLL or EXE files, I like to upload the hash or the suspicious file to Virus Total or Hybrid Analysis.


However, with an isolated system I am also limited in how I can check the hash. I can't copy it across from the VM guest to the host or check it directly in a browser, as I have isolated my VM. 

During this last year of Covid-19 I have used more QR codes than I ever have before, so I had a thought: create a script that calculates the hash and generates a QR code that embeds the hash in a URL, so that scanning it opens a prefilled Virus Total or Hybrid Analysis search. 

I can then get the script to show the QR code on the screen, and I can capture it on the host or even use a mobile phone to capture the QR code to a browser.

Normally, I code in Python but thought I would punish myself and see if I could do it in Python3 and C#.

Python 3 

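The Python code uses the QR code generating library pyqrcode and the hashlib library.

pyqrcode can be installed using pip. hashlib is part of the Python standard library, so there is nothing to install for it. To display the QR code as an image, pyqrcode also needs the pypng package.

>pip install PyQRCode
>pip install pypng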

The general functional flow is 
1. Get filename from argument
2. Calculate SHA256 hash
3. Append SHA256 hash to url string ie ''+ sha256_hash
4. Generate and display the QR code of this url

The Python script is called from the command line with the suspicious file as its argument.
> python3 c:\abc.exe

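import pyqrcode
import argparse
import hashlib
import os

BUF_SIZE = 1048576  # read the file in 1 MB chunks

def calc_hashes(filename):
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(filename, 'rb') as fp:
        while True:
            data = fp.read(BUF_SIZE)
            if not data:
                break
            md5.update(data)
            sha256.update(data)
    return md5.hexdigest().upper(), sha256.hexdigest().upper()

# input file to create sha256 hash
parser = argparse.ArgumentParser()
parser.add_argument('filename', help='suspicious file to hash')
args = parser.parse_args()
md5_hash, sha256_hash = calc_hashes(args.filename)
# the search URL prefix goes in the empty string, followed by the hash
vt_url = '' + sha256_hash
qr = pyqrcode.create(vt_url)
# pop the QR code up in the default image viewer
qr.show()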

This QR code image will pop up in the image viewer and we can capture it with a phone camera app or QR code scanner.

The linked URL will then open up and we can see this was a Wannacry malware. 


The C# program uses two libraries, System.Security.Cryptography to calculate the hashes and ZXing to create the QR code.

Unlike the Python version, this C# version requires a location to store the image, which we pass to the command line program. A memory only version is underway but it is a little more complicated.

>qr_hash.exe C:\tmp\123.txt C:\tmp\123.jpg

The workflow is much the same as the python version except that it saves the QR image as a JPG then it uses a shell process to open the image in the default image viewer.

using System;
using System.Security.Cryptography;
using System.IO;
using ZXing;
using System.Drawing;
using System.Drawing.Imaging;
using System.Diagnostics;

namespace QR_Hash
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length == 2)
            {
                string filenpath = args[0];
                string imagepath = args[1];
                string hash_string;

                if (File.Exists(filenpath))
                {
                    // Calculate the SHA256 hash of the suspect file
                    using (var sha256 = SHA256.Create())
                    using (var stream = File.OpenRead(filenpath))
                    {
                        var hash = sha256.ComputeHash(stream);
                        hash_string = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
                    }

                    // Build the QR code for the URL
                    var QCwriter = new BarcodeWriter();
                    QCwriter.Format = BarcodeFormat.QR_CODE;
                    QCwriter.Options = new ZXing.Common.EncodingOptions
                    {
                        Width = 400,
                        Height = 400
                    };
                    string vt_url = "" + hash_string;
                    var result = QCwriter.Write(vt_url);

                    // Draw the URL as text along the bottom of the image
                    using (var g = Graphics.FromImage(result))
                    using (var font = new Font(FontFamily.GenericMonospace, 8))
                    using (var brush = new SolidBrush(Color.Black))
                    using (var format = new StringFormat() { Alignment = StringAlignment.Center })
                    {
                        int margin = 5, textHeight = 30;
                        var rect = new RectangleF(margin, result.Height - textHeight,
                                                  result.Width - 2 * margin, textHeight);
                        g.DrawString(vt_url, font, brush, rect, format);
                    }

                    // Save the image as a JPG, then open it in the default viewer
                    result.Save(imagepath, ImageFormat.Jpeg);
                    var p = new Process();
                    p.StartInfo = new ProcessStartInfo(imagepath)
                    {
                        UseShellExecute = true
                    };
                    p.Start();
                }
            }
        }
    }
}

This C# version also adds a nice URL link to the bottom of the image.

So there you have it, 2 basic programs to help you get the hash out of a VM via the screen. Noice!

As the hex ninja says.

Finding malware now,
Is easy with QR codes,
Keep safe from malware 

Saturday, October 10, 2020

Capturing Windows Memory

It has been a while since my last post. Changing jobs pointed me in a different direction for a while, but as George and Frank Costanza would say: "I'm back baby!"

I recently had to look into windows memory capture to do some offline analysis of running processes.

My normal 'goto' tool for taking a forensic image and memory capture is usually FTK Imager. It is pretty robust and ninja proof. 

You can copy the install directory to an external USB and it will run nicely as a portable version. When we run this it obviously loads into memory, which will be present when we capture the system memory.

I started to wonder whether there were any other tools that could do memory analysis, and to compare some of their features, such as:

  1. Memory Footprint - smaller and fewer processes is better
  2. Portable - I don't really want to install it on the system in question
  3. Fast - Memory capture is often the first stage of an Incident Response so I want it to be fast
  4. Access privilege required - do I need to be admin or can I run this as a least privilege user?
  5. Stand alone - Do I need to buy the whole forensic suite or can I just get the memory capture tool?
  6. Price - gratis is good but a low cost good tool is OK too.
  7. Ease of use - I don't want to fumble in the field with pesky undocumented command line switches.

While there have been numerous blogs on some of the available tools, I was mainly interested in the footprint and speed. If the tool is loaded into memory, the risk is that some of the data of interest may be pushed out.

After some quick browsing it seems the current options are (in no order of preference):

  1. FTK Imager
  2. Belkasoft
  3. Magnet RAM
  4. Process Hacker
  5. Winen
  6. MDD
  7. Mandiant Memoryze
  8. WindowsSCOPE
  9. WinPmem
  10. Dumpit

The next step was to see if my google fu could find the memory capture applications, as some of these have dropped on and off hosting sites. 

FTK Imager

Used for forensic imaging and live viewing of disks but can also do memory capture.
Has the option to capture pagefile.sys at same time which is nice.
It does require you to install it first (not on your target machine) then copy the install directory to a USB for portable use.  

Time: 2m:37s
Memory: 11.6MB
Install Directory of FTK Imager

FTK Imager GUI

FTK Imager GUI options

FTK Imager Memory Footprint

Belkasoft - RAM-CAPTURER

Simple to use from a USB.
Time: 2m:22s
Memory: 7.7MB.

Belkasoft RAM Capturer Install directory

Belkasoft RAM Capturer GUI

Belkasoft RAM Capturer Memory Footprint

Magnet - RAM

Has the option to segment but otherwise pretty straightforward.
Time: 4m:01s
Memory: 6.8MB

Magnet RAM Install Directory

Magnet RAM GUI

Magnet RAM Memory Footprint

Process Hacker

While this is a powerful tool, it is more granular than required and probably better for live analysis: it allows you to inspect individual processes and dump the memory used by them, but it does not do a total memory dump.

Process Hacker Install directory

Process Hacker GUI

Process Hacker Memory Footprint


I couldn't get this to work successfully. 😢


Mandiant Memoryze

This downloads as an msi for installing, but it can be run from a USB without installing by using a command line option to install it onto the USB.

msiexec /a E:\Download\\MemoryzeSetup3.0.msi /qb TARGETDIR=E:\Memory_Acquisition\Mandiant_memoryze

It doesn't appear to have support after Win 7, so the testing of this one is on hold.


This requires a $1 'try it' registration, but looking at the cost of $7,699 for a single year I decided not to pursue this. 


DumpIt

This app disappeared for a while and I was very keen to test it. A new version came back via the author, Matt Suiche, but even though I created an account I could never log in, and got a Failed to Fetch error when logging in. If anyone has tested a newer version let me know.

It does a capture in place so if you run it from an external USB make sure it is big enough for the capture as it doesn't allow you to select a destination location. 

Time: 2m:34s
Memory: 7.1MB
DumpIt Install Directory

DumpIt command line

DumpIt Memory Footprint

Testing Summary

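So the major features I was looking for were a small footprint, ease of use and speed. The table below shows a summary of the four tools that met our needs.

Tool                     Time      Memory
FTK Imager               2m:37s    11.6MB
Belkasoft RAM Capturer   2m:22s    7.7MB
Magnet RAM               4m:01s    6.8MB
DumpIt                   2m:34s    7.1MB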

For speed, Belkasoft is slightly faster on my DELL laptop but it will depend on the system you are running it on. 

Magnet RAM has the smallest footprint at 6.8MB.

FTK Imager is also fast, with a slightly larger footprint, but it has more than just RAM capture functionality. It can also forensically acquire hard drives, so if I wanted to also do a forensic disk image or forensically copy files it may be easier to use this than to change programs. 

But if I just had to do a memory capture, Belkasoft or Magnet RAM might be good choices.  
DumpIt may be a nice choice if I just wanted a simple double click that stores the capture in the same directory. 

Now to analyse the memory captures.... that may be for another post.

Until the next post TheHexNinja says:

Memory Capture 
Easy When You Know Your Tools  
Now To Analyse 


1. Tool URLs
2. The following article describes some of the methods the memory applications use to obtain the dump in kernel mode: ZwOpenSection with ZwMapViewOfSection, MmMapIoSpace and MmMapMemoryDumpMdl

Wednesday, January 31, 2018

Practical Exercise - Image Carving II - Python

In the last post we looked at how we can manually carve out a JPEG image from free 'space'. Good to know and OK to do if we have one or two, but if we had thousands to carve... it could take some time. We would then use some sort of image recovery software, but could we write our own?

Part of the reason for this blog was to demonstrate some Hex Ninja skills both manually and how we can write some simple scripts to automate some of these tasks.

The general process goes something like this:
1. First we find the artifact we are looking for.
2. Understand the layout of the artifact.
3. Manually try to carve out the artifact and make sure it works for all cases.
4. Write a script to automate the process.
5. Test the script and make sure it works.

The last blog post covered steps 1-3, this post will cover steps 4-5.

So the language we will be using is Python. It is very easy to program in and is my 'goto' language at the moment for getting something up and running fast.

Available from 

There are two versions available 2.7 and 3.6. See to check out the differences between them.

I mainly use 2.7 because there are more code libraries and more support for debugging on sites like StackOverflow, but we can test it on both and see if it works. Eventually I will move to Python3.

So download Python 2.7 for your OS (Mac/Windows/Linux) and follow the install instructions.

To make sure everything has installed OK, go to a command prompt and type python.

Hopefully you see something similar to the above screenshot. The output should tell you what version you are using (2.7.12), whether it is 32 or 64 bit, and show a Python command prompt >>>

In the tradition of programming languages, your first exercise is to print Hello World to the screen.
Python makes this very simple: type print ("Hello World") and you should see output like below.

To get back to the normal command prompt hit Ctrl-Z and Enter.

There are two main ways of using python.
1. From the Python command prompt where we can type python commands direct. This is good for doing simple testing of instructions.
2. Running a Python script, where we write the Python commands in an editor, save the file with the extension py, and then execute it by typing python followed by the script's filename at the command prompt.

We will mainly use the second technique. We can use a basic text editor such as Notepad. My favourite editor is PyCharm from JetBrains.
It has code highlighting, code completion, finds errors and lets you run your code from within the editor, but there are a plethora of editors. They can be a bit daunting to use initially but are well worth it if you intend to code a lot. For simplicity we will just use a text editor.

So now we are ready to start coding.
But before we start coding let's think about what we want to achieve.
1. We want to load a file.
2. We want to search the file for the JPEG start of frame header 'FFD8FFE0' and the end of frame 'FFD9'.
3. We then want to save the data between these markers to a file. Simples!

As we want to keep the code simple, we won't be doing any error checking. In a real production program, there is a lot of error checking making sure the file exists, the data is in the correct format etc etc and it can make looking at the code confusing, so we will just be doing the bare basics.

The first thing we add to our script is to tell Python what modules we will be using. We will be using the module re (Regular Expressions) to do fast searches, so we need to tell Python to load that module using the import instruction.

We then hardcode the Start/End of Frame tags we will be searching for, FFD8FFE0 and FFD9. The format may look a little strange, but basically it is a hex byte string, i.e. each hex byte is preceded with \x. The reason we do this is that the file we read in will be in that format, so it is easier to search for these tags in this format.

import re

JPEG_SOF = b'\xFF\xD8\xFF\xE0'
JPEG_EOF = b'\xFF\xD9'

Next we want to read in the file we want to search through. We could pass in the filename as an argument, but as we are trying to keep things simple we will hardcode the filename into our code. We use the open command with the name of the file we are carving from. We will use the data file Carve1.bin from the previous blog.

We use the 'rb' format, indicating we want to read 'r' a binary 'b' file. The open command returns a reference to our file, called a file object, which we call file_obj. Next we read the whole file into a variable called data. Don't try this with a massive file. We will show how to read in big files in later posts. We then want to close the file, which releases the reference to it so other programs can access it. Also make sure the file Carve1.bin is in the same directory as the Python script, otherwise we have to add path information to the filename.
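
The read step described above might look like the following minimal sketch. The names file_obj and data come from the text; the helper read_whole_file is hypothetical and is only there so the snippet can be tried on any small file.

```python
def read_whole_file(filename):
    # 'rb' = read, binary: we get the raw bytes back, not decoded text
    file_obj = open(filename, 'rb')
    data = file_obj.read()   # whole file in memory, so small files only
    file_obj.close()         # release the file so other programs can use it
    return data
```

With Carve1.bin sitting next to the script, data = read_whole_file('Carve1.bin') would load it ready for searching.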


This seems all pretty straightforward.

Now we have our data loaded in memory we can perform our search. This is where we use the re module. Basically we want to get a list of all the offsets in the data where we find our tags. The following commands return a list of these offsets.

SOF_list=[match.start() for match in re.finditer(re.escape(JPEG_SOF),data)]
EOF_list=[match.start() for match in re.finditer(re.escape(JPEG_EOF),data)]

If we run the script so far we can check what we have found.

>>> SOF_list
>>> EOF_list

So we have found the SOF tag at byte offset 4696 and the EOF tag at 11747.

Now all that is left for us to do is to get the data between these offsets and save it to a file. We will write the code assuming there could be more hits, so we can loop through them all and carve all the images in one go.

So we need a counter variable, which we will call i, to step through the lists. We then use a for loop to go through the SOF_list. We then want to get the JPEG image data from the hex byte string we read in from the file. We can do it simply by subdata=data[start:end]. So now we have the data, we just need to save it to a file. As before, I like to include the start offset and end offset in the name of the file, which we build in the variable carve_filename.

Now we just open that file with the 'wb' - write binary format. We update i with i=i+1 to then reference the next EOF_list offset. And we do a print statement to give some feedback to the user.

i=0
for SOF in SOF_list:
    i=i+1
    print ("Found an image and carving it to "+carve_filename)
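
Putting the steps above together, the complete carve loop might look like the following sketch. It is a reconstruction from the description rather than the original listing: the function name carve_jpegs and the prefix argument are hypothetical, zip is swapped in for the counter i, the +1 steps from the FF to the D9 of the end tag so both bytes are kept, and the offsets in the filename are written in hex as in the manual carve.

```python
import re

JPEG_SOF = b'\xFF\xD8\xFF\xE0'
JPEG_EOF = b'\xFF\xD9'

def carve_jpegs(data, prefix):
    # Offsets of every start tag and every end tag in the data
    SOF_list = [m.start() for m in re.finditer(re.escape(JPEG_SOF), data)]
    EOF_list = [m.start() for m in re.finditer(re.escape(JPEG_EOF), data)]
    carved = []
    # Pair the first start with the first end, the second with the second...
    for SOF, EOF in zip(SOF_list, EOF_list):
        last = EOF + 1                   # offset of the D9 byte
        subdata = data[SOF:last + 1]     # the end of a slice is exclusive
        carve_filename = '%s_%X_%X.jpg' % (prefix, SOF, last)
        carve_obj = open(carve_filename, 'wb')
        carve_obj.write(subdata)
        carve_obj.close()
        print("Found an image and carving it to " + carve_filename)
        carved.append(carve_filename)
    return carved
```

Given the offsets found above (4696 and 11747), carve_jpegs(data, 'Carve1') would name the file Carve1_1258_2DE4.jpg. Like the rest of the script there is no error checking, so a stray FFD9 before the first start tag would throw the pairing out.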

So that should do it. We can now save this file and run it.

Great, it works... so let's check the carved file.

And we are done. A 17 line image carver!

Sunday, December 31, 2017

Practical Exercise - Image Carving

So who's ready to carve?

Or as Gordon would say " Let's Carve or F#!K OFF "

In the last post we talked about some simple carving of a JPEG image file using a hex editor.

Before we get too carried away we should practice a couple of simple carves of images from 'unallocated'. What do I mean by 'unallocated', I hear you ask? Well...

There are a couple of approaches to carving and recovering files from file systems.

First is the "File System" approach. That is, we use the filesystem's knowledge of where the deleted file was to begin our journey of recovery.

For example, when a file is deleted in a FAT32 filesystem, the directory entry has the first byte of the entry overwritten with 'E5'. The directory entry still contains the filename (minus the first character), the filesize and the first cluster number. These can be vital to assist in the recovery process.
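
As a rough illustration of what survives in a deleted entry, here is a small Python sketch that pulls those three fields out of a 32 byte FAT short directory entry. The function name parse_deleted_entry and the sample entry are made up for this example; the field offsets (name at 0, first cluster high word at 20, low word at 26, size at 28) follow the standard FAT32 directory entry layout.

```python
import struct

def parse_deleted_entry(entry):
    # A FAT short directory entry is 32 bytes; E5 in byte 0 marks it deleted
    if len(entry) != 32 or entry[0:1] != b'\xE5':
        return None
    # The first character was overwritten by E5, so show it as '?'
    name = '?' + entry[1:8].decode('ascii', 'replace').rstrip()
    ext = entry[8:11].decode('ascii', 'replace').rstrip()
    # The first cluster number is split into a high and a low 16 bit word
    cluster_hi, = struct.unpack('<H', entry[20:22])
    cluster_lo, = struct.unpack('<H', entry[26:28])
    size, = struct.unpack('<I', entry[28:32])
    return {
        'name': name + ('.' + ext if ext else ''),
        'first_cluster': (cluster_hi << 16) | cluster_lo,
        'size': size,
    }

# Made-up entry for illustration: PICTURE1.JPG, first cluster 5, 7053 bytes
sample = (b'\xE5ICTURE1JPG' + b'\x20' + b'\x00' * 8 +
          struct.pack('<H', 0) + b'\x00' * 4 +
          struct.pack('<H', 5) + struct.pack('<I', 7053))
print(parse_deleted_entry(sample))
```

The size and first cluster tell us where the file started and how many clusters to expect, which is exactly the head start the "File System" approach relies on.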

For a valid file we could look up the cluster number in the FAT table and find all the fragments as each FAT entry points to the next cluster number.

However when a file is deleted the FAT table entries are zeroed so we cannot trace the file fragments. We will go through a worked example of this later.

The second technique for file recovery is to ignore the filesystem and treat the disk as one big block of data. We can either do this on the whole disk image or we can just export the unallocated portion of the disk. We can then use our knowledge of what type of file we are trying to recover to attempt to find the file/s in question.

So let's start with three simple image carves.

1. JPEG: Deleted, no thumbnails, not overwritten, unfragmented in free unallocated space.

2. JPEG: Deleted, no thumbnails, not overwritten, unfragmented in full unallocated space.

3. JPEG: Deleted, no thumbnails, not overwritten, fragmented in unallocated space.

Carve 1

Download the bin file from the GitHub

In a hex editor search for FFD8FFE0.

We find a search hit at 0x1258

Select the beginning of the block at the start of the JPEG at 0x1258. Now we search for the end of the file with the hex FFD9.

The D9 of FFD9 at the end of the file is at offset 0x2DE4. We select this as the end of the block. Copy the block out to a new file. In the filename I like to include 3 things: the file I am carving from, the start offset and the end offset. So let's call it Carve1_1258_2DE4.jpg and wallah... 

Carve 2

Download the bin file from the GitHub

Again we search for FFD8FFE0.

We find it at offset 13B6. In this second example we see that it is embedded in other data (other deleted or allocated files), this is more typical of what we might see.
Again we search for FFD9 for the end of file marker. It is at 0x2360. We select the block and copy it out into a new file. Carve2_13B6_2360.jpg.


This seems simple enough, just a search for the start and end and we have carved two deleted files of the Hex Ninja.

Carve 3

Download the bin file from the GitHub

Opening this unallocated blob we see something interesting...

For those who like to hexinate common files, it looks like an OLE Compound File (OLECF), the format used by Word, Powerpoint and Excel from 1997-2003. These have a distinct 8 byte header D0CF11E0A1B11AE1. For more info have a look at

So this example looks like there is another file in the unallocated space. But we will concentrate on the JPEG we are searching for. So we search for FFD8FFE0 as before.

Interesting to note that it is on a nice byte boundary of 0x2000 ie  8192 bytes or 16 sectors of 512 bytes. This will be important later but let's move on to carving the JPEG. Search for FFD9. We find it at 0x4424. We save it as Carve3_2000_4424.jpg.

Huh, this doesn't seem right. The first part looks like the devilishly handsome you know who!! But what happened to the rest. So let's look back at our file we carved out. If we scroll up from the bottom we see some weird stuff. We see some references to a directory structure "theme/theme/themeManager.xml" ...

That stuff should not be in our JPEG. So here is our Aha moment... no, not a 'Take On Me' Aha, more like a 'that's interesting' Aha.

Aha - Take On Me (1985)

We saw that the first part of unallocated was an OLE file, then we found our JPEG, but it looks like we may have some of the OLE file mixed into our JPEG, causing it to not decode properly. 

So now what could be happening. Perhaps FRAGMENTATION!!!. 

What is this fragmentation sorcery you speak of?

Well let's back up a bit first.

So for a new filesystem out of the box, we have a nice clean storage device. A new file would be stored in sequential blocks on the disk. We store a file in logical blocks called clusters. Each cluster is made up of a number of sectors, the smallest traditional hard disk unit (512 bytes). The cluster size is arbitrary, and the cluster is the smallest unit the operating system can address. For example, in a FAT32 filesystem a cluster may be 4 sectors (2048 bytes) or 8 sectors (4096 bytes) etc. 

So why isn't this fixed? 

Well, mainly because of a trade-off. If the cluster size is too big we can waste a lot of space, e.g. if our cluster is 32 kBytes and our file is 100 bytes we are wasting nearly 32 kBytes (slack space). But if we make each cluster really small, say 1 sector, the table needed to address all these clusters (the FAT) becomes a sizeable percentage of our storage, e.g. a 2TB disk using 1 sector per cluster would need 16GB of FAT to store all the cluster addresses, and there are 2 FATs on the disk for redundancy.
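
The two numbers in that trade-off can be checked with a few lines of Python. This is just the arithmetic, assuming 512 byte sectors and 4 bytes per FAT32 entry; a real FAT32 volume only uses 28 bits of each entry, so it could not actually hold this many clusters, but the sums show the scale of the problem.

```python
SECTOR = 512                      # bytes per sector
FAT32_ENTRY = 4                   # bytes per FAT32 table entry

# Slack: a 100 byte file stored in one 32 kByte cluster
cluster_size = 32 * 1024
file_size = 100
slack = cluster_size - file_size
print(slack)                      # 32668 bytes wasted

# FAT size for a 2TB disk with one sector per cluster
disk_size = 2 * 1024 ** 4
clusters = disk_size // SECTOR
fat_bytes = clusters * FAT32_ENTRY
print(fat_bytes // 1024 ** 3)     # 16 (GB for each copy of the FAT)
```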

So when we have many files and we delete some, create some new files and delete some more files, our disk becomes fragmented. When we go to save a file we have lots of gaps on our disk from the files that have been deleted, and the operating system would like to reuse them. The FAT file system stores the chain of cluster numbers for each file, e.g. 202,203,207,412,902 could be the non-sequential cluster numbers for a 5 cluster file. This is fine for an allocated file, but what happens when the file is deleted? The directory entry has its first byte overwritten with E5 and still stores the first cluster number, but the FAT entries are overwritten with zeros. 
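
To see why the zeroed FAT hurts, here is a toy Python model of following a cluster chain. The dictionary stands in for the FAT, and follow_chain and END_OF_CHAIN are names made up for this sketch; a real FAT32 marks the end of a chain with a range of values, so the single marker here is a simplification.

```python
END_OF_CHAIN = 0x0FFFFFFF         # simplified end marker for this sketch

def follow_chain(fat, first_cluster):
    # Each FAT entry holds the number of the next cluster in the file
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN and cluster != 0:
        chain.append(cluster)
        cluster = fat.get(cluster, 0)
    return chain

# Toy FAT holding the 5 cluster file from the text
fat = {202: 203, 203: 207, 207: 412, 412: 902, 902: END_OF_CHAIN}
print(follow_chain(fat, 202))           # [202, 203, 207, 412, 902]

# After deletion the entries are zeroed: only the first cluster is known
deleted_fat = dict.fromkeys(fat, 0)
print(follow_chain(deleted_fat, 202))   # [202]
```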

This is OK for a deleted file that has sequential cluster numbers, but for a typical file with non-sequential cluster numbers we are.... well... stuffed! 

The things we can use to our advantage are the cluster size and the type of file we are searching for. The cluster size is useful as we only need to look at cluster boundaries for the pieces of the file we are searching for. The file type is useful as we know what the data should look like. A text file and a ZIP file look very different in hex. 

Now back to our file. If we have a look at the highlighted section in our carved file, we remember that our OLE file was 0x2000 bytes long, which could be a clue to our cluster size. 0x2000 is 8192 bytes or 16 sectors. This is a good clue that our cluster size is 16 sectors, or a fraction of this, maybe 8 or 4.

So looking back through our data in Carve3.bin, if we step forward in multiples of 0x2000 bytes we see that, if our assumption of a cluster size of 0x2000 were true, the second cluster looks strange. Prior to 0x8000 is a run of all zeros, which is not normal for a sequential run of a JPEG, which usually has high entropy data.

So let's try half that cluster size: 0x1000 or 4096 bytes (8 sectors). Searching for FFD8FFE0, we found the start of the JPEG at 0x2000. If we then step forward one 'trial cluster' of 0x1000, we find that there is no continuity of the high entropy data we would normally see in the data part of a JPEG. So our initial assumption of a cluster size of 0x2000 was wrong. So let's move forward with a cluster size of 0x1000.

If we move forward from 0x3000 to 0x4000 we see some nice data that has high entropy again.
So it looks like our assumption of a cluster size of 0x1000 might be correct. If we move forward another cluster of 0x1000 we see we are not in JPEG type high entropy data anymore.
So maybe the JPEG finishes in this last cluster, i.e. the one starting at 0x4000. So let's search forward from 0x4000 looking for FFD9, and we find a hit at 0x4424. 
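
Judging "high entropy" by eye works, but it can also be given a number. The sketch below, which is not part of the original walkthrough, computes the Shannon entropy of each trial cluster; the function names and the stand-in demo data are assumptions for illustration. Compressed JPEG data should score close to 8 bits per byte, while runs of zeros score 0 and text or XML sits somewhere in between.

```python
import math
import os
from collections import Counter

def entropy(block):
    # Shannon entropy in bits per byte: 0.0 for constant data, up to 8.0
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return sum(c / total * math.log2(total / c) for c in counts.values())

def cluster_entropy(data, cluster_size=0x1000):
    # Yield (offset, entropy) for each trial cluster in the data
    for offset in range(0, len(data), cluster_size):
        yield offset, entropy(data[offset:offset + cluster_size])

# Stand-in data: one cluster of zeros followed by one of random bytes
demo = b'\x00' * 0x1000 + os.urandom(0x1000)
for offset, h in cluster_entropy(demo):
    print(hex(offset), round(h, 2))
```

Run over the bytes of Carve3.bin with a cluster size of 0x1000, a listing like this would show which clusters look like JPEG data and which do not.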
So let's try making up the file from:
0x2000 to 0x3000 and 
0x4000 to 0x4424
If we combine those parts we have a file Carve3_2000_3000_4000_4424.jpg. In a hex editor we simply copy the first part, 0x2000 to 0x3000, to a file, then we copy 0x4000 to 0x4424 and append that to our file. Now let's check the results.
Wow, that looks good if I do say so myself.... and my best profile too!
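
The same copy and append can be sketched in Python. The helper join_fragments is a made-up name, and the fragment list assumes, as in Carve 1, that 0x4424 is the offset of the D9 byte, so one is added to include it in the slice.

```python
def join_fragments(data, fragments):
    # fragments is a list of (start, end) offsets; end is exclusive,
    # the same as a Python slice
    return b''.join(data[start:end] for start, end in fragments)

# Assumed layout for Carve3.bin: the first JPEG cluster at 0x2000, then the
# last piece from 0x4000 up to and including the D9 byte at 0x4424
CARVE3_FRAGMENTS = [(0x2000, 0x3000), (0x4000, 0x4424 + 1)]

# Tiny stand-in to show the idea: keep 'AAAA' and 'BB', skip the junk
demo = b'AAAA' + b'junk' + b'BB'
print(join_fragments(demo, [(0, 4), (8, 10)]))   # b'AAAABB'
```

Calling join_fragments on the bytes of Carve3.bin with CARVE3_FRAGMENTS and writing the result out in 'wb' mode would give the same file as the hex editor method.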

So that was quite a hexinating journey. So what did we cover.  Carving a sequential JPEG from unallocated space right up to a fragmented carve. Good work. What you have learnt is the basis of every file recovery.

Until the next post TheHexNinja says:

Seasons Greetings All
Prosperous New Year Awaits 
Drink and Be Merry

Tuesday, July 26, 2016

Hex Editors Phoaar

The Hex Editor

OK, so our basic tool on this journey is the humble hex editor. But all is not so simple.  There are a plethora of hex editors available. Basically we want to be able to highlight an area of interest, save.... copy...paste.. cut..repeat....

The basic features you will be using a lot of are
  • Search: bytes in hex, locate, count, index, export address
  • Goto: both absolute and relative address.
  • Select: nice if they are right click 'start', right click 'end'
  • Cut, Copy, Insert Paste, Overwrite Paste
  • Hex/Decimal: be able to switch between these easily
You will be doing these functions a lot! So choose a hex editor that can do those functions easily or with shortcuts.

My favourite hex editors are (no affiliations or endorsements):


WinHex - Xways

Super fast, simple to use. All you really need for basic hex carving.
The basic personal license is ~$60 and well worth it.
For basic carving I really like the 'right click- beginning of block' , 'right click- end of block', Edit- Copy Block into new file - Walla.

WinHex Screenshot

Hex Workshop

Hex Workshop Screenshot
I like the coloured byte window....purrdy..., it is nice to help identify periodic patterns and you can pick up small changes in the data as you scroll through a file etc
License is $89.95
Copying and cutting blocks of data is a little cumbersome as you need to specify start address and either size or end address. Not a show stopper, but it does slow the Hex Ninja down when he has his flow on.

010 Editor
A bit more expensive but I like this one a lot for more complex operations and analysis
$129.95 or $49.95 for personal use
Has scripting capabilities and some nice file templates for parsing file structures

010 Editor Screenshot

Free Editors:

Notepad++ with the Hex Editor Plugin
Good if you like to keep the programming, hex editing all in one place.

Nice interface and has Mac version as well.

Although forensic tools have the ability to show the hex, the features are pretty limited (except for XWAYS -WinHex)

So.... What daily functions does Hex Ninja like to do in a hex editor?

The number one thing I do is check whether a given file is intact, corrupted etc. By opening a file in a hex editor we get to see what it really looks like and not what the file extension is labeling it as.

So open as many files as you can so you get to see the basic structure they have. If you first focus on JPG, PNG, MP4/MOV, AVI, DOC and PDF, you will be across most filetypes you want to recover, rebuild etc.
You will get so used to their structure and tags that you can recognise them in a stream of hex.

...there's way too much information to decode the Matrix. You get used to it, though. Your brain does the translating. I don't even see the code. All I see is blonde, brunette, redhead. Hey uh, you want a drink? -Cypher

For example, the most common file the Hex Ninja sees is the JPG, or more correctly the JPEG File Interchange Format (JFIF). JPG is the file extension, JFIF is the file container it is stored in. Let's hexinate a typical JPEG.

Hex View of JPEG

To do any basic carving we need to find the start of a file and the end of the file OR an embedded size so we can find the end. Let's take a quick look under the hood.

The basic structure in JFIF is a sequence of marker segments. Starting with FF followed by a byte defining the marker type. Depending on the marker there can be embedded data and nested marker segments. 
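
As a rough sketch of that structure, the Python below walks the marker segments at the front of a JPEG. It assumes every marker before the image data carries a 2 byte big-endian length (true for the usual APPn, DQT, SOF and DHT segments, though a few rare markers have none), and it stops at the Start Of Scan marker FFDA. The function name list_segments and the hand-made sample header are made up for illustration.

```python
import struct

def list_segments(data):
    # Walk the marker segments from the start of a JPEG up to the
    # Start Of Scan (FFDA), where the compressed image data begins
    segments = []
    if data[0:2] != b'\xFF\xD8':
        return segments                  # no SOI, not a JPEG
    segments.append((0, 0xD8, 0))        # SOI has no length field
    pos = 2
    while pos + 4 <= len(data):
        if data[pos:pos + 1] != b'\xFF':
            break                        # lost sync, stop walking
        marker = ord(data[pos + 1:pos + 2])
        # The length counts its own 2 bytes but not the 2 marker bytes
        length, = struct.unpack('>H', data[pos + 2:pos + 4])
        segments.append((pos, marker, length))
        if marker == 0xDA:
            break
        pos += 2 + length
    return segments

# Hand-made header: SOI, a 16 byte APP0 'JFIF' segment, then an SOS marker
sample = (b'\xFF\xD8' +
          b'\xFF\xE0\x00\x10JFIF\x00' + b'\x00' * 9 +
          b'\xFF\xDA\x00\x02')
for offset, marker, length in list_segments(sample):
    print(hex(offset), hex(marker), length)
```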

See for a basic overview or if you want to dig deeper.

The first 2 bytes 0xFFD8 indicate a 'Start Of Image' (SOI). 
If we just searched for the two bytes 0xFFD8 on a disk or in 'unallocated space' we would produce too many false hits. Generally, the longer and more specific the search term, the fewer false hits we will get. Two bytes is a little short, so we will see what follows that we could use in a search term. 

The next two bytes 0xFFE0 indicate a 'JFIF APP0 marker segment', which has embedded data such as the text 'JFIF'. While 0xFFD8FFE0 is generally common across all cameras/phones, I have seen a couple of cameras that didn't put APP0 first but put APP1 first, i.e. 0xFFD8FFE1, but that is rare so let's keep it simple.

Next we need to look for an embedded size or embedded file marker. 

Unfortunately there is no embedded size in the JFIF. We could technically decode the image as we carve to find the end, but that is a bit more intense, so let's start with finding the end. So we need to be looking for an end of file marker. In the JFIF specification it is End Of Image (EOI) 0xFFD9.... Really.. a two byte marker! That can lead to a lot of false positives. Why didn't they make it an 8 byte marker, or even 4 or 6 bytes would be better! 

There are a couple of issues we should be aware of so we can try and avoid false positives in a search and carve: 

1. There can be embedded thumbnail/s inside the JFIF file that have the same SOI and EOI markers. Yep good thinking JPEG working group! We can generally avoid this by ignoring the EOI if it occurs too soon after the SOI. We can also carve out the thumbnails in a more thorough carve to be done in later blogs. 
2. If the end of the file has been overwritten we may not find the EOI marker until the end of another image. We can avoid this by limiting how far we search for the EOI after the SOI. 
3. The image data may be fragmented. That is, cluster size blocks of the data can be intermingled with  other files. Generally we do not know the location or sequence of the clusters. We will practise these in a later blog post.  
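
The first two points can be sketched as a search window in Python. The function name find_jpeg_end and the default min_size and max_size values are arbitrary choices for illustration, not recommended settings, and skipping a fixed number of bytes is only a crude way of stepping over thumbnails.

```python
SOI = b'\xFF\xD8\xFF\xE0'
EOI = b'\xFF\xD9'

def find_jpeg_end(data, start, min_size=0x800, max_size=0x1000000):
    # Look for the EOI that closes the JPEG starting at 'start'.
    # An EOI closer than min_size is assumed to belong to a thumbnail,
    # and the search gives up after max_size bytes.
    search_from = start + min_size
    search_to = min(len(data), start + max_size)
    hit = data.find(EOI, search_from, search_to)
    if hit == -1:
        return None
    return hit + len(EOI)                # one past the last byte

# Stand-in data: a 'thumbnail' EOI close to the start, the real one later
demo = SOI + b'thumb' + EOI + b'x' * 20 + EOI
print(find_jpeg_end(demo, 0, min_size=16, max_size=64))   # 33
```

The returned value can be used directly as the end of a slice, data[start:end], to copy out the candidate image.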

The marker 0xFFD9 should not occur in the file unless it is the EOI (of the main image or thumbnails), ie we should not find it in the compressed image data (OK JPEG working group, at least you thought of that).  

Now back to our simple carve. We locate the 0xFFD9 indicating the end of the file.

JFIF EOI Marker 0xFFD9

So if we found what looked to be a JPEG in unallocated space or embedded in another file we can carve it out using the simple technique:
1. Search 0xFFD8FFE0, mark the first byte as the start of the block.
2. Search 0xFFD9, mark the last byte as the end of the block.
3. Copy the block into a new file, save it with a .jpg extension and you will have a carved JPEG.

Until the next post TheHexNinja says:

Bamboo bends in wind
Ninja watches you alone