Introduction

This article has a similar goal to tmairi’s original post about the AccessData logical format: the format is widely used in challenges, yet barely any Linux tool exists to handle it.

This article focuses on the findings of my research into the format. The research and experiments were much the same as tmairi’s: generating several samples and comparing the differences. For convenience, I studied the hexdumps of the files with HxD instead of a bash script.

A few notes to understand the tables: *hexOffset means the value stored at that address in the file, and <4 denotes a 4-byte value in little endian. NB: All the addresses present in AD logical images have to be offset by 512 bytes to account for the “ADSegmented” header, which is 512 bytes long.

AccessData headers

An AD1 logical image holds two headers that describe the structure and basic metadata of the container: one describing the segmentation, starting with the ADSEGMENTEDFILE signature, and the AD logical header, starting with the ADLOGICALIMAGE signature.

Those two headers are described below.

ADsegmented header

Offset	Length	Value	Description
0x0	16	ADSEGMENTEDFILE\0	Segmented file signature - Ascii
0x18	<4		Segment index
0x1c	<4		Total segment number
0x22	<4		Fragmentation size
0x28	<4	512	Header size?

NB: When the image is segmented across several files, the data is stored sequentially across several .adX files. The actual data in those files doesn’t have any special header or footer. For example, if the image is segmented across three files, the second one has neither the ADLOGICALIMAGE header nor the two footers ATTRGUID/LOCSGUID.

Much of this header is left unused; only a handful of addresses hold actual information.
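The table above can be turned into a small parser. This is a minimal sketch, not a reference implementation: the names `SegmentHeader` and `parse_segment_header` are made up for illustration, and reading the four fields as unsigned integers is an assumption.

```python
import struct
from typing import NamedTuple

SEGMENT_SIGNATURE = b"ADSEGMENTEDFILE\x00"  # 16 bytes, from the table


class SegmentHeader(NamedTuple):
    segment_index: int
    segment_count: int
    fragment_size: int
    header_size: int


def parse_segment_header(buf: bytes) -> SegmentHeader:
    """Read the documented fields of the ADSEGMENTEDFILE header."""
    if len(buf) < 0x2C:
        raise ValueError("buffer too short for an ADSEGMENTEDFILE header")
    if buf[0x00:0x10] != SEGMENT_SIGNATURE:
        raise ValueError("missing ADSEGMENTEDFILE signature")
    # "<I" reads a 4-byte little-endian value (the "<4" notation of the
    # tables); treating it as unsigned is an assumption.
    offsets = (0x18, 0x1C, 0x22, 0x28)
    fields = [struct.unpack_from("<I", buf, off)[0] for off in offsets]
    return SegmentHeader(*fields)
```

Typical use would be to read the first 512 bytes of a .ad1 file and pass them to `parse_segment_header`; the returned `header_size` is expected to be 512 in the samples described here.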

AD logical image header

Offset	Length	Value	Description
0x200	16	ADLOGICALIMAGE\0\0	Logical image signature - ascii
0x210	<4		Image version, can only be 0x03 or 0x04
0x214	<4	0x01	Usage unknown
0x218	<4		Zlib chunk size, default seems to be 65536
0x21C	<8		Logical image metadata address, 0x00 for custom content image
0x224	<8		First data item address
0x22C	<4		Data source name length, 0x1D for custom content images (len(“Custom Content Image([Multi])”))
0x230	<4	AD\0\0	Some signature
0x234	<8		Data source name address, will generally be 0x5C
0x23C	<8		`ATTRGUID` signature address
0x24C	<8		`LOCSGUID` signature address
0x25C	*0x22C or 0x1D		Data source name, or “Custom Content Image([Multi])”
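As a sketch of how the logical image header could be read with the offsets from the table: the names `LogicalHeader` and `parse_logical_header` are hypothetical, the integers are assumed to be unsigned, and decoding the data source name as UTF-8 is a guess (the samples only show ASCII names).

```python
import struct
from typing import NamedTuple

SEGMENT_HEADER_SIZE = 0x200  # the 512-byte ADSEGMENTEDFILE header


class LogicalHeader(NamedTuple):
    version: int
    chunk_size: int
    metadata_addr: int
    first_item_addr: int
    name_length: int
    name_addr: int
    attrguid_addr: int
    locsguid_addr: int
    name: str


def parse_logical_header(buf: bytes) -> LogicalHeader:
    """Parse the ADLOGICALIMAGE header located right after the segment header."""
    if buf[0x200:0x210] != b"ADLOGICALIMAGE\x00\x00":
        raise ValueError("missing ADLOGICALIMAGE signature")
    # 0x210 version, 0x214 unknown (0x01), 0x218 zlib chunk size
    version, _unknown, chunk_size = struct.unpack_from("<III", buf, 0x210)
    if version not in (3, 4):
        raise ValueError(f"unexpected image version {version}")
    # 0x21C metadata address, 0x224 first data item address
    metadata_addr, first_item_addr = struct.unpack_from("<QQ", buf, 0x21C)
    (name_length,) = struct.unpack_from("<I", buf, 0x22C)
    (name_addr,) = struct.unpack_from("<Q", buf, 0x234)
    (attrguid_addr,) = struct.unpack_from("<Q", buf, 0x23C)
    (locsguid_addr,) = struct.unpack_from("<Q", buf, 0x24C)
    # Stored addresses do not include the 512-byte segment header.
    start = name_addr + SEGMENT_HEADER_SIZE
    name = buf[start:start + name_length].decode("utf-8", errors="replace")
    return LogicalHeader(version, chunk_size, metadata_addr, first_item_addr,
                         name_length, name_addr, attrguid_addr, locsguid_addr,
                         name)
```

Following `name_addr` instead of hardcoding 0x25C is a design choice: the table says the address is generally 0x5C, so the pointer is the safer thing to trust.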

~~From there the data will differ a bit depending on the format, logical or custom content.~~

At that point I had prepared a paragraph on the custom content image header, which holds information about every source of data, only to realize afterward that it follows the very same structure as a folder, with much less metadata. This is hinted at by the fact that FTK Imager considers them “Folder (placeholders)”. So the structure below is valid, but somewhat redundant with the folder structure.

Custom Content Image Header

When the image is created through the custom content process, it contains a 48-byte header for each source of data (evidence item in FTK Imager), the first one starting at offset 0x279. The breakdown is below:

Offset	Length	Value	Description
0x00	<8		Index of next custom content header
0x08	<8		Index of the first file from the source
0x10	<8		Custom content metadata
0x18	<16		Probably padding, tests have shown to be all 0
0x28	<4	0x05	A custom content image is treated as a folder
0x2C	<4		Length of the data source name string
0x30	*0x2C		Data source name
+*0x2C	<8		Parent folder address, always 0x00 since custom content image headers are technically top level folders

While the actual structure is identical to that of regular folders, its metadata differs somewhat.

Files and folders

As pointed out by tmairi, files and folders are placed sequentially within the AD1 image. Three addresses keep track of the hierarchy of files and folders:

- One pointing to the next item on the same hierarchical level
- One pointing to the first item within a folder (exclusive to folders; files have this address set to 0x00, surprising I know)
- One pointing to the parent folder, placed directly after the current item name

File and folder structure

Each file’s structure starts with a 48-byte header holding information about the file’s structure, data, and metadata. This structure is explained below:

Offsets are relative to the beginning of the item’s header. All addresses contained in it must be offset by 512 bytes to account for the ADSegmented header.

Offset	Length	Description
0x00	<8	Next item on same hierarchy level
0x08	<8	First item within the folder, 0x0 for files
0x10	<8	Address of first metadata structure
0x18	<8	Address of the zlib chunk metadata, 0x00 if the file is empty or the item is a folder
0x20	<8	Decompressed file size. 0x00 if there’s no data
0x28	<4	Type, 0x05 for folder, 0x00 for files
0x2C	<4	File name length
0x30	*0x2C	Filename
0x30+*0x2C	<8	Parent folder index, 0x0 if at the root
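The header layout and the three hierarchy addresses can be combined into a small tree walker. This is a sketch under a few assumptions: `Item`, `parse_item` and `walk` are illustrative names, a "next item" address of 0 is taken to mean the end of a level, and names are decoded as UTF-8.

```python
import struct
from typing import Iterator, NamedTuple, Tuple

SEGMENT_HEADER_SIZE = 512  # stored addresses exclude the ADSegmented header


class Item(NamedTuple):
    next_addr: int         # 0x00: next item on the same level
    first_child_addr: int  # 0x08: first item within the folder, 0 for files
    metadata_addr: int     # 0x10: first metadata structure
    zlib_addr: int         # 0x18: zlib chunk metadata, 0 if no data
    size: int              # 0x20: decompressed size
    item_type: int         # 0x28: 0x05 folder, 0x00 file
    name: str
    parent_addr: int       # right after the name, 0 at the root


def parse_item(buf: bytes, addr: int) -> Item:
    """Parse the 48-byte item header, the name and the parent address."""
    base = addr + SEGMENT_HEADER_SIZE
    (next_addr, child, meta, zlib_addr, size,
     item_type, name_len) = struct.unpack_from("<QQQQQII", buf, base)
    name_start = base + 0x30
    name = buf[name_start:name_start + name_len].decode("utf-8", "replace")
    (parent,) = struct.unpack_from("<Q", buf, name_start + name_len)
    return Item(next_addr, child, meta, zlib_addr, size, item_type, name, parent)


def walk(buf: bytes, addr: int, depth: int = 0) -> Iterator[Tuple[int, Item]]:
    """Yield (depth, item) for every item, depth-first, starting at addr."""
    while addr != 0:  # assumption: 0 marks the end of a level
        item = parse_item(buf, addr)
        yield depth, item
        if item.first_child_addr:
            yield from walk(buf, item.first_child_addr, depth + 1)
        addr = item.next_addr
```

Starting `walk` at the first data item address from the logical image header should list the whole tree; a robust tool would also guard against cycles and truncated images, which this sketch skips.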

Zlib chunks

Each file is split into N zlib-compressed chunks of 65536 bytes (the chunk size given in the logical image header). The address at 0x18 points to a value holding the number of zlib chunks the file uses. The next N long (8-byte) values hold the address of each zlib chunk, in order. A final 8-byte value points to the beginning of the metadata, marking the end of the zlib chunks. This last value is identical to the 0x10 value from the header.

All compression levels have zlib headers, even if the compression level is 0.

Directly after the data comes a linked list of the metadata related to the file. Each entry follows the structure detailed in #Metadata structure. The next file/folder starts directly after the last metadata entry.
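Reading a file's content from the chunk table could look like the sketch below. Two assumptions are baked in and not confirmed by the samples: the chunk count is itself an 8-byte value, and chunks are stored back to back, so each chunk ends where the next address (or the final metadata address) begins. `read_file_data` is a hypothetical helper name.

```python
import struct
import zlib

SEGMENT_HEADER_SIZE = 512  # stored addresses exclude the ADSegmented header


def read_file_data(buf: bytes, zlib_meta_addr: int) -> bytes:
    """Decompress and concatenate every zlib chunk of one file."""
    if zlib_meta_addr == 0:
        return b""  # empty file or folder
    base = zlib_meta_addr + SEGMENT_HEADER_SIZE
    # Assumption: the chunk count is stored as an 8-byte little-endian value.
    (count,) = struct.unpack_from("<Q", buf, base)
    # N chunk addresses, followed by the address of the first metadata entry.
    addrs = struct.unpack_from(f"<{count + 1}Q", buf, base + 8)
    out = bytearray()
    # Assumption: chunks are contiguous, so each one spans [start, end).
    for start, end in zip(addrs, addrs[1:]):
        chunk = buf[start + SEGMENT_HEADER_SIZE:end + SEGMENT_HEADER_SIZE]
        out += zlib.decompress(chunk)
    return bytes(out)
```

Because every chunk carries a zlib header even at compression level 0, `zlib.decompress` can be used uniformly. Comparing the length of the result with the decompressed size at 0x20 in the item header is a cheap sanity check.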

Metadata structure

This header is used at the start of a logical image, directly after the previous header for a simple image, or after every custom content header. Several such headers may follow each other, all linked by the first 8 bytes pointing to the next.

Offset	Length	Description
0x00	<8	0x0 or an offset to the next same type header
0x08	<4	Category
0x0C	<4	Keys
0x10	<4	Data length
0x14	*0x10	Data, a non \0 terminated string, not in little endian
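Walking this linked list is straightforward. The sketch below uses made-up names (`MetadataEntry`, `read_metadata`), assumes unsigned fields, and keeps the data as raw bytes since its interpretation depends on the category and key.

```python
import struct
from typing import List, NamedTuple

SEGMENT_HEADER_SIZE = 512  # stored addresses exclude the ADSegmented header


class MetadataEntry(NamedTuple):
    category: int
    key: int
    data: bytes


def read_metadata(buf: bytes, addr: int) -> List[MetadataEntry]:
    """Follow the linked list of metadata entries starting at addr."""
    entries = []
    seen = set()
    while addr != 0 and addr not in seen:  # stop on 0x0 or on a loop
        seen.add(addr)
        base = addr + SEGMENT_HEADER_SIZE
        # 0x00 next entry, 0x08 category, 0x0C key, 0x10 data length
        next_addr, category, key, length = struct.unpack_from("<QIII", buf, base)
        data = buf[base + 0x14:base + 0x14 + length]
        entries.append(MetadataEntry(category, key, data))
        addr = next_addr
    return entries
```

The `seen` set is only a safety net against corrupted images where an entry points back to an earlier one.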

Metadata breakdown

After a bunch of hours spent trying to understand the two middle values of the metadata, I discovered that pcbje wrote documentation about the format, which is consistent with the image headers and file structure that I had previously found. I also follow his classification of metadata and complete it with the values I found.

Category
- Key
  - Eventually its value if it holds special meaning
0x01 - Item content hash
- 0x5001 - MD5
- 0x5002 - SHA1
- 0x2001 - File type and posix perms, separated by a comma (eg : dir,rwx——w) Seems to be applied to APFS
- 0x10002 - Data source name
0x02 - Item type
- 0x01 - Samples have all shown the value to be “2001D”; only present for custom content headers
- 0x02 - Item type as string
  - 0x31 - Regular file
  - 0x32 - Filesystem image? Also applied to logical image, would be “Folder (placeholder)” in ftkimager
  - 0x33 - Regular folder
  - 0x34 - Filesystem metadata?
  - 0x36 - FileSlack
  - 0x39 - Symbolic link
  - There are probably other values that will be found later
0x03 - Item size
- 0x03 - File size as a string
- 0x04 - Some max filesize?
- 0x2002 - No idea what that is, haven’t seen a value other than “0”
  - 0x30 - The only value that was found there
- 0x2003 - Same as above
0x04 - Windows flags(?)
- 0x01 - Seems to be only “0”s
- 0x0D - True/False - Windows encrypted Flag
- 0x0E - True/False - Windows compressed flag
- 0x1E - True/False - May be the “allow indexation” flag, but it doesn’t change. Consistently True between samples
- 0x1002 - True/False - Windows hidden flag
- 0x1003 - True/False - No idea what that is, Consistently False between samples
- 0x1004 - True/False - Read only windows flag
- 0x1005 - True/False - Windows “Ready to archive” flag
0x05 - Timestamp (stored in this format: 20240309T114333.669684 - YYYYMMDDTHHmmss.mmmµµµ)
- 0x07 - Access timestamp
- 0x08 - Modified timestamp
- 0x09 - Change timestamp
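To make the breakdown above a bit more concrete, here is a sketch that labels a (category, key) pair and parses the timestamp format. The lookup table only covers a few of the keys listed above, `decode_metadata` is a hypothetical helper, and decoding the data as UTF-8 is an assumption. The samples do not reveal a time zone, so the timestamp is returned as a naive datetime.

```python
from datetime import datetime

# A subset of the (category, key) pairs listed above.
METADATA_NAMES = {
    (0x01, 0x5001): "MD5",
    (0x01, 0x5002): "SHA1",
    (0x03, 0x03): "File size",
    (0x05, 0x07): "Access timestamp",
    (0x05, 0x08): "Modified timestamp",
    (0x05, 0x09): "Change timestamp",
}


def decode_metadata(category: int, key: int, data: bytes):
    """Return (label, value) for one metadata entry."""
    label = METADATA_NAMES.get((category, key),
                               f"unknown {category:#x}/{key:#x}")
    text = data.decode("utf-8", errors="replace")
    if category == 0x05:
        # YYYYMMDDTHHmmss followed by microseconds, e.g. 20240309T114333.669684
        return label, datetime.strptime(text, "%Y%m%dT%H%M%S.%f")
    return label, text
```

Unknown pairs are kept with a generic label rather than dropped, which makes it easier to spot new keys when testing against other samples.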

As tmairi noted, the AD1 format holds two footer sections, ATTRGUID and LOCSGUID. Those sections are unexplored as of yet and will be covered in a future post if they hold valuable data. Probably hashing information or something.

Conclusions and further work

I hope this article helped you understand the format of AccessData logical images a bit better. All this information should be enough to build a toolset to extract/mount data from AD1 images. It may be lacking information about certain metadata categories/keys, such as those that appear when an Apple filesystem is included, as in one of the Cyberdefenders samples, but those will be added along the way.

The next step will be to make the tool. I’ll try to write it in C, covering verification, extraction, and mounting with FUSE. The GitHub project, AD1-Tools, will be opened when I have a first working version.

I’ll try to make better visuals in the future, similar to the Invoke-Ir forensic posters, it will be easier to understand if we link the tables above to actual data.

AD1ventures Part 1 - An AD1 format (mostly) comprehensive breakdown