PDF Carver (pdfcarve)







PDF Carver (pdfcarve) can be found on GitHub.

You can clone the repository directly from GitHub:

% git clone https://github.com/rondilley/pdfcarve.git

What is pdfcarve?

PDF Carver is a command-line PDF object decoder.  It enumerates PDF objects and extracts streams.
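
To make "enumerate PDF objects and extract streams" concrete, here is a minimal Python sketch of the general idea. It is not how pdfcarve is implemented. The names `enumerate_objects` and `extract_stream` are hypothetical, and the regular-expression scan is a simplification: it ignores object streams, encryption, filters other than FlateDecode, and binary stream data that happens to contain PDF keywords.

```python
import re
import zlib

# Matches an indirect object header such as "15 0 obj" (object number, generation).
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
# Captures the bytes between the "stream" and "endstream" keywords,
# dropping one optional end-of-line marker before "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def enumerate_objects(data: bytes):
    """Yield (number, generation, body) for each 'N G obj ... endobj' span."""
    for m in OBJ_RE.finditer(data):
        end = data.find(b"endobj", m.end())
        body = data[m.end():end if end != -1 else len(data)]
        yield int(m.group(1)), int(m.group(2)), body


def extract_stream(body: bytes):
    """Return the stream payload, inflated if zlib accepts it, else raw.

    Returns None when the object body has no stream.
    """
    m = STREAM_RE.search(body)
    if m is None:
        return None
    raw = m.group(1)
    try:
        # decompressobj tolerates trailing bytes after the compressed data.
        return zlib.decompressobj().decompress(raw)
    except zlib.error:
        return raw  # not FlateDecode (or damaged): hand back the raw bytes
```

Reading a file with `open(path, "rb").read()` and looping over `enumerate_objects(data)` gives a rough object listing. A real decoder also has to tokenize dictionaries and honor the `/Filter` and `/DecodeParms` entries.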

Once built, just pass the name of a suspect PDF as an argument to pdfcarve and you are in business!

syntax: pdfcarve [options] {filename} [{filename} ...]
 -d|--debug (0-9)     enable debugging info
 -h|--help            this info
 -v|--version         display version information
 -w|--write           write streams to disk


An excerpt of the output from a run is shown below in gory detail.

Comment: PDF-1.6
Comment: âãÏÓ
Object: 15 0
Name: Linearized
Integer: 1
Name: L
Integer: 9168
Name: O
Integer: 28
Name: E
Integer: 3653
Name: N
Integer: 1
Name: T
Integer: 8850
Name: H
Integer: 457
Integer: 182
Object: 22 0
Name: DecodeParms
Name: Columns
Integer: 4
Name: Predictor
Integer: 12
Name: Filter
Name: FlateDecode
Name: ID
HEX String: 582A3DB73C0EAB408F658D8613D1CACA
00000000 58 2a 3d b7 3c 0e ab 40 8f 65 8d 86 13 d1 ca ca X*=.<..@.e......
HEX String: 989BC77CF0D27C438C2D81FDE0BA3812
00000000 98 9b c7 7c f0 d2 7c 43 8c 2d 81 fd e0 ba 38 12 ...|..|C.-....8.
Name: Index
Integer: 25
Integer: 22
Name: Info
ObjRef: 24 0
Name: Length
Integer: 55
Name: Prev
Integer: 8851
Name: Root
ObjRef: 16 0
Name: Size
Integer: 47
Name: Type
Name: XRef
Name: W
Integer: 1
Integer: 2
Integer: 1
Stream: 27 bytes
Name: Linearized
Name: L
Name: O
Name: E
Name: N
Name: T
Name: H
Name: DecodeParms
Name: Columns
Name: Predictor
Name: Filter
Name: FlateDecode
Name: ID
Name: Index
Name: Info
Name: Length
Name: Prev
Name: Root
Name: Size
Name: Type
Name: XRef
Name: W
<... snip ...>

Why use it?

I built this tool to help analyze suspect PDF files.  I have used it many times to find the mechanism the bad guys used to execute code or compromise a system with a malicious payload.  It is also handy for extracting files from PDFs.

What is in the works?

I am working on improving the processing of dictionary objects and adding some heuristics, including checks for stream anomalies, object corruption, object linkage and reference errors, and object/document revision dissection.
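
As an illustration of what a reference-error heuristic might look like, the Python sketch below flags references of the form `N G R` that point at objects never defined in the file. It is only a sketch, not the planned pdfcarve implementation, and the function name `dangling_references` is hypothetical. It scans raw bytes, so objects stored inside compressed object streams are reported as missing and binary stream data can produce false matches. Treat hits as leads rather than verdicts.

```python
import re

# "15 0 obj" defines an indirect object; "24 0 R" references one.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
REF_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def dangling_references(data: bytes):
    """Return sorted (number, generation) pairs referenced but never defined."""
    defined = {(int(n), int(g)) for n, g in OBJ_RE.findall(data)}
    referenced = {(int(n), int(g)) for n, g in REF_RE.findall(data)}
    return sorted(referenced - defined)
```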


Please report issues to webmaster@uberadmin.com

Last updated: 2016-03-28 @ 8:59pm