Plists, XML and XPATH – A Series

reblogged from 

 

  1. http://mobileforensics.wordpress.com/2012/11/12/plists-xml-and-xpath-a-series/
  2. http://mobileforensics.wordpress.com/2012/11/19/plists-xml-and-xpath-a-series-pt-2/
  3. http://mobileforensics.wordpress.com/2012/11/30/plists-xml-and-xpath-a-series-pt-3/
  4. http://mobileforensics.wordpress.com/2012/12/05/plists-xml-and-xpath-a-series-pt-4/

 

I’ve been doing some research into the various data storage methods on smart phones and found myself getting engrossed in plists. Though I’ve mentioned them in classes and we’ve talked about how they are constructed and the various roadblocks to extracting information from them, I’d never really done an in-depth module or exercise on them.

Well, I hope with this series to change that omission. At the end of the series, I’ll provide a link to all the posts gathered into one paper. Now without further ado, here is the first of the series on Plists, XML and XPATH.

What is XML?

 

Although the term XML is thrown around in forensic classes and seen as an option for analysis output in many of the major forensic tools, how many examiners understand how XML is constructed and the rules that apply to it? As it turns out, XML isn’t all that hard to understand, and having a grasp of how it’s used to store data is useful for understanding plists and other files stored on digital devices (such as current.gpx on Garmin GPS units) that use the XML format.
Let’s look at the building blocks that make up XML and some of the rules that govern them.

Definition of XML

First, let’s define the term XML. XML stands for Extensible Markup Language. The language is an official recommendation of the World Wide Web Consortium (W3C). XML is a metalanguage that allows for the creation and formatting of documents; it is in common use on the Internet and underlies the default file formats of many office productivity suites, including Microsoft Office and Apple iWork.

XML Terminology

Next, let’s discuss some terminology used in XML so we can understand when these terms are used in later discussions when we are looking at and reading plists. This is by no means all the terminology that is used in XML; rather these terms are covered here in order to give an examiner a working knowledge of items that may be encountered when working on XML formatted evidence.

Elements – XML is made up of one or more elements. An element consists of two tags: an opening tag, which is the name of the element delimited by a less-than sign (“<”) and a greater-than sign (“>”), and a closing tag, which is the same as the opening tag except that there is a forward slash (“/”) before the element name. An example of an element is <key>MAC Address</key>. The text between the two tags is considered part of that element and is processed per the element’s rules.
Attribute – An element can have an attribute that serves to modify or refine the default meaning of the element. Attributes can also be applied to empty elements, which are used to provide non-textual content or give additional information to the application that is parsing the XML. Here is an example of a picture element with the src attribute: <picture src="…"></picture>. Because the element is empty, this could also be written in shorthand as <picture src="…"/>.
Declaration – Most XML documents begin by declaring information about themselves for a processing program, as in the following example: <?xml version="1.0" encoding="UTF-16"?>. This tells a parsing program that the XML document uses the version 1.0 format and is encoded as UTF-16 Unicode.
Document Type Definition (DTD) – This is an external file that specifies the rules for how all the elements, attributes and other data are defined and related. Below is Apple’s DTD for plists (also located at http://www.apple.com/DTDs/PropertyList-1.0.dtd).

Apple’s Plist DTD

 

Root Element – This is the outermost element to which the DTD applies and usually marks the start and end points of the document. An example for a plist would be <plist>.
CDATA – This stands for “character data.” Anything that occurs inside a CDATA section is not parsed as markup and is treated as plain text.
PCDATA – This stands for “parsed character data” and means that any character data that is not an element can appear between the tags. In the above Apple DTD, <!ELEMENT key (#PCDATA)> means that any characters such as “WebBookmarkType” can show up between the key element tags, but another tag cannot.

 

Here is the second installment of the series that came out of my research into plists. I should have placed a references section at the end of the first post, and I apologize for not including that. It appears at the end of this post. Without further ado, here is part two, in which we continue our brief overview of XML.

Special XML Markups and Syntax Rules

When discussing XML basics we should also cover some special markup constructs that you may encounter.

<?xml…?> – As we have seen in the previous section, this is the XML declaration and can take attributes such as encoding or version

<!--…--> – This construct is used for comments, and anything occurring inside it is ignored.

<!DOCTYPE…> – We have seen this before in the discussion of the DTD. This allows for the specification of the DTD. It generally takes one of two forms: SYSTEM, which specifies the URI of a DTD for private use, as in <!DOCTYPE … SYSTEM "http://www.mygreatsite.com/dtd/mydoc.dtd">, or PUBLIC, which is used when the DTD has been publicized for widespread usage. We have seen a use of the PUBLIC specification in the Apple DTD above.

Finally, we will conclude our look at XML with the rules for well-formed XML:

  • All attribute values must be enclosed in quotation marks
  • All elements must have a closing tag
  • XML tags are case sensitive
  • XML elements must be properly nested
               Example incorrect - <b><i>This text is bold and italic</b></i>
               Example correct - <b><i>This text is bold and italic</i></b>
  • XML Documents must have a root element (we will cover this in the next section)
  • White space is preserved in XML
  • XML stores a new line as a line feed
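
The nesting and case-sensitivity rules above can be checked with any XML parser. Here is a minimal sketch using Python’s standard xml.etree.ElementTree module; the helper name is_well_formed is a hypothetical label chosen for this illustration, not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The two nesting examples from the list above.
bad = "<b><i>This text is bold and italic</b></i>"
good = "<b><i>This text is bold and italic</i></b>"

print(is_well_formed(bad))    # improperly nested tags are rejected
print(is_well_formed(good))   # properly nested tags are accepted
```

The same helper also rejects a tag pair that differs only in case, such as <Key>…</key>, because XML tags are case sensitive.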

Tree Structure

XML documents must have a root element. The root element is considered the “parent” of all other elements. The elements form a tree that starts at the root element and branches out to the lowest level of the tree.

All the elements in an XML document can have sub-elements:

<root>
    <child>
       <subchild>.....</subchild>
    </child>
</root>

Let’s look at an example


Figure One: Example XML Tree

In the previous example, our root element is <bookstore>. Any <book> elements reside inside of the <bookstore> element. Looking at our <book> element we see that it has four children – <title>, <author>, <year> and <price>.

Notice in the screen capture that the root element (<bookstore>) is called the “parent,” as we stated before; the next element, <book>, is called the “child”; and the child elements of <book> are called “siblings.” These concepts are important, as they will be discussed in our short introduction to XPATH, a language that can be used to find information in an XML document.
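
To make the parent, child and sibling relationships concrete, here is a small sketch that walks a tree of the shape described above with Python’s standard xml.etree.ElementTree module. The document text is a hypothetical stand-in for the screen capture, and the element values are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped like the tree described above;
# the values are placeholders, not taken from the original figure.
doc = """<bookstore>
  <book>
    <title>Example Title</title>
    <author>Example Author</author>
    <year>2012</year>
    <price>10.00</price>
  </book>
</bookstore>"""

root = ET.fromstring(doc)                   # the root ("parent") element
book = root.find("book")                    # a child of the root
children = [child.tag for child in book]    # the four sibling elements

print(root.tag)
print(children)
# ElementTree accepts a limited XPath-style path for simple lookups.
print(root.findtext("./book/title"))
```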

I hope this installment was useful to you in your forensic endeavors and research. Check back next week for the third installment.

References

Apple Inc. (2012). Mac OS X Reference Library, Manual Page for PLIST(5) [Online]. Available: https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man5/plist.5.html [October 23, 2012]

Caithness, Alex (2010). Property Lists in Digital Forensics. CCL Solutions Group Ltd: Stratford-upon-Avon, UK. Available: http://www.cclgroupltd.com/images/property%20lists%20in%20digital%20forensics%20new.pdf

Eckstein, Robert & Casabianca, Michel (2001). XML Pocket Reference (2nd edition). Sebastopol, CA: O’Reilly and Associates Inc.

Erack Network (2012). XPath – Predicates [Online]. Available: http://www.tizag.com/xmlTutorial/xpathpredicate.php [November 1, 2012]

Wikimedia Foundation (2012). Wikipedia: XML [Online]. Available: http://en.wikipedia.org/wiki/XML [October 30, 2012]

W3Schools (2012). Extensible Markup Language Tutorial (XML) [Online]. Available: http://www.w3schools.com/xml/ [October 24, 2012]

World Wide Web Consortium (2012). Extensible Markup Language (XML) [Online]. Available: http://www.w3.org/XML/ [October 24, 2012]

W3Schools (2012). XPATH Tutorial [Online]. Available: http://www.w3schools.com/xpath/default.asp/ [October 28, 2012]

 

Having now done a cursory overview of XML, I’d like to turn my attention to property lists, or plists as they are commonly known. Plists, according to Wikipedia (http://en.wikipedia.org/wiki/Property_list), are files that are used to store serialized objects (read: data). Very often they are used to store application and user settings. They are a rich source of forensic data that is, at least in my opinion, little understood and under-exploited.

I will be concentrating on binary plists, as this is the most common format encountered in iOS, and will be using as a launch point the excellent paper “Property Lists in Digital Forensics” by CCL Forensics’ Alex Caithness (you can find a link to the paper in the References). My aim in the next few posts is to illuminate Caithness’ work and break it open in the hopes that it will be understandable to a wider audience.

I have to confess the motivation behind this was slightly selfish. I myself had some trouble following the work and once I had “cracked the code” so to speak, thought it might be useful for others to benefit from a more in-depth discussion of Alex’s work.

So without further fanfare, here is part three, which concerns binary plists.

Binary Plists

 

Caithness points out in “Property Lists in Digital Forensics” that the binary plist is constructed of four distinct parts (Caithness, p. 4). Furthermore, he presents them in the order in which they should be read to interpret the file. I summarize his findings below.

The file starts out with a recognizable header. This header comprises the first eight bytes of the file and is the ASCII string “bplist00” (0x62 0x70 0x6C 0x69 0x73 0x74 0x30 0x30), which identifies the file format and the version.

The trailer of the file consists of the final 32 bytes. It contains data that is needed to read the file properly. The trailer will be discussed in detail later as we traverse a binary plist and read it.

The offset table, which will also be discussed later, contains the offsets (locations within the file) of the objects in the object table, which holds the data of the file.

The final part of the file as was mentioned above is the object table. This is the “meat” of the file, which contains the binary encoded data of each object or element in the plist. Like the trailer and offset table we will deal with the unique features of objects in a following section.
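
As a rough sketch of how the fixed-size parts can be sliced off a file, the following Python uses the standard plistlib module to build a small binary plist in memory. That generated file is only a stand-in for bookmarks.plist, which is not reproduced here; the 8-byte header and 32-byte trailer sizes come from the description above.

```python
import plistlib

# Stand-in data: a tiny binary plist built in memory, because the
# bookmarks.plist file used in this series is not reproduced here.
data = plistlib.dumps(
    {"WebBookmarkType": "WebBookmarkTypeList"}, fmt=plistlib.FMT_BINARY
)

header = data[:8]       # file format and version
trailer = data[-32:]    # final 32 bytes
body = data[8:-32]      # object table followed by the offset table

print(header)           # b'bplist00'
print(len(trailer))     # 32
```

Where the object table ends and the offset table begins inside the remaining bytes cannot be known until the trailer has been read.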

We will be using the bookmarks.plist file that is located here .

Finding the trailer on an existing plist is relatively straightforward. Since we know that the trailer is 32 bytes in length (Caithness, p. 4), we can sweep the bytes backward from the end of the file until we reach a count of 32.

 Location of Binary Plist Trailer


Now that I have located the trailer, I like to copy and paste the selection into a new hex file, as seen in the next image, so I can refer to its offsets in a separate window and do not have to keep moving back and forth in the file.

Binary Plist Trailer in separate file


We are now set to parse the trailer to locate its key elements and find the location of the offset table of the plist, which will enable us to parse the objects contained in the rest of the file.

The below table is a key to parsing out the file – this has been adapted from Alex Caithness’ table found on page 4 of “Property Lists in Digital Forensics”.

Interpreted Data                                       Offset in Trailer   Length (bytes)   Data Type
Size of integers for offset table (bytes)              6                   1                8-bit unsigned integer
Size of collection object reference integers (bytes)   7                   1                8-bit unsigned integer
Number of objects in file                              8                   8                64-bit unsigned integer (big endian)
Beginning object index                                 16                  8                64-bit unsigned integer (big endian)
Offset location of object offset table                 24                  8                64-bit unsigned integer (big endian)

Binary plist trailer data
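
The table above maps directly onto a fixed binary layout, so it can be unpacked in one call with Python’s standard struct module. This is a sketch under stated assumptions: the function and field names are hypothetical labels, and the sample trailer is rebuilt from figures in this series (integer size 2, offset table at 649) plus values inferred from them (29 objects from a 58-byte table, 1-byte object references, and a beginning object index of 0).

```python
import struct

# Big endian: skip the 6 bytes before the first field in the table, then
# two 1-byte integers and three 8-byte integers (32 bytes in total).
TRAILER_FORMAT = ">6xBBQQQ"


def parse_trailer(trailer):
    """Unpack a 32-byte binary plist trailer into named fields."""
    if len(trailer) != struct.calcsize(TRAILER_FORMAT):
        raise ValueError("a binary plist trailer is exactly 32 bytes")
    names = ("offset_int_size", "object_ref_size", "num_objects",
             "top_object", "offset_table_offset")
    return dict(zip(names, struct.unpack(TRAILER_FORMAT, trailer)))


# Sample trailer rebuilt from the walk-through values (partly inferred).
sample = (bytes(6) + bytes([2, 1])
          + (29).to_bytes(8, "big")
          + (0).to_bytes(8, "big")
          + (649).to_bytes(8, "big"))

fields = parse_trailer(sample)
table_length = fields["offset_int_size"] * fields["num_objects"]
print(fields["offset_table_offset"], table_length)  # 649 58
```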

Now we will begin figuring out the parts of the trailer to read the rest of the file. I recommend recording the values on a sheet or in a file for easy reference.

  • Read the offset to the offset table. Our table above tells us that the location of the object offset table is stored at offset 24 in our trailer and runs for a length of eight bytes. Using the trailer that we copied out of the binary plist file (again, this is supplied), we can see that the eight bytes starting at offset 24 hold the value 0x0289 (the leading bytes are zero). This is decimal 649.
Offset to Offset Table


  • Calculate the length of the offset table. The length of the table is obtained by multiplying the “Size of integers for offset table” value, located at offset six of the trailer, by the number of objects in the file, located at offset eight of the trailer and running for eight bytes. In our file the integer size is 2 bytes, so the 58-byte table used in the next step corresponds to 29 objects (2 × 29 = 58).
Length of the Offset Table


  • Find the offset table and block it off. Go to offset 649 (0x0289) and sweep from that offset for 58 bytes. Then copy those bytes into a separate hex file for reading.
Location of Offset Table


Offset table


Our next step entails reading the offset table to find the location of our objects or data. We know that the offset table is a zero-based index of the objects in the file (i.e., the first object is the 0th entry in the offset table), and we know the size of each offset (encoded big endian) from the value at offset six of the trailer (0x02, or two bytes). Now we can look at the offset table and find the location of the first object in the object table. This will occur immediately after the file header (“bplist00”).

We see from the image below that this is indeed the case, as the offset table indicates that the first object occurs at 0x0008 (decimal 8).

Size of integers, location of first object and first object data type
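
Reading the offset table amounts to slicing it into fixed-size, big-endian integers. The sketch below does that for a three-entry excerpt; the function name is a hypothetical label, and the excerpt holds only the entries quoted in this series (0x0008, 0x0011 and 0x0023) rather than the full 58-byte table.

```python
def read_offset_table(table, offset_size, num_entries):
    """Return the object offsets stored in an offset table as a list of ints."""
    offsets = []
    for index in range(num_entries):
        start = index * offset_size
        entry = table[start:start + offset_size]
        offsets.append(int.from_bytes(entry, "big"))  # entries are big endian
    return offsets


# First three 2-byte entries quoted in this series: 0x0008, 0x0011, 0x0023.
excerpt = bytes.fromhex("000800110023")
print(read_offset_table(excerpt, 2, 3))  # [8, 17, 35]
```

The list index is the object number, so entry 0 (offset 8) is the first object, sitting immediately after the eight-byte header.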


The offset table will be read again and again as we go through the objects of the file. Now we must turn our attention to interpreting the objects that are found at each offset that is specified in the offset table.

We have just found our first object at offset 8 in the bplist. The first byte of the object is known as a type-descriptor byte (Caithness p 5) and will hold the clue on how to read and interpret the object.

Reading and interpreting this first object will start us off on the next installment of our Plist, XML and XPATH Series. Until then, I hope that this series is proving informative in your forensic endeavors. I look forward to seeing you next week.


 

Greetings from Veenendaal NL! While most of my colleagues from the Amsterdam Police are enjoying SinterKlaas, I thought I would post the next installment in my series on Plists, XML and XPATH.

In this installment we continue to break open the reverse engineering of Alex Caithness’ paper “Property Lists in Digital Forensics”. In our last installment we ended just before looking at the type descriptor byte of our first object.


Data Type Descriptors

We see that the first byte of our first object is 0xD4. Converting this to binary, we get the value 1101 0100, which our table tells us is a dictionary. Remember that a dictionary is a collection of key-value pairs. Our table tells us that the second nibble of our byte (4) reveals the number of key-value pairs that are present in the dictionary. However, since they are pairs, we have to double that number to count both the keys and the values. The total number of object references in this dictionary is therefore 8. Looking at our bplist file, we see that this is indeed true.

Dictionary collection object references
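
Splitting a type-descriptor byte into its two nibbles is a one-line bit operation. The sketch below is illustrative only: the function name is hypothetical, and the type mapping is written from the publicly documented binary plist format rather than copied from the table image in the original post.

```python
# High-nibble values of the type-descriptor byte (assumed from the
# publicly documented bplist format, not from the original table image).
OBJECT_TYPES = {
    0x0: "null/boolean/fill",
    0x1: "integer",
    0x2: "real",
    0x3: "date",
    0x4: "data",
    0x5: "ASCII string",
    0x6: "UTF-16 string",
    0x8: "UID",
    0xA: "array",
    0xC: "set",
    0xD: "dictionary",
}


def describe(marker):
    """Split a type-descriptor byte into (type name, low-nibble value)."""
    high, low = marker >> 4, marker & 0x0F
    return OBJECT_TYPES.get(high, "unknown"), low


kind, pairs = describe(0xD4)
print(format(0xD4, "08b"), kind, pairs)  # 11010100 dictionary 4
print(pairs * 2)                         # 8 object references follow
```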


Since the dictionary is the 0th entry in the offset table, we see that the first object reference after the dictionary marker is 0x01. This refers to an index in the offset table: since the dictionary was found from index 0, its first referenced object is found at index 1. The value at that position is 0x0011, or decimal 17.

Offset to first object reference of dictionary


Going to offset 17, we see that we have 0x5F, which converted to binary is 0101 1111. Our table indicates that the left nibble (0101) marks a string, and the right nibble of the byte, “F”, tells us that an integer object follows to give us the length of the string. The first byte of that object is 0x10, which is 0001 0000, the data type for an integer. Since 2^0 = 1 (remember that the length of this data type is 2^nnnn bytes), the length of the string will be read from the single next byte: 0x0F, or 15. Sweeping fifteen bytes after this byte, we see that we have the string “WebBookmarkType”.

ASCII representation of first object reference
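
The string-reading steps above can be sketched in a few lines of Python. The function name is a hypothetical label, and the sample bytes are rebuilt from the values quoted in the walk-through (0x5F, 0x10, 0x0F, then the fifteen ASCII characters) rather than read from the actual file.

```python
def read_ascii_string(data, offset):
    """Decode the ASCII string object that starts at the given offset."""
    marker = data[offset]
    if marker >> 4 != 0x5:
        raise ValueError("not an ASCII string object")
    length = marker & 0x0F
    pos = offset + 1
    if length == 0x0F:
        # A low nibble of F means an integer object follows with the length.
        int_marker = data[pos]
        if int_marker >> 4 != 0x1:
            raise ValueError("expected an integer object for the length")
        int_size = 2 ** (int_marker & 0x0F)   # 2^nnnn bytes
        length = int.from_bytes(data[pos + 1:pos + 1 + int_size], "big")
        pos += 1 + int_size
    return data[pos:pos + length].decode("ascii")


# Bytes as described for offset 17 of the bplist.
obj = bytes([0x5F, 0x10, 0x0F]) + b"WebBookmarkType"
print(read_ascii_string(obj, 0))  # WebBookmarkType
```

Strings shorter than fifteen characters carry their length directly in the low nibble, so no integer object follows the marker.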


Let’s verify our findings another way. Let’s look at the binary plist decoded into XML to see if our work with the hex is correct. Here we see that the first object is indeed a dictionary and that the first object of the dictionary is a key called “WebBookmarkType”. So far so good!

<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">
  <dict>  -> Our first object at the 0th position of the offset table.
    <key>WebBookmarkType</key>  -> The first object of the dictionary collection. This was found at the first 
                                   position of the offset table as indicated by the dictionary object 
                                   reference.
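
One way to produce this kind of XML view for cross-checking is Python’s standard plistlib module, which reads binary plists and can write them back out as XML. The sketch below builds a stand-in plist in memory, because the bookmarks.plist file is not reproduced here; the UUID value is a placeholder.

```python
import plistlib

# Stand-in for the article's file: a small binary plist built in memory.
original = {
    "WebBookmarkType": "WebBookmarkTypeList",
    "WebBookmarkUUID": "placeholder",
}
binary = plistlib.dumps(original, fmt=plistlib.FMT_BINARY)

decoded = plistlib.loads(binary)   # the binary format is detected automatically
xml_text = plistlib.dumps(decoded, fmt=plistlib.FMT_XML).decode("utf-8")

print(xml_text)
```

With a real file, the bytes read from disk would take the place of the binary variable.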

Moving on, the next dictionary object reference points to the second index of the offset table. The value here is 0x0023, which converts to decimal 35. We find another string marker (0x5F) at this offset. Reading the next two bytes (0x10 for the integer byte and 0x0F for the length, again 15), we can sweep for our string value, which in this instance is “WebBookmarkUUID”.

Second object reference


ASCII Representation of second object reference


Let’s again check our converted bplist to see if we got it right.

<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">
  <dict>
    <key>WebBookmarkType</key>  
    <string>WebBookmarkTypeList</string>
    <key>WebBookmarkUUID</key>

 

Hey wait a second, <string>WebBookmarkTypeList</string> follows “WebBookmarkType”!

No, you haven’t parsed the hex incorrectly. The “values” of the key-value pairs follow in the hex after all the keys have been identified, in the same order as the keys appear in the dict object references. Don’t believe me? OK, you Philistines, check out the fifth index of the offset table (remember that it’s zero based, so count to five starting at zero). Did you find 0x0057 (decimal 87)? Good. Now jump back to the bplist and find offset 87. You should see 0x5F (by now you should guess that it’s a string). It’s followed by the integer byte 0x10 and then by the length byte 0x13, which converts to decimal 19 for the length of the string in bytes. Now sweep 19 bytes. Did you find “WebBookmarkTypeList”?

Offset to fifth object reference


ASCII representation of fifth object reference


Recalling the XML conversion of the bplist, is “WebBookmarkTypeList” the string value of the key “WebBookmarkType”? You betcha it is!

<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
<plist version="1.0">
  <dict>
    <key>WebBookmarkType</key>  
    <string>WebBookmarkTypeList</string>
    <key>WebBookmarkUUID</key>

 

This pattern repeats itself for each of the key-value pairs in the dictionary until it reaches the fourth key. Remember that the value following a key can itself be another collection. This is indeed what we find as the fourth object of the topmost dictionary.
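
The keys-first, values-second layout described above can be sketched as a simple split of the dictionary’s reference list. The function name is hypothetical. References 1, 2 and 5 are the ones traced in this series; the remaining values in the sample list are assumed to run sequentially and are not confirmed by the walk-through.

```python
def pair_references(refs):
    """Split a dictionary's reference list into (key_ref, value_ref) pairs."""
    count = len(refs) // 2
    keys, values = refs[:count], refs[count:]   # all keys first, then all values
    return list(zip(keys, values))


# 0xD4 announces four pairs, so eight object references follow the marker.
# Only references 1, 2 and 5 are traced in the text; the rest are assumed.
refs = [1, 2, 3, 4, 5, 6, 7, 8]
print(pair_references(refs))  # [(1, 5), (2, 6), (3, 7), (4, 8)]
```

The first pair, (1, 5), matches what was found by hand: the key at offset-table index 1 (“WebBookmarkType”) takes the value at index 5 (“WebBookmarkTypeList”).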

Our examination of the fourth object of the topmost dictionary will start us off on the next installment of our series. Until then, I wish you all the best in your forensic endeavors and a very good SinterKlaas!

