Authors: TJ O'Connor
Congratulations again! We have written quite a few tools in this chapter to investigate digital artifacts. Either by investigating the Windows Registry, the Recycle Bin, artifacts left inside metadata, or application-stored databases, we have added quite a few useful tools to our arsenal. Hopefully, you will be able to build upon each of the examples in this chapter to answer questions in your own future investigations.
Geo-Locate Internet Protocol (IP) Traffic
Discover Malicious DDoS Toolkits
Uncover Decoy Network Scans
Analyze Storm’s Fast-Flux and Conficker’s Domain Flux
Understand the TCP Sequence Prediction Attack
Foil Intrusion Detection Systems With Crafted Packets
Rather than being confined to a separate dimension, martial arts should be an extension of our way of living, of our philosophies, of the way we educate our children, of the job we devote so much of our time to, of the relationships we cultivate, and of the choices we make every day.
—Daniele Bolelli, Author, Fourth-Degree Black Belt in Kung Fu San Soo
On January 14, 2010, the United States learned of a coordinated, sophisticated, and prolonged computer attack that targeted Google, Adobe, and over 30 Fortune 100 companies (Binde, McRee, & O’Connor, 2011). Dubbed Operation Aurora after a folder found on an infected machine, the attack used a novel exploit not previously seen in the wild. Although Microsoft knew of the vulnerability exploited in the attack, it wrongly assumed that nobody else did, and therefore no mechanisms existed to detect such an attack.
The attackers began by sending their victims an email with a link to a Taiwanese website hosting malicious JavaScript (Binde, McRee, & O’Connor, 2011). When users clicked the link, they downloaded a piece of malware that connected back to a command-and-control server located in China (Zetter, 2010). From there, the attackers used their newly gained access to hunt for proprietary information stored on the exploited victims’ systems.
As obvious as this attack appears, it went undetected for several months and succeeded in penetrating the source code repositories of several Fortune 100 companies. Even a rudimentary piece of network visualization software could have identified this behavior. Why would a US-based Fortune 100 company have several users connected to a specific website in Taiwan and then again to a specific server located in China? A visual map that showed users connecting to both Taiwan and China with significant frequency could have allowed network administrators to investigate the attack sooner and stop it before the proprietary information was lost.
In the following sections, we will examine using Python to analyze different attacks in order to quickly parse through enormous volumes of disparate data points. Let’s begin the investigation by building a script to visually analyze network traffic, something the administrators at the victimized Fortune 100 companies could have used during Operation Aurora.
To begin, we must learn how to correlate an Internet Protocol (IP) address to a physical location. To do this, we will rely on a freely available database from MaxMind, Inc. While MaxMind offers several more precise commercial products, its open-source GeoLiteCity database, available at http://www.maxmind.com/app/geolitecity, offers us enough fidelity to correlate IP addresses to cities. Once the database has been downloaded, we need to decompress it and move it to a location such as /opt/GeoIP/Geo.dat.
analyst# wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
--2012-03-17 09:02:20-- http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz
Resolving geolite.maxmind.com… 174.36.207.186
Connecting to geolite.maxmind.com|174.36.207.186|:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 9866567 (9.4M) [text/plain]
Saving to: ‘GeoLiteCity.dat.gz’
100%[======================================================================================================================================================>] 9,866,567 724K/s in 15s
2012-03-17 09:02:36 (664 KB/s) – ‘GeoLiteCity.dat.gz’ saved [9866567/9866567]
analyst# gunzip GeoLiteCity.dat.gz
analyst# mkdir /opt/GeoIP
analyst# mv GeoLiteCity.dat /opt/GeoIP/Geo.dat
With the GeoLiteCity database, we can correlate an IP address to a state, postal code, country name, and general latitude and longitude coordinates. All of this will prove useful in analyzing IP traffic.
Jennifer Ennis produced a pure Python library to query the GeoLiteCity database. Her library can be downloaded from http://code.google.com/p/pygeoip/ and must be installed before we can import it into a Python script. We first instantiate a GeoIP class with the location of our uncompressed database. Next, we query the database for a specific record, specifying the IP address. This returns a record containing fields for city, region_name, postal_code, country_name, latitude, and longitude, among other identifying information.
import pygeoip

# Load the uncompressed GeoLiteCity database once at start-up.
gi = pygeoip.GeoIP('/opt/GeoIP/Geo.dat')


def printRecord(tgt):
    # Look up the record for the target address.
    rec = gi.record_by_name(tgt)
    city = rec['city']
    region = rec['region_name']
    country = rec['country_name']
    lon = rec['longitude']
    lat = rec['latitude']
    print('[*] Target: ' + tgt + ' Geo-located.')
    print('[+] ' + str(city) + ', ' + str(region) + ', ' + str(country))
    print('[+] Latitude: ' + str(lat) + ', Longitude: ' + str(lon))


tgt = '173.255.226.98'
printRecord(tgt)
Running the script, we see that it produces output showing the target IP’s physical location in Jersey City, NJ, US, with latitude 40.7245 and longitude -74.0621. Now that we are able to correlate an IP address to a physical location, let’s begin writing our analysis script.
analyst# python printGeo.py
[*] Target: 173.255.226.98 Geo-located.
[+] Jersey City, NJ, United States
[+] Latitude: 40.7245, Longitude: -74.0621
In the following chapter, we will primarily use the Scapy packet manipulation toolkit to analyze and craft packets. In this section, we will use a separate toolkit, dpkt, to analyze packets. While Scapy offers tremendous capabilities, novice users often find the directions for installing it on Mac OS X and Windows extremely complicated. In contrast, dpkt is fairly simple: it can be downloaded from http://code.google.com/p/dpkt/ and installed easily. Both offer similar capabilities, but it always proves useful to keep an arsenal of similar tools. After Dug Song initially created dpkt, Jon Oberheide added many capabilities to parse different protocols, such as FTP, H.225, SCTP, BGP, and IPv6.
For this example, let’s assume we recorded a pcap network capture that we would like to analyze. Dpkt allows us to iterate through each individual packet in the capture and examine each protocol layer of the packet. Although we simply read a pre-captured pcap file in this example, we could just as easily analyze live traffic by using pypcap, available at http://code.google.com/p/pypcap/. To read a pcap file, we open the file, create a dpkt.pcap.Reader object, and then pass that object to our function printPcap(). The pcap object yields a sequence of records, each holding a (timestamp, packet) pair. We can then break each packet down into its Ethernet and IP layers. Notice the lazy use of exception handling here: because we may capture layer-2 frames that do not contain an IP layer, reading the IP fields can throw an exception. In that case, we catch the exception and continue on to the next packet. We use the socket library to convert IP addresses stored in packed binary form into dotted-decimal strings. Finally, we print the source and destination to the screen for each individual packet.
import dpkt
import socket


def printPcap(pcap):
    for (ts, buf) in pcap:
        try:
            eth = dpkt.ethernet.Ethernet(buf)
            ip = eth.data
            # inet_ntoa converts the packed 4-byte address to a dotted string.
            src = socket.inet_ntoa(ip.src)
            dst = socket.inet_ntoa(ip.dst)
            print('[+] Src: ' + src + ' --> Dst: ' + dst)
        except Exception:
            # Skip frames that carry no IP layer.
            pass


def main():
    # pcap files are binary, so open in 'rb' mode.
    f = open('geotest.pcap', 'rb')
    pcap = dpkt.pcap.Reader(f)
    printPcap(pcap)


if __name__ == '__main__':
    main()
Running the script, we see the source IP and destination IP address printed to the screen. While this provides us some level of analysis, let’s now correlate this to physical locations using our previous geo-location script.
analyst# python printDirection.py
[+] Src: 110.8.88.36 --> Dst: 188.39.7.79
[+] Src: 28.38.166.8 --> Dst: 21.133.59.224
[+] Src: 153.117.22.211 --> Dst: 138.88.201.132
[+] Src: 1.103.102.104 --> Dst: 5.246.3.148
[+] Src: 166.123.95.157 --> Dst: 219.173.149.77
[+] Src: 8.155.194.116 --> Dst: 215.60.119.128
[+] Src: 133.115.139.226 --> Dst: 137.153.2.196
[+] Src: 217.30.118.1 --> Dst: 63.77.163.212
[+] Src: 57.70.59.157 --> Dst: 89.233.181.180
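To make the Ethernet and IP breakdown more concrete, the following is a minimal sketch that pulls the same two addresses out of a raw frame using only the standard library. It is not how dpkt is implemented; it assumes untagged Ethernet II frames carrying IPv4, and the function name ip_endpoints and the hand-built demo frames are purely illustrative. It is written for Python 3.

```python
import socket
import struct

ETH_HEADER_LEN = 14      # destination MAC (6) + source MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800


def ip_endpoints(frame):
    """Return (src, dst) as dotted-quad strings, or None for non-IPv4 frames."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return None                      # too short to hold an IPv4 header
    (ethertype,) = struct.unpack('!H', frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None                      # e.g. ARP: no IP layer to read
    # Source and destination addresses sit at bytes 12-19 of the IPv4 header.
    src_raw = frame[ETH_HEADER_LEN + 12:ETH_HEADER_LEN + 16]
    dst_raw = frame[ETH_HEADER_LEN + 16:ETH_HEADER_LEN + 20]
    return socket.inet_ntoa(src_raw), socket.inet_ntoa(dst_raw)


# Hand-built demo frames: zeroed MACs, then an EtherType and a payload.
ip_header = (b'\x45\x00\x00\x14' + b'\x00' * 8 +
             socket.inet_aton('110.8.88.36') + socket.inet_aton('188.39.7.79'))
demo_frame = b'\x00' * 12 + struct.pack('!H', ETHERTYPE_IPV4) + ip_header
arp_frame = b'\x00' * 12 + struct.pack('!H', 0x0806) + b'\x00' * 28

print(ip_endpoints(demo_frame))
print(ip_endpoints(arp_frame))
```

Checking the EtherType explicitly is an alternative to the catch-all exception handler: frames without an IP layer are skipped on purpose rather than by accident.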
Improving our script, let’s add a function called retGeoStr(), which returns a physical location for an IP address. For this, we will simply look up the city and three-letter country code and return them as a single string. If the lookup raises an exception, we will return a message indicating the address is unregistered. This handles addresses that are not in the GeoLiteCity database as well as private IP addresses, such as 192.168.1.3 in our case.
import dpkt, socket, pygeoip, optparse
gi = pygeoip.GeoIP('/opt/GeoIP/Geo.dat')


def retGeoStr(ip):
    try:
        rec = gi.record_by_name(ip)
        city = rec['city']
        country = rec['country_code3']
        if city != '':
            geoLoc = city + ', ' + country
        else:
            geoLoc = country
        return geoLoc
    except Exception:
        # No usable record (for example, a private address).
        return 'Unregistered'
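Private addresses can also be recognized before any database lookup is attempted. The following is a small sketch of that idea using the ipaddress module from the Python 3 standard library; the helper name is_private_address is hypothetical and is not part of the chapter's script.

```python
import ipaddress


def is_private_address(ip):
    """Return True if ip is a valid address in a private range, else False."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False                     # not a valid IP address string
    return addr.is_private


for candidate in ('192.168.1.3', '173.255.226.98', 'bogus'):
    print(candidate, is_private_address(candidate))
```

A check like this could sit in front of the lookup so that the exception handler is reserved for addresses that are genuinely missing from the database.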
Adding the retGeoStr function to our original script, we now have a pretty powerful packet analysis toolkit that allows us to see the physical destinations of our packets.
import dpkt
import socket
import pygeoip
import optparse

gi = pygeoip.GeoIP('/opt/GeoIP/Geo.dat')


def retGeoStr(ip):
    try:
        rec = gi.record_by_name(ip)
        city = rec['city']
        country = rec['country_code3']
        if city != '':
            geoLoc = city + ', ' + country
        else:
            geoLoc = country
        return geoLoc
    except Exception:
        return 'Unregistered'


def printPcap(pcap):
    for (ts, buf) in pcap:
        try:
            eth = dpkt.ethernet.Ethernet(buf)
            ip = eth.data
            src = socket.inet_ntoa(ip.src)
            dst = socket.inet_ntoa(ip.dst)
            print('[+] Src: ' + src + ' --> Dst: ' + dst)
            print('[+] Src: ' + retGeoStr(src) + ' --> Dst: '
                  + retGeoStr(dst))
        except Exception:
            # Skip frames that carry no IP layer.
            pass


def main():
    parser = optparse.OptionParser('usage %prog -p <pcap file>')
    parser.add_option('-p', dest='pcapFile', type='string',
                      help='specify pcap filename')
    (options, args) = parser.parse_args()
    if options.pcapFile is None:
        print(parser.usage)
        exit(0)
    pcapFile = options.pcapFile
    # pcap files are binary, so open in 'rb' mode.
    f = open(pcapFile, 'rb')
    pcap = dpkt.pcap.Reader(f)
    printPcap(pcap)


if __name__ == '__main__':
    main()
Running our script, we see several of our packets headed to Korea, London, Japan, and even Australia. This provides us quite a powerful analysis tool. However, Google Earth may prove a better way of visualizing this same information.
analyst# python geoPrint.py -p geotest.pcap
[+] Src: 110.8.88.36 --> Dst: 188.39.7.79
[+] Src: KOR --> Dst: London, GBR
[+] Src: 28.38.166.8 --> Dst: 21.133.59.224
[+] Src: Columbus, USA --> Dst: Columbus, USA
[+] Src: 153.117.22.211 --> Dst: 138.88.201.132
[+] Src: Wichita, USA --> Dst: Hollywood, USA
[+] Src: 1.103.102.104 --> Dst: 5.246.3.148
[+] Src: KOR --> Dst: Unregistered
[+] Src: 166.123.95.157 --> Dst: 219.173.149.77
[+] Src: Washington, USA --> Dst: Kawabe, JPN
[+] Src: 8.155.194.116 --> Dst: 215.60.119.128
[+] Src: USA --> Dst: Columbus, USA
[+] Src: 133.115.139.226 --> Dst: 137.153.2.196
[+] Src: JPN --> Dst: Tokyo, JPN
[+] Src: 217.30.118.1 --> Dst: 63.77.163.212
[+] Src: Edinburgh, GBR --> Dst: USA
[+] Src: 57.70.59.157 --> Dst: 89.233.181.180
[+] Src: Endeavour Hills, AUS --> Dst: Prague, CZE
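Location strings in this form also lend themselves to a quick tally, which helps when the question is how often traffic heads to a particular country. The following sketch counts the country codes in the destination column of the sample output above; the function name tally_countries is hypothetical, and the list is typed in by hand rather than read from a capture.

```python
from collections import Counter


def tally_countries(locations):
    """Count how often each country code appears in a list of location strings."""
    counts = Counter()
    for loc in locations:
        # 'London, GBR' -> 'GBR'; 'KOR' and 'Unregistered' pass through as-is.
        counts[loc.rsplit(', ', 1)[-1]] += 1
    return counts


# Destination strings copied from the sample output above.
destinations = ['London, GBR', 'Columbus, USA', 'Hollywood, USA', 'Unregistered',
                'Kawabe, JPN', 'Columbus, USA', 'Tokyo, JPN', 'USA', 'Prague, CZE']

for country, hits in tally_countries(destinations).most_common():
    print(country, hits)
```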
Google Earth provides a virtual globe, map, and geographical information, shown on a proprietary viewer. Although proprietary, Google Earth can easily integrate custom feeds or tracks into the globe. Creating a text file with the extension KML allows a user to integrate various place marks into Google Earth. KML files follow a specific XML structure: each place mark is a Placemark element containing a name and a Point whose coordinates element lists the longitude followed by the latitude. As we already have the IP address, latitude, and longitude for our points, this should prove easy to integrate into our existing script to produce a KML file.
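As a rough illustration of that structure, the sketch below builds a small KML document holding two place marks and parses it back to confirm it is well-formed. The helper names placemark and kml_document are hypothetical; the first point reuses the Jersey City coordinates from the earlier lookup, while the second point is made up purely for illustration. It is written for Python 3.

```python
import xml.etree.ElementTree as ET

KML_NS = 'http://www.opengis.net/kml/2.2'


def placemark(name, longitude, latitude):
    """Return the KML for one place mark; KML orders coordinates as lon,lat."""
    return ('<Placemark>\n'
            '<name>%s</name>\n'
            '<Point>\n'
            '<coordinates>%f,%f</coordinates>\n'
            '</Point>\n'
            '</Placemark>\n') % (name, longitude, latitude)


def kml_document(placemarks):
    """Wrap place mark fragments in a KML header and footer."""
    header = ('<?xml version="1.0" encoding="UTF-8"?>\n'
              '<kml xmlns="%s">\n<Document>\n') % KML_NS
    footer = '</Document>\n</kml>\n'
    return header + ''.join(placemarks) + footer


doc = kml_document([
    placemark('173.255.226.98', -74.0621, 40.7245),       # from the earlier lookup
    placemark('illustrative-point', 139.6917, 35.6895),   # made-up second point
])
print(doc)

# Parse the result back to confirm it is well-formed XML.
root = ET.fromstring(doc.encode('utf-8'))
names = [el.text for el in root.iter('{%s}name' % KML_NS)]
coords = [el.text for el in root.iter('{%s}coordinates' % KML_NS)]
print(names)
```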
Let’s build a quick function, retKML(), that takes an IP as input and returns the specific KML structure for a place mark. Notice that we are first resolving the IP address to a latitude and longitude using pygeoip; we can then build our KML for a place mark. If we encounter an exception, such as “location not found,” we return an empty string.
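def retKML(ip):
    try:
        # The lookup sits inside the try so a failed lookup also returns ''.
        rec = gi.record_by_name(ip)
        longitude = rec['longitude']
        latitude = rec['latitude']
        # KML lists coordinates as longitude,latitude.
        kml = (
            '<Placemark>\n'
            '<name>%s</name>\n'
            '<Point>\n'
            '<coordinates>%f,%f</coordinates>\n'
            '</Point>\n'
            '</Placemark>\n'
        ) % (ip, longitude, latitude)
        return kml
    except Exception:
        return ''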
Integrating the function into our original script, we now also add the specific KML header and footer required. For each packet, we produce KML place marks for the source IP and destination IP and plot them on our globe. This produces a beautiful visualization of network traffic. You may wish to use different icons for different types of traffic, distinguished by the source and destination TCP ports (for example, 80 for web or 25 for mail). Take a look at the Google KML documentation available from https://developers.google.com/kml/documentation/ and think about all the ways of expanding our script for your organization’s visualization purposes.
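One way the port-based idea might look is sketched below: each place mark points at a shared style through a styleUrl element, and the style is chosen from the TCP port. The style ids, the port-to-style mapping, the colors, and the function name styled_placemark are all assumptions made for illustration; the script in this chapter does not extract ports. It is written for Python 3.

```python
import xml.etree.ElementTree as ET

KML_NS = 'http://www.opengis.net/kml/2.2'

# Hypothetical mapping from well-known TCP port to a style id of our choosing.
PORT_STYLES = {80: 'web', 25: 'mail'}

# KML colors are written aabbggrr: opaque red for web, opaque blue for mail.
STYLE_DEFS = (
    '<Style id="web"><IconStyle><color>ff0000ff</color></IconStyle></Style>\n'
    '<Style id="mail"><IconStyle><color>ffff0000</color></IconStyle></Style>\n'
    '<Style id="other"><IconStyle><color>ffffffff</color></IconStyle></Style>\n'
)


def styled_placemark(ip, longitude, latitude, port):
    """Return a place mark whose styleUrl depends on the TCP port."""
    style = PORT_STYLES.get(port, 'other')
    return ('<Placemark>\n'
            '<name>%s</name>\n'
            '<styleUrl>#%s</styleUrl>\n'
            '<Point>\n'
            '<coordinates>%f,%f</coordinates>\n'
            '</Point>\n'
            '</Placemark>\n') % (ip, style, longitude, latitude)


doc = ('<kml xmlns="%s">\n<Document>\n' % KML_NS
       + STYLE_DEFS
       + styled_placemark('173.255.226.98', -74.0621, 40.7245, 80)
       + styled_placemark('173.255.226.98', -74.0621, 40.7245, 25)
       + styled_placemark('173.255.226.98', -74.0621, 40.7245, 443)
       + '</Document>\n</kml>\n')

# Parse the document to confirm it is well-formed and list the chosen styles.
root = ET.fromstring(doc)
urls = [el.text for el in root.iter('{%s}styleUrl' % KML_NS)]
print(urls)
```

Defining each style once in the Document and referencing it by id keeps the per-packet place marks short, which matters when a capture holds many packets.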
import dpkt
import socket
import pygeoip
import optparse

gi = pygeoip.GeoIP('/opt/GeoIP/Geo.dat')


def retKML(ip):
    try:
        # The lookup sits inside the try so a failed lookup also returns ''.
        rec = gi.record_by_name(ip)
        longitude = rec['longitude']
        latitude = rec['latitude']
        # KML lists coordinates as longitude,latitude.
        kml = (
            '<Placemark>\n'
            '<name>%s</name>\n'
            '<Point>\n'
            '<coordinates>%f,%f</coordinates>\n'
            '</Point>\n'
            '</Placemark>\n'
        ) % (ip, longitude, latitude)
        return kml
    except Exception:
        return ''


def plotIPs(pcap):
    kmlPts = ''
    for (ts, buf) in pcap:
        try:
            eth = dpkt.ethernet.Ethernet(buf)
            ip = eth.data
            src = socket.inet_ntoa(ip.src)
            srcKML = retKML(src)
            dst = socket.inet_ntoa(ip.dst)
            dstKML = retKML(dst)
            kmlPts = kmlPts + srcKML + dstKML
        except Exception:
            # Skip frames that carry no IP layer.
            pass
    return kmlPts


def main():
    parser = optparse.OptionParser('usage %prog -p <pcap file>')
    parser.add_option('-p', dest='pcapFile', type='string',
                      help='specify pcap filename')
    (options, args) = parser.parse_args()
    if options.pcapFile is None:
        print(parser.usage)
        exit(0)
    pcapFile = options.pcapFile
    # pcap files are binary, so open in 'rb' mode.
    f = open(pcapFile, 'rb')
    pcap = dpkt.pcap.Reader(f)
    kmlheader = '<?xml version="1.0" encoding="UTF-8"?>\n' \
                '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
    kmlfooter = '</Document>\n</kml>\n'
    kmldoc = kmlheader + plotIPs(pcap) + kmlfooter
    print(kmldoc)


if __name__ == '__main__':
    main()
Running our script, we redirect the output to a text file with a .kml extension. Opening this file with Google Earth, we see a visual depiction of our packet destinations. In the next section, we will use our analysis skills to detect a worldwide threat posed by the hacker group Anonymous.