Document: Google Hacking for Penetration Testers - part 21


452_Google_2e_05.qxd 10/5/07 12:46 PM Page 201 Google’s Part in an Information Collection Framework • Chapter 5

Figure 5.20 The LinkedIn Profile of the Author of a Government Document

Can this process of grabbing documents and analyzing them be automated? Of course! As a start we can build a scraper that will find the URLs of Office documents (.doc, .ppt, .xls, .pps). We then need to download each document and push it through the meta information parser. Finally, we can extract the interesting bits and do some post processing on them. We already have a scraper (see the previous section), so we just need something that will extract the meta information from a file. Thomas Springer at ServerSniff.net was kind enough to provide me with the source of his document information script. After some slight changes it looks like this:

    #!/usr/bin/perl
    # File-analyzer 0.1, 07/08/2007, thomas springer
    # stripped-down version
    # slightly modified by roelof temmingh @ paterva.com
    # this code is public domain - use at own risk
    # this code is using phil harveys ExifTool - THANK YOU, PHIL!!!!
    # http://www.ebv4linux.de/images/articles/Phil1.jpg
    use strict;
    use Image::ExifTool;

    # passed parameter is a URL
    my ($url)=@ARGV;

    # get file and make a nice filename
    my $file=get_page($url);
    my $time=time;
    my $frand=rand(10000);
    my $fname="/tmp/".$time.$frand;

    # write stuff to a file
    open(FL, ">$fname") or die "cannot write $fname: $!";
    print FL $file;
    close(FL);

    # Get EXIF-INFO
    my $exifTool=new Image::ExifTool;
    $exifTool->Options(FastScan => '1');
    $exifTool->Options(Binary => '1');
    $exifTool->Options(Unknown => '2');
    $exifTool->Options(IgnoreMinorErrors => '1');
    my $info = $exifTool->ImageInfo($fname); # feed standard info into a hash

    # delete tempfile
    unlink ("$fname");

    my @names;
    print "Author:".$$info{"Author"}."\n";
    print "LastSaved:".$$info{"LastSavedBy"}."\n";
    print "Creator:".$$info{"creator"}."\n";
    print "Company:".$$info{"Company"}."\n";
    print "Email:".$$info{"AuthorEmail"}."\n";
    exit; #comment to see more fields

    # dump every field ExifTool found (only reached when the exit above is commented out)
    foreach (keys %$info){
        print "$_ = $$info{$_}\n";
    }

    sub get_page{
        my ($url)=@_;
        #use curl to get it - you might want change this
        # 25 second timeout - also modify as you see fit
        my $res=`curl -s -m 25 $url`;
        return $res;
    }

Save this script as docinfo.pl. You will notice that you’ll need some Perl libraries to use this, specifically the Image::ExifTool library, which is used to get the metadata from the files. The script uses curl to download the pages from the server, so you’ll need that as well. Curl is set to a 25-second timeout. On a slow link you might want to increase that.
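The download step can also be done without shelling out to curl. The following is a minimal sketch of that step, written in Python rather than the chapter's Perl and not taken from the original script. The function name fetch_to_tempfile and the docinfo_ file prefix are hypothetical. It keeps the same 25-second timeout and lets the tempfile module choose a unique file name instead of building one from the time and a random number.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch_to_tempfile(url: str, timeout: float = 25.0) -> Path:
    """Download url and save the body in a uniquely named temporary file.

    The 25-second default mirrors the "-m 25" option passed to curl in
    docinfo.pl; raise it on a slow link.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    # delete=False keeps the file on disk so a metadata parser can open it;
    # the caller is responsible for removing it afterwards.
    with tempfile.NamedTemporaryFile(prefix="docinfo_", delete=False) as handle:
        handle.write(body)
        return Path(handle.name)
```

A caller would pass the returned path to whatever metadata parser is in use and then delete the file, as docinfo.pl does with unlink.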
Let’s see how this script works:

    $ perl docinfo.pl http://www.elsevier.com/framework_support/permreq.doc
    Author:Catherine Nielsen
    LastSaved:Administrator
    Creator:
    Company:Elsevier Science
    Email:

The script looks for five fields in a document: Author, LastSavedBy, Creator, Company, and AuthorEmail. There are many other fields that might be of interest (like the software used to create the document). On its own this script is only mildly interesting, but it becomes powerful when combined with a scraper and some post processing of the results. Let’s modify the existing scraper a bit to look like this:

    #!/usr/bin/perl
    use strict;

    my ($domain,$num)=@ARGV;
    my @types=("doc","xls","ppt","pps");
    my $result;

    # one search per file type, restricted to the target domain
    foreach my $type (@types){
        $result=`curl -s -A moo "http://www.google.com/search?q=filetype:$type+site:$domain&hl=en&num=$num&filter=0"`;
        parse($result);
    }

    sub parse {
        ($result)=@_;
        my $start=0;
        my $end;
        # NOTE: the HTML marker strings in this listing ($token and the
        # delimiters passed to cutter()) are missing from this copy of the text,
        # as is the cutter() call that fills $heading
        my $token="";
        my $count=1;
        while (1){
            # each result sits between two consecutive occurrences of $token
            $start=index($result,$token,$start);
            $end=index($result,$token,$start+1);
            if ($start == -1 || $end == -1 || $start == $end){
                last;
            }
            my $snippet=substr($result,$start,$end-$start);
            my $pos=0;
            my ($url,$heading,$summary);
            ($pos,$url) = cutter("","",$pos,$snippet);
            ($pos,$summary) = cutter("","",$pos,$snippet);

            # remove <b> and </b>
            $heading=cleanB($heading);
            $url=cleanB($url);
            $summary=cleanB($summary);

            print $url."\n";
            $start=$end;
            $count++;
        }
    }

    # return the text between $starttok and $endtok, searching from $where,
    # together with the position where the match ended
    sub cutter{
        my ($starttok,$endtok,$where,$str)=@_;
        my $startcut=index($str,$starttok,$where)+length($starttok);
        my $endcut=index($str,$endtok,$startcut+1);
        my $returner=substr($str,$startcut,$endcut-$startcut);
        my @res;
        push @res,$endcut;
        push @res,$returner;
        return @res;
    }

    sub cleanB{
        my ($str)=@_;
        $str=~s/<b>//g;
        $str=~s/<\/b>//g;
        return $str;
    }

Save this script as scraper.pl. The scraper takes a domain and a number as parameters. The number is the number of results to return, but multiple page support is not included in the code. However, it’s child’s play to modify the script to scrape multiple pages from Google. Note that the scraper has been modified to look for some common Microsoft Office formats and will loop through them with a site:domain_parameter filetype:XX search term.
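The way the scraper builds one query per file type can be sketched as follows, in Python rather than the chapter's Perl. This is an illustration under assumptions, not the author's code. The function name build_search_urls is hypothetical, and the start offset used for extra pages is an assumed parameter that the original script does not use. The other parameters (q, hl, num, filter) are the ones in the curl command above.

```python
from urllib.parse import urlencode

# The four Office formats the scraper loops over.
OFFICE_TYPES = ("doc", "xls", "ppt", "pps")


def build_search_urls(domain, num, pages=1):
    """Return one search URL per file type (and per page) for a domain."""
    urls = []
    for filetype in OFFICE_TYPES:
        for page in range(pages):
            params = {
                "q": f"filetype:{filetype} site:{domain}",
                "hl": "en",
                "num": num,
                "filter": 0,
            }
            if page:
                # Assumption: later pages are requested with a result offset.
                params["start"] = page * num
            urls.append("http://www.google.com/search?" + urlencode(params))
    return urls
```

Letting urlencode escape the query means a domain or file type containing unusual characters cannot break the URL, which the string interpolation in the Perl version does not guard against.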
Now all that is needed is something that will put everything together and do some post processing on the results. The code could look like this:

    #!/usr/bin/perl
    use strict;

    my ($domain,$num)=@ARGV;
    my %ALLEMAIL=();
    my %ALLNAMES=();
    my %ALLUNAME=();
    my %ALLCOMP=();

    my $scraper="scraper.pl";
    my $docinfo="docinfo.pl";

    print "Scraping...please wait...\n";
    my @all_urls=`perl $scraper $domain $num`;
    if ($#all_urls == -1 ){
        print "Sorry - no results!\n";
        exit;
    }

    my $count=0;
    foreach my $url (@all_urls){
        print "$count / $#all_urls : Fetching $url";
        my @meta=`perl $docinfo $url`;
        foreach my $item (@meta){
            process($item);
        }
        $count++;
    }

    #show results
    print "\nEmails:\n-------------\n";
    foreach my $item (keys %ALLEMAIL){
        print "$ALLEMAIL{$item}:\t$item";
    }
    print "\nNames (Person):\n-------------\n";
    foreach my $item (keys %ALLNAMES){
        print "$ALLNAMES{$item}:\t$item";
    }
    print "\nUsernames:\n-------------\n";
    foreach my $item (keys %ALLUNAME){
        print "$ALLUNAME{$item}:\t$item";
    }
    print "\nCompanies:\n-------------\n";
    foreach my $item (keys %ALLCOMP){
        print "$ALLCOMP{$item}:\t$item";
    }

    # sort one "Field:value" line from docinfo.pl into the matching hash
    sub process {
        my ($passed)=@_;
        # limit of 2 keeps values that contain a colon intact
        my ($type,$value)=split(/:/,$passed,2);
        $value=~tr/A-Z/a-z/;
        # the value still carries its newline, so length 1 means empty
        if (length($value)<=1) {return;}
        if ($value =~ /[a-zA-Z0-9]/){
            if ($type eq "Company"){$ALLCOMP{$value}++;}
            else {
                if (index($value,"\@")>2){$ALLEMAIL{$value}++; }
                elsif (index($value," ")>0){$ALLNAMES{$value}++; }
                else{$ALLUNAME{$value}++; }
            }
        }
    }

This script first runs scraper.pl with the domain and the number of results that were passed to it as parameters.
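The rules in the process routine above are simple enough to restate on their own. The sketch below does so in Python rather than the chapter's Perl. It is an illustration of the same checks, not a replacement for the script, and the names classify and tally are hypothetical.

```python
import re
from collections import Counter


def classify(field, value):
    """Return 'company', 'email', 'name', 'username', or None for a value."""
    value = value.strip().lower()
    # Skip empty values and values with no letter or digit at all.
    if not value or not re.search(r"[a-z0-9]", value):
        return None
    if field == "Company":
        return "company"
    # An "@" with at least three characters in front of it: e-mail address.
    if value.find("@") > 2:
        return "email"
    # A space inside the value: treat it as a person's name.
    if value.find(" ") > 0:
        return "name"
    return "username"


def tally(lines):
    """Count how often each value occurs, grouped by its classification."""
    buckets = {kind: Counter() for kind in ("company", "email", "name", "username")}
    for line in lines:
        field, _, value = line.partition(":")
        kind = classify(field, value)
        if kind is not None:
            buckets[kind][value.strip().lower()] += 1
    return buckets
```

Fed the docinfo.pl output shown earlier, tally would count "catherine nielsen" as a name, "administrator" as a user name, and "elsevier science" as a company, and would ignore the empty Creator and Email fields.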
The combined script captures the output of scraper.pl (a list of URLs) in an array, and then runs the docinfo.pl script against every URL. The output of docinfo.pl is then sent for further processing, where some basic checks decide whether each value is a company name, an e-mail address, a user name, or a person’s name. These are stored in separate hash tables for later use. When everything is done, the script displays each collected piece of information and the number of times it occurred across all pages. Does it actually work? Have a look:

    # perl combined.pl xxx.gov 10
    Scraping...please wait...
    0 / 35 : Fetching http://www.xxx.gov/8878main_C_PDP03.DOC
    1 / 35 : Fetching http://***.xxx.gov/1329NEW.doc
    2 / 35 : Fetching http://***.xxx.gov/LP_Evaluation.doc
    3 / 35 : Fetching http://*******.xxx.gov/305.doc
    ...

    Emails:
    -------------
    1:      ***zgpt@***.ksc.xxx.gov
    1:      ***[email protected]
    1:      ***ald.l.***[email protected]
    1:      ****ie.king@****.xxx.gov

    Names (Person):
    -------------
    1:      audrey sch***
    1:      corina mo****
    1:      frank ma****
    2:      eileen wa****
    2:      saic-odin-**** hq
    1:      chris wil****
    1:      nand lal****
    1:      susan ho****
    2:      john jaa****
    1:      dr. paul a. cu****
    1:      *** project/code 470
    1:      bill mah****
    1:      goddard, pwdo - bernadette fo****
    1:      joanne wo****
    2:      tom naro****
    1:      lucero ja****
    1:      jenny rumb****
    1:      blade ru****
    1:      lmit odi****
    2:      **** odin/osf seat
    1:      scott w. mci****
    2:      philip t. me****
    1:      annie ki****

    Usernames:
    -------------
    1:      cgro****
    1:      ****
    1:      gidel****
    1:      rdcho****
    1:      fbuchan****
    2:      sst****
    1:      rbene****
    1:      rpan****
    2:      l.j.klau****
    1:      gane****h
    1:      amh****
    1:      caroles****
    2:      mic****e
    1:      baltn****r
    3:      pcu****
    1:      md****
    1:      ****wxpadmin
    1:      mabis****
    1:      ebo****
    2:      grid****
    1:      bkst****
    1:      ***(at&l)

    Companies:
    -------------
    1:      shadow conservatory
    [SNIP]

The list of companies has been chopped way down to protect the identity of the government agency in question, but the script seems to work well. The script can easily be modified to scrape many more results (across many pages), extract more fields, and get other file types. By the way, what the heck is the one unedited company known as the “Shadow Conservatory?”

Figure 5.21 Zero Results for “Shadow Conservatory”

The tool also works well for finding out what user name format is used (and whether one is used at all). Consider the list of user names mined from ... somewhere:

    Usernames:
    -------------
    1:      79241234
    1:      78610276
    1:      98229941
    1:      86232477
    2:      82733791
    2:      02000537
    1:      79704862
    1:      73641355
    2:      85700136

From the list it is clear that an eight-digit number is used as the user name. This information might be very useful in later stages of an attack.

Taking It One Step Further

Sometimes you end up in a situation where you want to hook the output of one search as the input for another process. This process might be another search, or it might be something like looking up an e-mail address on a social network, converting a DNS name to a domain, resolving a DNS name, or verifying the existence of an e-mail account. How do I link two e-mail addresses together?
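At its core, linking two addresses means finding data points that searches for both of them have in common. A minimal sketch of that comparison follows, in Python rather than the chapter's Perl. The function name common_nodes is hypothetical and every value in the sample data is made up for illustration. It assumes the sites, e-mail addresses, and phone numbers found for each address have already been collected into sets.

```python
from typing import Dict, Set

Results = Dict[str, Set[str]]


def common_nodes(first: Results, second: Results) -> Results:
    """Return the values, grouped by kind, that appear in both result sets."""
    shared: Results = {}
    for kind in first.keys() & second.keys():
        overlap = first[kind] & second[kind]
        if overlap:
            shared[kind] = overlap
    return shared


# Made-up search results for two hypothetical e-mail addresses.
results_a = {
    "sites": {"forum.example.org", "blog.example.net"},
    "emails": {"carol@example.com"},
    "phones": set(),
}
results_b = {
    "sites": {"forum.example.org", "wiki.example.com"},
    "emails": {"dave@example.com"},
    "phones": set(),
}

print(common_nodes(results_a, results_b))
```

An empty result would be the cue to widen the search by repeating the extraction for every item already found and comparing again.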
Consider Johnny’s e-mail address [email protected] and my previous e-mail address at SensePost [email protected]. To link these two addresses together we can start by searching for one of the e-mail addresses and extracting sites, e-mail addresses, and phone numbers. Once we have these results we can do the same for the other e-mail address and then compare the two sets to see if there are any common results (or nodes). In this case there are common nodes (see Figure 5.22).

Figure 5.22 Relating Two E-mail Addresses from Common Data Sources

If there are no matches, we can loop through all of the results of the first e-mail address, again extracting e-mail addresses, sites, and telephone numbers, and then repeat it for the second address in the hope that there are common nodes. What about more complex sequences that involve more than searching? Can you get locations of the Pentagon data centers by simply looking at public information? Consider Figure 5.23. What’s happening here? While it looks seriously complex, it really isn’t. The procedure to get to the locations shown in this figure is as follows:
