type cache:your.website.here in a google search
note the return URL in the browser - save this (must have the IP)
#!/bin/bash
set -o errexit
stamp=`date`
touch temp.txt
lynx -dump -accept_all_cookies "cached.url.here" | grep 'retrieved' | cut -c 4-50>>temp.txt
cache=`cat
echo $stamp Google $cache>>/path/to/desired/dir/file.txt
Check out the documentation on the cut command which gets the info from grep, this truncates the number of characters passed to the temp.txt, adjust to what you need to get the desired result.
this should give you a return result like this:
Fri Aug 31 14:18:05 CDT 2007 Google retrieved on Aug 30, 2007 13:49:11 GMT.
Tue Sep 4 09:10:20 CDT 2007 Google retrieved on Aug 31, 2007 14:35:14 GMT.
Wed Sep 5 07:51:55 CDT 2007 Google retrieved on Sep 2, 2007 15:52:02 GMT.
Thu Sep 6 13:01:19 CDT 2007 Google retrieved on Sep 4, 2007 22:35:39 GMT.
Fri Sep 7 07:00:00 CDT 2007 Google retrieved on Sep 5, 2007 13:25:22 GMT.
Sat Sep 8 07:00:00 CDT 2007 Google retrieved on Sep 6, 2007 13:28:59 GMT.
Sun Sep 9 07:00:00 CDT 2007 Google retrieved on Sep 8, 2007 08:19:05 GMT.
Mon Sep 10 07:00:00 CDT 2007 Google retrieved on Sep 8, 2007 08:19:05 GMT.
Tue Sep 11 07:00:00 CDT 2007 Google retrieved on Sep 10, 2007 08:54:21 GMT.
Wed Sep 12 07:00:00 CDT 2007 Google retrieved on Sep 10, 2007 23:52:44 GMT.
a quick a dirty log of when Google Crawled my site, I then just threw this to crontab to run every morning at 4am, and my browser is set to open this link upon activation.
Enjoy.
No comments:
Post a Comment