Monitor changed files on Linux using find command and XML+XSLT

September 25th, 2009

Once I decided to write my own monitor for updated files on my Linux server. I chose XML files for storage, with Bash scripts and cron doing the monitoring.

The Bash script findnewfiles.sh, which generates an XML file listing the files changed during the last day, looks like this:

#!/bin/bash
echo '<?xml version="1.0" encoding="utf-8"?>'
echo "<?xml-stylesheet type='text/xsl' href='template.xsl'?>"
echo '<files>'
find /var/www/vhosts/ -mtime -1 -print | /var/www/newfiles/findfilter.pl
echo '</files>'
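
To see what `-mtime -1` actually selects, here is a minimal sketch that can be run in a throwaway directory. The helper name `list_recent_files` is hypothetical and `touch -d` assumes GNU coreutils. It also adds `-type f`, which is one possible way to skip directories inside find itself rather than in the filter.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under a directory that were
# modified less than 24 hours ago (that is what -mtime -1 means).
list_recent_files() {
    find "$1" -type f -mtime -1 -print
}

# Demo in a temporary directory (GNU touch -d is assumed).
demo_dir=$(mktemp -d)
touch "$demo_dir/new.txt"                  # modified just now
touch -d '3 days ago' "$demo_dir/old.txt"  # too old to be listed
list_recent_files "$demo_dir"              # lists only new.txt
rm -rf "$demo_dir"
```

Only the file touched just now is listed; the file backdated by three days and the directory itself are skipped.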

In addition, I used a filter, findfilter.pl, to exclude log files, directories, etc.

#!/usr/local/bin/perl -w

use strict;
use warnings;

use POSIX qw(locale_h strftime);

while (my $filename = <>) {
chomp($filename);

if (length($filename) &&
    $filename !~ m/webstat(\-ssl)?/ &&
    $filename !~ m#/statistics/webstat(\-ssl)?/# &&
    $filename !~ m#/statistics/ftpstat/# &&
    $filename !~ m#/statistics/logs# &&
    $filename !~ m#/templates_c$# &&
    $filename !~ m#/statistics/(anon_)?ftpstat# &&
    ! (-d $filename)
   ) {
   use File::stat;
   my $sb = stat($filename);
   # Element names match the ones read by template.xsl.
   print "\t",'<file>',"\n";
   print "\t\t", "<name>",$filename,"</name>\n";
   print "\t\t", "<mtime>",strftime ("%a %b %e %H:%M:%S %Y", localtime $sb->mtime),"</mtime>\n";
   print "\t\t", "<owner>",(getpwuid($sb->uid))[0],"</owner>\n";
   print "\t\t", "<group>",(getgrgid($sb->gid))[0],"</group>\n";
   print "\t\t", "<size>",$sb->size,"</size>\n";
   print "\t\t", "<mode>",sprintf("%04o",$sb->mode & 07777),"</mode>\n";

   print "\t",'</file>',"\n";
}
}

1;
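
One thing the filter does not handle is a file name containing `&`, `<` or `>`, which would make the generated XML ill-formed. A minimal sketch of the escaping step, written as a shell filter: the name `xml_escape` is hypothetical and not part of the original scripts. The same three substitutions could be applied to the file name in the Perl filter before printing.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts): escape the
# characters that are special in XML text content.
xml_escape() {
    # "&" is handled first so the "&" in "&lt;" is not escaped again.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example: a file name containing "&", "<" and ">".
printf '%s\n' '/var/www/vhosts/a&b<c>.txt' | xml_escape
# prints /var/www/vhosts/a&amp;b&lt;c&gt;.txt
```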

The Bash script startfindnewfiles.sh, which is run from cron:

#!/bin/bash
cd /var/www/newfiles/ || exit 1
dd=$(date "+%Y-%m-%d")
./findnewfiles.sh | gzip > "$dd.xml.gz"
echo "http://yourserver.com/newfiles/?date=$dd"

Note: I used gzip compression to save disk space.
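
The archive naming and compression step can be tried in isolation. This is a rough sketch under stated assumptions: the helper `archive_name` is hypothetical, and the tiny `<files>` document stands in for the real output of findnewfiles.sh.

```shell
#!/bin/sh
# Hypothetical helper: build the archive name used by startfindnewfiles.sh
# for a given date string (defaults to today's date).
archive_name() {
    printf '%s.xml.gz\n' "${1:-$(date '+%Y-%m-%d')}"
}

# Round trip in a throwaway directory: compress a tiny document,
# check the archive's integrity, then read it back.
work_dir=$(mktemp -d)
archive="$work_dir/$(archive_name)"
printf '<files>\n</files>\n' | gzip > "$archive"
gzip -t "$archive" && gzip -dc "$archive"
rm -rf "$work_dir"
```

The resulting names have the form YYYY-MM-DD.xml.gz, which is the pattern index.php looks for when it lists the available reports.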

To format the output I used an XSL template. For better usability I added the jQuery Tablesorter plugin, which lets you sort the table by clicking a column header. The XSL template source:

<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="stylesheet" type="text/css" href="themes/style.css" media="screen"/>
<link rel="stylesheet" type="text/css" href="styles.css" media="screen"/>
</head>
<body>
<div align="center"><a href="index.php">back</a></div>
<table id="myTable">
<thead>
<tr>
<th>Filename</th>
<th>Modify time</th>
<th>Owner</th>
<th>Size</th>
<th>Rights</th>
<th>Group</th>
</tr>
</thead>
<tbody>
<xsl:for-each select="files/file">
<tr>
<td><xsl:value-of select="name"/></td>
<td nowrap="nowrap"><xsl:value-of select="mtime"/></td>
<td nowrap="nowrap"><xsl:value-of select="owner"/></td>
<td nowrap="nowrap"><xsl:value-of select="size"/></td>
<td nowrap="nowrap"><xsl:value-of select="mode"/></td>
<td nowrap="nowrap"><xsl:value-of select="group"/></td>
</tr>
</xsl:for-each>
</tbody>
</table>
<div align="center"><a href="index.php">back</a></div>
<script type="text/javascript" language="javascript" src="js/jquery.js" />
<script type="text/javascript" language="javascript" src="js/jquery.tablesorter.js" />
<script type="text/javascript">
<xsl:comment>
$(document).ready(function() {
$("#myTable").tablesorter({
sortList:[[1,1],[0,0]]
});
}
);
</xsl:comment>
</script>
</body>
</html>

And finally, the index.php script code:

<?php
header('Content-type: text/html; charset=utf-8');

ob_start();

if (!isset($_GET['date']) && !isset($argv[1])) {
 echo '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Modified files</title>
<link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body><div align="center">';
 if ($handle = opendir(getcwd()))  {
 $files = array();

 while (false !== ($file = readdir($handle))) {
 if (preg_match('/^\d{4}\-\d{2}\-\d{2}\.xml\.gz$/',$file)) {
 $files[] = $file;
 }
 }
 natcasesort($files);
 $files = array_reverse($files);

 foreach($files as $file) {
 $d = preg_replace('/\.xml\.gz$/','', $file);
 printf('<a href="?date=%s">%s</a><br>', $d, dateToSovok($d));
 }
 }
echo '</div></body></html>';
} else {
 $date = isset($_GET['date']) ? $_GET['date'] : $argv[1];

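 // Accept only YYYY-MM-DD so the parameter cannot point outside this directory
 if (!preg_match('/^\d{4}-\d{2}-\d{2}$/', $date) || !is_file($date.'.xml.gz')) {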
 header('Location: index.php');
 die;
 }

 // Load the XML source
 $xml = new DOMDocument;
 $xml->loadXML(gzdecode(file_get_contents($date.'.xml.gz')));

 $xsl = new DOMDocument;
 $xsl->load('template.xsl');

 // Configure the transformer
 $proc = new XSLTProcessor;
 $proc->importStyleSheet($xsl); // attach the xsl rules

 $doc = $proc->transformToDoc($xml);
 echo $doc->saveHTML();
}

function dateToSovok($dt) {
 $pos1 = strpos($dt,'-');
 $pos2 = strrpos($dt,'-');

 $year = substr ($dt, 0, $pos1);
 $month = substr ($dt, $pos1 + 1, $pos2-$pos1-1);
 $day = substr ($dt, $pos2 + 1, strlen($dt));
 return ($day.".".$month.".".$year);
}
$content = ob_get_clean();

if(function_exists('gzencode') && ($encoding = checkCanGzip()) ) {
 header("Content-Encoding: ".$encoding);
 echo gzencode( $content . '<!-- gzencoded -->', 6 );
} else
 echo $content . '<!-- without compression -->';

/*  ------------------------------------------------------------ */

function checkCanGzip() {
 // $_SERVER is a superglobal, so no "global" declaration is needed

 if (!isset($_SERVER['HTTP_ACCEPT_ENCODING'])) return 0;
 if (strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'x-gzip') !== false) return "x-gzip";
 if (strpos($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip') !== false) return "gzip";
 return 0;
}

function gzdecode($data) {
 $len = strlen($data);
 if ($len < 18 || strcmp(substr($data,0,2),"\x1f\x8b")) {
 return null;  // Not GZIP format (See RFC 1952)
 }
 $method = ord(substr($data,2,1));  // Compression method
 $flags  = ord(substr($data,3,1));  // Flags
 if (($flags & 31) != $flags) {  // parentheses needed: != binds tighter than &
 // Reserved bits are set -- NOT ALLOWED by RFC 1952
 return null;
 }
 // NOTE: $mtime may be negative (PHP integer limitations)
 $mtime = unpack("V", substr($data,4,4));
 $mtime = $mtime[1];
 $xfl   = substr($data,8,1);
 $os    = substr($data,9,1);
 $headerlen = 10;
 $extralen  = 0;
 $extra     = "";
 if ($flags & 4) {
 // 2-byte length prefixed EXTRA data in header
 if ($len - $headerlen - 2 < 8) {
 return false;    // Invalid format
 }
 $extralen = unpack("v",substr($data,10,2));  // XLEN follows the 10-byte fixed header
 $extralen = $extralen[1];
 if ($len - $headerlen - 2 - $extralen < 8) {
 return false;    // Invalid format
 }
 $extra = substr($data,12,$extralen);
 $headerlen += 2 + $extralen;
 }

 $filenamelen = 0;
 $filename = "";
 if ($flags & 8) {
 // C-style string file NAME data in header
 if ($len - $headerlen - 1 < 8) {
 return false;    // Invalid format
 }
 $filenamelen = strpos(substr($data,$headerlen),chr(0));  // NAME starts at the current header offset
 if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
 return false;    // Invalid format
 }
 $filename = substr($data,$headerlen,$filenamelen);
 $headerlen += $filenamelen + 1;
 }

 $commentlen = 0;
 $comment = "";
 if ($flags & 16) {
 // C-style string COMMENT data in header
 if ($len - $headerlen - 1 < 8) {
 return false;    // Invalid format
 }
 $commentlen = strpos(substr($data,$headerlen),chr(0));  // COMMENT starts at the current header offset
 if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
 return false;    // Invalid header format
 }
 $comment = substr($data,$headerlen,$commentlen);
 $headerlen += $commentlen + 1;
 }

 $headercrc = "";
 if ($flags & 1) {
 // 2-bytes (lowest order) of CRC32 on header present
 if ($len - $headerlen - 2 < 8) {
 return false;    // Invalid format
 }
 $calccrc = crc32(substr($data,0,$headerlen)) & 0xffff;
 $headercrc = unpack("v", substr($data,$headerlen,2));
 $headercrc = $headercrc[1];
 if ($headercrc != $calccrc) {
 return false;    // Bad header CRC
 }
 $headerlen += 2;
 }

 // GZIP FOOTER - These may be negative due to PHP's integer limitations
 $datacrc = unpack("V",substr($data,-8,4));
 $datacrc = $datacrc[1];
 $isize = unpack("V",substr($data,-4));
 $isize = $isize[1];

 // Perform the decompression:
 $bodylen = $len-$headerlen-8;
 if ($bodylen < 1) {
 // This should never happen - IMPLEMENTATION BUG!
 return null;
 }
 $body = substr($data,$headerlen,$bodylen);
 $data = "";
 if ($bodylen > 0) {
 switch ($method) {
 case 8:
 // Currently the only supported compression method:
 $data = gzinflate($body);
 break;
 default:
 // Unknown compression method
 return false;
 }
 } else {
 // I'm not sure if zero-byte body content is allowed.
 // Allow it for now...  Do nothing...
 }

 // Verify decompressed size and CRC32:
 // NOTE: This may fail with large data sizes depending on how
 //       PHP's integer limitations affect strlen() since $isize
 //       may be negative for large sizes.
 if ($isize != strlen($data) || crc32($data) != $datacrc) {
 // Bad format!  Length or CRC doesn't match!
 return false;
 }
 return $data;
}
?>
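
The first checks in gzdecode() (the magic bytes 0x1f 0x8b and compression method 8) can be reproduced from the shell when a stored archive refuses to load. A small sketch, with the hypothetical helper name `gzip_magic`, assuming the standard `head`, `od` and `tr` utilities:

```shell
#!/bin/sh
# Hypothetical helper: print the first three bytes of a file as hex.
# In a gzip stream these are ID1, ID2 and CM (the compression method).
gzip_magic() {
    head -c 3 "$1" | od -An -tx1 | tr -d ' \n'
}

sample=$(mktemp)
printf 'hello\n' | gzip > "$sample"
gzip_magic "$sample"   # 1f8b08 for a deflate-compressed gzip file
echo
rm -f "$sample"
```

Any other value means the file is not a gzip stream that gzdecode() will accept.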

Download all source code: Monitor changed files on Linux using find command and XML+XSLT (28 KB)

  1. tesla
    September 25th, 2009 at 20:59 | #1

    Oh, man! It’s really useful thing! Thanks for u!!

  2. Polprav
    October 21st, 2009 at 16:31 | #2

    Hello from Russia!
    Can I quote a post in your blog with the link to you?

  3. alex
    November 2nd, 2009 at 13:45 | #3

    Do you want to quote my post in your blog? I don’t mind, but you need to give a complete link on my blog without any nofollow and others. Send me a link to your blog, I’ll read ;)
