A tool to analyze my Apache logs?

I wanted to view my web server log statistics. They provide much more useful information than e.g. the advanced web counter ‘Google Analytics’. I have a few thousand hourly log files from the last period that I want to scan and produce a report from.

1. urchin

I went to the ‘urchin’ of my host which displays:

image

… sigh (empty)

2. Downloading the log files

I downloaded 3698 log files from my host to my local machine:

image

3. FlashStats 2006

I downloaded FlashStats 2006, installed it, gave it a test run (looks good, just point to the dir and run) and then a REAL run. In the middle of the DNS checks it breaks, and on every further attempt to start the program it gives:

image

I tried the registry to see if there is a bool to start the thing anyway and I read the support docs, but in the end… I mailed the issue to the mail address and well… next. (I also do not want to spend $99 on my own blog analysis.)

4. AWStats

I downloaded the famous AWStats and tried to look at the docs:

image

Sigh… luckily there are docs in /docs locally.

I install it, configure the config file, and let it adjust my local httpd.conf.

1. I find out that I can point it at only one specific log file, so I would have to combine the 3698 files into one file or into some set of subfiles… yeah right
2. I cannot get http://127.0.0.1/awstats/awstats.pl?config=awstats.MY.SITE.conf to work: access denied
3. I cannot even produce the reports:

perl awstats_buildstaticpages.pl -config=AWSTATS.mysite.CONF -update -awstatsprog="c:\Program Files\AWStats\wwwroot\tools\cgi-bin\awstats.pl" -dir="c:\Program Files\AWStats\wwwroot\"

Error: Can't find AWStats program ('c:\Program Files\AWStats\wwwroot\tools\cgi-bin\awstats.pl').

StackOverflow is silent… DELETE awstats … next

5. W3Perl

I downloaded W3Perl. It says it wants ActiveState Perl, but I hope it agrees to use Strawberry Perl…

I try to follow the instructions here “open a shell (from the windows menu ‘Programs’->’Accessories’->’Command Prompt’), change directory to the W3Perl one (cd C:/Program Files/W3Perl/) and run the cron-w3perl.pl script with the -a -c config-win.pl flag to start from scratch (cron-w3perl.pl -a). If you just need to update stats, run the same script with the -e -c config-win.pl flag.”

cron-w3perl.pl -a -c config-win.pl

image

… always nice, such good instructions. Rename config-win.pl to config.pl:

image

Ah… this one also needs me to combine 3699 log files into 1 file of over 3 GB… ah… let's combine 1 day (easy with Total Commander's ‘combine files’, which does it for the 00-23 hrs of one day, but unfortunately you have to go through each day by hand…):

It looks nice. Yes, I understand that it looks for the current day's stats, but let's see… ah… no:

image

But.. it looks promising and at least delivers me stats pages:

image

So I now have two problems:

a) how do I let it process my 3699 log files automatically?
b) how do I get rid of that last error?
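
For question a), one option is to build the single combined file with a small script instead of by hand. Below is a minimal sketch in Python, not something I have run against W3Perl. It assumes the hourly log files are plain text (not gzipped) and that sorting them by file name also puts them in chronological order. The function name merge_logs and the "*" pattern are made up for this example.

```python
import shutil
from pathlib import Path


def merge_logs(src_dir, dest, pattern="*"):
    """Concatenate every file in src_dir matching pattern into dest.

    Files are appended in sorted-name order; the number of files merged
    is returned.
    """
    dest = Path(dest).resolve()
    # Build the file list up front and never merge the output into itself.
    sources = sorted(
        p for p in Path(src_dir).glob(pattern)
        if p.is_file() and p.resolve() != dest
    )
    with open(dest, "wb") as out:
        for src in sources:
            with open(src, "rb") as f:
                # Stream the bytes so a 3 GB result never sits in memory.
                shutil.copyfileobj(f, out)
    return len(sources)
```

Calling merge_logs with the download directory and an output path would then give one big file to point the analyzer at.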

Unfortunately… my time is up and I need to do other stuff. (I ticked reverse DNS and indeed, even hours later, the thing is still busy, just for that 1-day log.)
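
Reverse DNS is slow mostly because every lookup waits on the network. I do not know how W3Perl schedules its lookups, but the usual trick is to resolve each distinct address only once. A rough sketch in Python of that idea follows; it assumes the client address is the first field of each line (as in Apache's common format), and the names unique_hosts and resolve_all are invented for this example.

```python
import socket


def unique_hosts(lines):
    """Return the set of distinct client addresses in the log lines.

    Assumption: the address is the first whitespace-separated field.
    """
    return {line.split(None, 1)[0] for line in lines if line.strip()}


def resolve_all(addresses, lookup=socket.gethostbyaddr):
    """Map each address to a host name, one lookup per distinct address.

    Addresses that cannot be resolved map to themselves.
    """
    names = {}
    for addr in set(addresses):
        try:
            names[addr] = lookup(addr)[0]
        except OSError:  # covers socket.herror and socket.gaierror
            names[addr] = addr
    return names
```

The lookup parameter is there so the function can be tried out with a fake resolver before letting it loose on real DNS.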

6. WebLog Expert Lite

I downloaded WebLog Expert, which has a REAL simple window interface that I just need to point to my log directory.

And.. it output the reports. Simple!

image

But… it has no reverse DNS and it seems to miss a lot of information I am looking for (unlike the FlashStats tool). Maybe, however, that is in the PRO version of WebLog Expert.

Still it shows me interesting information.

Let me try that one now.

7. WebLog Expert Standard/Pro

This one ALSO costs $99 … so the chance is low that I am going to buy this.

I start it up and it has the options I expected, like reverse DNS lookups. It also lets me specify a date-time range. AND it is pretty quick scanning the thousands of files.

I did not tick reverse DNS this time (it would take years), but I did tick ‘retrieve page titles’… which it is now working on…

Conclusion

I liked what FlashStats 2006 did. Unfortunately that broke on the reverse DNS lookups and it is $99.

Is there a free Windows program somewhere that you can just point at the logs dir, that generates hundreds of reports, that does not break on reverse DNS lookups, and that lets me select a date range to dynamically rebuild the reports?

(found some more progs to try here: http://www.fileguru.com/apps/comanche_for_apache/p3 )

Internal Dummy Connections and adjusting HTTPD log files

My server has A LOT of Internal Dummy Connections in the form of:

127.0.0.1 - - [date] "OPTIONS * HTTP/1.0" 200 -

image

These seem to be Internal Dummy Connections: http://wiki.apache.org/httpd/InternalDummyConnection (I am running Apache/2.2.3 (CentOS), so it seems I ALSO need to upgrade, but that is, uhm, very difficult when Plesk is installed and there are dependencies that I do not even know of).

They started after I changed the hosts file on my Linux server (/etc/hosts). I changed the hosts file to 2 lines instead of one because otherwise the WordPress cron functionality would not run (see http://bradt.ca/archives/fix-wordpress-missed-schedule-error-on-media-temple-dv-plesk/), which has to do with Plesk (http://www.fdcservers.net/vbulletin/archive/index.php/t-1018.html).

127.0.0.1       localhost localhost.localdomain
205.186.136.89  farmvillechicken.com farmvillechicken

Directly after making these changes the Internal Dummy Connections started showing up in my /var/log/httpd/access_log files.

The Apache Help page says the following:

If you wish to exclude them from your log, you can use normal conditional-logging techniques. For example, to omit all requests from the loopback interface from your logs, you can use

SetEnvIf Remote_Addr "127\.0\.0\.1" loopback

and then add env=!loopback to the end of your CustomLog directive.

However… the "normal conditional logging techniques" are unknown to me… Going by http://httpd.apache.org/docs/2.0/mod/mod_log_config.html, I think the CustomLog lines are correct:

SetEnvIf Remote_Addr "127\.0\.0\.1" loopback
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
CustomLog logs/access_log common env=!loopback
CustomLog logs/referer_log referer env=!loopback
CustomLog logs/agent_log agent env=!loopback

So let’s add that to /etc/httpd/conf/httpd.conf and run /usr/sbin/apachectl configtest:

Syntax error on line 485 of /etc/httpd/conf/httpd.conf: SetEnvIf regex could not be compiled.

(to show line number in vi you can use ":set number")

ah… I forgot to put a "." before the "1" in the regex…. save: Syntax OK.

Let’s restart Apache with /etc/init.d/httpd restart and then check the log files.

image 

GRBML! It did not work. Let’s head for a forum.
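
Until the Apache side is sorted out, the dummy lines can at least be stripped from the log files that already exist before feeding them to a stats tool. This is a workaround sketch in Python, not a fix for the SetEnvIf problem. It assumes the dummy connections always look like the line quoted above, i.e. a loopback address requesting "OPTIONS * HTTP/1.0". Matching ::1 as well is my own assumption for servers that log the IPv6 loopback, and the function names are made up.

```python
import re

# Assumption: a dummy connection is a loopback client asking for "OPTIONS *".
DUMMY = re.compile(r'^(127\.0\.0\.1|::1) .*"OPTIONS \* HTTP/1\.0"')


def is_dummy(line):
    """Return True if the log line looks like an internal dummy connection."""
    return DUMMY.match(line) is not None


def strip_dummy(src, dest):
    """Copy src to dest without dummy-connection lines.

    Returns the number of lines that were dropped.
    """
    dropped = 0
    with open(src, encoding="utf-8", errors="replace") as fin, \
         open(dest, "w", encoding="utf-8") as fout:
        for line in fin:
            if is_dummy(line):
                dropped += 1
            else:
                fout.write(line)
    return dropped
```

Writing to a separate output file keeps the original access_log untouched, so nothing is lost if the pattern turns out to be too greedy.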

How to enable simplexml on a Synology Box / How to run 2 apaches on a Synology Box

I heard about OpenGoo, a web application which you can host on your own server and which basically lets you do the “office” things: contacts, todo, documents, spreadsheets, calendar, tasks, links, and so on.

Although it probably lacks a lot of features, this might be handy to install in our household. I really like to have all my data within my own household and I believe distributed everything is basically the way to go.

The main problem is the memory demand: it really requires some MBs, while my Synology CS 407 is already running, uhm… a lot.

But let’s try anyway.

- Download the zip file from the SourceForge site
- Just copy it to your /web directory
- Open the installation page by just typing in the URL of the place you copied it to (e.g. http://cubestation/doc)
- You get a nice message and have to click next
- The next page warns you about “simplexml extension is not installed”
- According to Synology, however, this extension is not yet available in their default PHP release (no libxslt.so). However, I read about a “hack”: “install a second apache with a full enabled PHP --> ipkg install php-apache”.

Some questions pop in my mind now… ah…. what the heck.

CubeStation> ipkg list php*
php - 5.2.6-2 - The php scripting language
php-apache - 5.2.6-1 - The php scripting language, built as an apache module
php-curl - 5.2.6-2 - libcurl extension for php
php-dev - 5.2.6-2 - php native development environment
php-embed - 5.2.6-2 - php embedded library - the embed SAPI
php-fcgi - 5.2.6-1 - The php scripting language, built as an fcgi module
php-gd - 5.2.6-2 - libgd extension for php
php-imap - 5.2.6-2 - imap extension for php
php-ldap - 5.2.6-2 - ldap extension for php
php-mbstring - 5.2.6-2 - mbstring extension for php
php-mssql - 5.2.6-2 - mssql extension for php
php-mysql - 5.2.6-2 - mysql extension for php
php-odbc - 5.2.6-2 - odbc extension for php
php-pear - 5.2.6-2 - PHP Extension and Application Repository
php-pgsql - 5.2.6-2 - pgsql extension for php
php-thttpd - 2.25b-5.2.6-1 - php-thttpd is thttpd webserver with php support
phpmyadmin - 2.6.2-2 - Web-based administration interface for mysq

ipkg list php* hmmm…

CubeStation> ipkg install php-apache
Installing php-apache (5.2.6-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/php-apache_5.2.6-1_arm.ipk
package apr-util suggests installing sqlite
package apr-util suggests installing openldap-libs
Installing apache (2.2.10-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/apache_2.2.10-1_arm.ipk
Installing apr (1.3.3-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/apr_1.3.3-1_arm.ipk
Installing apr-util (1.3.4-2) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/apr-util_1.3.4-2_arm.ipk
Installing e2fslibs (1.40.3-5) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/e2fslibs_1.40.3-5_arm.ipk
Installing expat (2.0.1-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/expat_2.0.1-1_arm.ipk
Installing gdbm (1.8.3-2) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/gdbm_1.8.3-2_arm.ipk
Installing libdb (4.2.52-3) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/libdb_4.2.52-3_arm.ipk
Installing e2fsprogs (1.40.3-5) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/e2fsprogs_1.40.3-5_arm.ipk
Installing openssl (0.9.7m-4) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/openssl_0.9.7m-4_arm.ipk
Installing zlib (1.2.3-3) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/zlib_1.2.3-3_arm.ipk
Installing openldap-libs (2.3.43-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/openldap-libs_2.3.43-1_arm.ipk
Installing cyrus-sasl-libs (2.1.22-2) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/cyrus-sasl-libs_2.1.22-2_arm.ipk
Installing php (5.2.6-2) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/php_5.2.6-2_arm.ipk
Installing bzip2 (1.0.5-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/bzip2_1.0.5-1_arm.ipk
Installing libxml2 (2.7.1-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/libxml2_2.7.1-1_arm.ipk
Installing libxslt (1.1.24-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/libxslt_1.1.24-1_arm.ipk
Installing pcre (7.8-1) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/pcre_7.8-1_arm.ipk
Installing libstdc++ (6.0.3-6) to root…
Downloading http://ipkg.nslu2-linux.org/feeds/optware/syno-x07/cross/unstable/libstdc++_6.0.3-6_arm.ipk
Configuring apache
update-alternatives: Linking //opt/sbin/htpasswd to /opt/sbin/apache-htpasswd
update-alternatives: Linking //opt/sbin/httpd to /opt/sbin/apache-httpd
httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.1.70 for ServerName
httpd (no pid file) not running
httpd: Could not reliably determine the server’s fully qualified domain name, using 192.168.1.70 for ServerName
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:8000
no listening sockets available, shutting down
Unable to open logs
postinst script returned status 1
ERROR: apache.postinst returned 1
Configuring apr
Configuring apr-util
Configuring bzip2
update-alternatives: Linking //opt/bin/bzip2 to /opt/bin/bzip2-bzip2
Configuring cyrus-sasl-libs
Configuring e2fslibs
Configuring e2fsprogs
update-alternatives: Linking //opt/bin/chattr to /opt/bin/e2fsprogs-chattr
update-alternatives: Linking //opt/bin/lsattr to /opt/bin/e2fsprogs-lsattr
update-alternatives: Linking //opt/sbin/fsck to /opt/sbin/e2fsprogs-fsck
Configuring expat
Configuring gdbm
Configuring libdb
Configuring libstdc++
Configuring libxml2
Configuring libxslt
Configuring openldap-libs
Configuring openssl
Configuring pcre
Configuring php
Configuring php-apache
Configuring zlib
Successfully terminated.

1) vi /opt/etc/apache2/httpd.conf

(in /opt/lib/php/extensions is now the needed xsl.so)

(also change all libexec/bla.so to /opt/libexec/…)
(also change log file to correct location e.g. /opt/var/apache2/log)

Then I changed the port number to 81 (the install above failed to start Apache because port 8000 was already in use) and changed the default web path.

2) run /opt/sbin/httpd -k start

and we are running another instance on port 81:

image
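
The "Address already in use" error in the install log is the kind of thing that can be checked before picking a port for the second Apache. Here is a small sketch in Python of such a check, assuming Python is available on the box (it is not part of the steps above). The function name port_is_free is made up. Note that binding a port below 1024, such as 81, needs root, so as a normal user this reports False for those ports even when nothing is listening.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Already in use, or not permitted for this user.
            return False
    return True
```

For example, port_is_free(8000) would have warned about the conflict that made the postinst script fail.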

Virtual hosts without DNS for Synology CS 407 Cubestation

The web server works great out of the box on the NAS, but it was not so handy that I only had 1 document root, meaning: http://cubestation and http://cubestation/leau would both have http://cubestation as root. This is inconvenient because I have separate sites on the cube whose HTML code is often relative to the root. Meaning: I have to rewrite the code or use some hacks.

However, with the help of the synology.nl forum and of Patrick on the synology.com forum, I managed to get it working.

I edited /usr/syno/apache/conf/httpd.conf-user and uncommented the virtual hosts line:

Include conf/extra/httpd-vhosts.conf

Then I created conf/extra/httpd-vhosts.conf and added one subsite, “leau.cubestation”, to test:

NameVirtualHost *:80
<VirtualHost *:80>
  ServerName cubestation
  DirectoryIndex index.php index.html index.htm index.shtml
  DocumentRoot /var/services/web
  <Directory "/var/services/web">
    AllowOverride all
  </Directory>
</VirtualHost>
<VirtualHost *:80>
  ServerName leau.cubestation
  DirectoryIndex index.php index.html index.htm index.shtml
  DocumentRoot /var/services/web/leau
  <Directory "/var/services/web/leau">
    AllowOverride all
  </Directory>
</VirtualHost>

And then I added this new subsite to the /etc/hosts file:

127.0.0.1       localhost
192.168.1.70    CubeStation
192.168.1.70    leau.cubestation

After a reboot it works perfectly from my browser. I can now add all other websites as virtual roots to the cubestation without needing domain names.
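
Since every extra subsite needs the same two pieces (a VirtualHost block and a hosts line), they can be generated instead of typed. A minimal sketch in Python follows. The defaults are simply the values from my setup above (host cubestation, web root /var/services/web, IP 192.168.1.70), and the function names are made up for this example.

```python
VHOST_TEMPLATE = """<VirtualHost *:80>
  ServerName {name}
  DirectoryIndex index.php index.html index.htm index.shtml
  DocumentRoot {root}
  <Directory "{root}">
    AllowOverride all
  </Directory>
</VirtualHost>
"""


def vhost_block(subsite, host="cubestation", web_root="/var/services/web"):
    """Render the VirtualHost block for http://<subsite>.<host>/."""
    return VHOST_TEMPLATE.format(
        name=f"{subsite}.{host}",
        root=f"{web_root}/{subsite}",
    )


def hosts_line(subsite, host="cubestation", ip="192.168.1.70"):
    """Render the matching hosts-file entry."""
    return f"{ip}    {subsite}.{host}"
```

Printing vhost_block("leau") and hosts_line("leau") gives the same kind of block and hosts entry as the ones shown above, ready to paste into httpd-vhosts.conf and the hosts file.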