
PHP Parser for Apache “common” Logs

I’m working on a project that needs a parser for Apache log files. Specifically, I’m parsing the default “common” log files generated by Apache 2.
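For reference, a line in the common format holds the client host, identity, user, a timestamp in square brackets, the quoted request line, the status code and the response size. The snippet below is a minimal sketch of splitting such a line into an array. The function name `parseCommonLogLine` and the array keys are made up for illustration and are not taken from the parser described in this post. The "combined" format appends referrer and user agent fields, which this sketch does not handle.

```php
<?php
// Hypothetical helper for illustration only (not code from the repository).
// Splits one Common Log Format line into an associative array,
// or returns null when the line does not match the expected layout.
function parseCommonLogLine(string $line): ?array
{
    // host ident user [time] "method path protocol" status bytes
    $pattern = '/^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)(?: (\S+))?" (\d{3}) (\d+|-)/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    // Apache writes timestamps like 10/Oct/2000:13:55:36 -0700
    $time = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[4]);

    return [
        'host'     => $m[1],
        'user'     => $m[3],
        'time'     => $time ?: null,
        'method'   => $m[5],
        'path'     => $m[6],
        'protocol' => $m[7] ?? '',
        'status'   => (int) $m[8],
        // Apache logs "-" when no body was sent
        'bytes'    => $m[9] === '-' ? 0 : (int) $m[9],
    ];
}

$entry = parseCommonLogLine(
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
);
echo $entry['path'], ' ', $entry['status'], PHP_EOL; // prints "/apache_pb.gif 200"
```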

One difficulty is that log files get big quickly, and I only want to pull the most recent entries (starting from a specified date/time) without reading the whole file. So I wrote a parser with the following features:

  • Efficiently retrieves entries from a specific date onwards
  • Returns the elements of each log entry as an associative array (protocol, time, response code, path, referrer, etc.)
  • Ignores HTTP hits that aren’t page views, based on a list of file extensions (there is probably room for improvement here)
  • Far cheaper than reading in the whole file when you only need a subset of the data in a large log
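To give a rough idea of how the first and third points can work, here is a simplified sketch. It is not the code from the repository. It assumes the entries in the file are in chronological order, and uses a binary search over byte offsets to find the first entry at or after a given time, so only a handful of lines are read before the interesting part of the file. The function names (`lineTimestamp`, `findOffset`, `isPageView`, `entriesSince`) and the extension list are hypothetical.

```php
<?php
// Hypothetical helpers for illustration only (not code from the repository).

// Unix timestamp of the [dd/Mon/yyyy:hh:mm:ss +zzzz] field, or null.
function lineTimestamp(string $line): ?int
{
    if (!preg_match('/\[([^\]]+)\]/', $line, $m)) {
        return null;
    }
    $t = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[1]);
    return $t ? $t->getTimestamp() : null;
}

// Byte offset of the first line whose timestamp is >= $since.
// Assumes the log is in chronological order.
function findOffset($fh, int $since): int
{
    $lo = 0;                   // every line starting before $lo is too old
    $hi = fstat($fh)['size'];  // every line starting at or after $hi is wanted
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        fseek($fh, max($mid - 1, 0));
        if ($mid > 0) {
            fgets($fh);        // move to the first line start at or after $mid
        }
        $line = fgets($fh);
        $ts = $line === false ? null : lineTimestamp($line);
        if ($line === false || ($ts !== null && $ts >= $since)) {
            $hi = $mid;
        } else {
            $lo = ftell($fh);  // start of the line after the one just read
        }
    }
    return $lo;
}

// True unless the requested path ends in one of the ignored extensions.
function isPageView(string $line, array $ignored = ['css', 'js', 'png', 'jpg', 'jpeg', 'gif', 'ico']): bool
{
    if (!preg_match('/"\S+ (\S+)/', $line, $m)) {
        return false;
    }
    $path = (string) parse_url($m[1], PHP_URL_PATH);
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return !in_array($ext, $ignored, true);
}

// Yields the raw page-view lines logged at or after $since.
function entriesSince(string $file, DateTimeInterface $since): Generator
{
    $fh = fopen($file, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $file");
    }
    try {
        fseek($fh, findOffset($fh, $since->getTimestamp()));
        while (($line = fgets($fh)) !== false) {
            if (isPageView($line)) {
                yield rtrim($line, "\r\n");
            }
        }
    } finally {
        fclose($fh);
    }
}
```

Something like `foreach (entriesSince('/path/to/access.log', new DateTime('-1 hour')) as $line) { ... }` would then loop over the last hour of page views (the path is a placeholder). A line that cannot be parsed is treated as "too old" during the search, which is a simplification.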

It’s on GitHub and quite well documented internally (PHPDoc comments), so it may be useful for other projects too.

https://github.com/vduglued/Apache-Logfile-Parser
