Hieroglyphics Big Time

A while ago I had to parse some log files from an Apache web server. However, the file format was designed for human inspection rather than for easy parsing:

193.136.93.77 - - [27/Jan/2013:14:04:16 +0000] "GET /paasmanager/apidocs/index.html HTTP/1.1" 404 514 "-" "Opera/9.80 (Windows NT 6.1; WOW64; U; en)"

Regular expressions were coming to town, and I confess that I had not been a big fan until then. After some googling I found an amazing cheat sheet (thanks to Dave) that helped me a lot in designing my expression:

^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] "(.+?)" (\d{3}) (\d+) "([^"]+)" "([^"]+)"
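
To show the expression at work, here is a minimal Python sketch that applies it to the sample line above. The field names (host, identity, user and so on) and the parse_line helper are my own labels for the nine capture groups, not anything defined by Apache, so treat them as assumptions.

```python
import re

# The expression from the post, compiled once for reuse.
LOG_PATTERN = re.compile(
    r'^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] '
    r'"(.+?)" (\d{3}) (\d+) "([^"]+)" "([^"]+)"'
)

# Hypothetical names for the nine capture groups, in order.
FIELDS = (
    "host", "identity", "user", "time", "request",
    "status", "size", "referer", "user_agent",
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


sample = (
    '193.136.93.77 - - [27/Jan/2013:14:04:16 +0000] '
    '"GET /paasmanager/apidocs/index.html HTTP/1.1" 404 514 "-" '
    '"Opera/9.80 (Windows NT 6.1; WOW64; U; en)"'
)

entry = parse_line(sample)
print(entry["host"], entry["status"], entry["request"])
```

One caveat worth keeping in mind: the size group only accepts digits, so a line where Apache logs the size as "-" (no bytes sent) would not match, and parse_line would return None for it. Loosening that group to something like (\d+|-) is one possible adjustment.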

After a while, I was already coding some grep-like tools 🙂
