Splunk is the bee’s knees when it comes to analysing and visualising log file data. If you haven’t used it yet, I highly recommend it. I’ve been working with it for quite a while, but recently I’ve had a requirement to keep the license cost down whilst monitoring log files incrementally. Splunk license costs are based on how much data you index daily.
Historically I have just been overwriting the full log file locally every 10 minutes and having Splunk index that directory. But to keep the license costs down I want to filter the data first, keeping only the lines I need before indexing. This will no doubt speed things up as well, as I only need about 10% of the data in the files.
Also, I have occasionally had an issue with Splunk double-indexing requests. It’s not clear exactly why this occurs, but it seems to be affected by the way you copy over your log files. I have some workarounds below that reduce this problem.
Extract (grep for Windows)
To filter the data before indexing I follow a couple of steps. The first step is to copy the full log file to a local temp directory and grep for the lines I want to keep. I do have Unix utils and Cygwin, but grep from these wasn’t up to the job, so I’ve written a couple of VB scripts to do it instead.
The main thing these do that was tricky with Unix grep on Windows is read in a file of multiple grep terms, so a typical command line is:
cscript "c:\vbs scripts\not_grep_by_field_to_new_file.vbs" temp.log c:\logs01\grep_list_for_project1.txt 7
(“This script RETRIEVES the lines from the first input file where the terms in the second input file are NOT found in the specified field”)
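I won’t paste the full scripts here, but roughly the idea is this. Below is a simplified, self-contained sketch of a field-based “not grep” in VBScript; the space delimiter, the 1-based field numbering, the case-insensitive matching and the “.grepped” output name are simplifications for illustration and differ a little from my actual script:

' Sketch of a field-based "not grep" along the lines of not_grep_by_field_to_new_file.vbs
' Usage: cscript not_grep_by_field_to_new_file.vbs <log file> <terms file> <field number>
' Keeps lines whose Nth space-separated field contains NONE of the terms.
Option Explicit

Dim fso, args, inFile, termsFile, fieldNum, outFile
Dim ts, tsIn, tsOut, logLine, allTerms, terms, term, fields, keep

Set fso = CreateObject("Scripting.FileSystemObject")
Set args = WScript.Arguments

If args.Count < 3 Then
    WScript.Echo "Usage: cscript not_grep_by_field_to_new_file.vbs <log> <terms> <field>"
    WScript.Quit 1
End If

inFile = args(0)
termsFile = args(1)
fieldNum = CInt(args(2))
outFile = inFile & ".grepped"   ' simplification: the output name in my real script differs

' Read the list of grep terms, one per line
allTerms = ""
Set ts = fso.OpenTextFile(termsFile, 1)
If Not ts.AtEndOfStream Then allTerms = ts.ReadAll
ts.Close
terms = Split(allTerms, vbCrLf)

' Copy across only the lines where the chosen field matches none of the terms
Set tsIn = fso.OpenTextFile(inFile, 1)
Set tsOut = fso.CreateTextFile(outFile, True)
Do Until tsIn.AtEndOfStream
    logLine = tsIn.ReadLine
    fields = Split(logLine, " ")
    keep = True
    If fieldNum - 1 <= UBound(fields) Then
        For Each term In terms
            If Len(Trim(term)) > 0 Then
                If InStr(1, fields(fieldNum - 1), Trim(term), vbTextCompare) > 0 Then
                    keep = False
                    Exit For
                End If
            End If
        Next
    End If
    If keep Then tsOut.WriteLine logLine
Loop
tsIn.Close
tsOut.Close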
Transform and Load
(Not doing any data transformation in this case)
To load the file we have to be a bit careful. This is where I have seen issues with double-indexed data, and I have worked around most of them. I still had some issues and have made one last change; I need to wait a week to see if the problem has fully gone. Details below. [UPDATE: the process presented on this page is now fully working for me.]
[Latest update: I now implement full local log file rollover based on our local update period, e.g. every ten minutes.]
- I have Splunk constantly monitoring a directory on my local desktop for files called ‘access.log.*’
- Every 10 minutes (using a local Jenkins) the full access log is downloaded to temp.log and grepped to ‘grepped_temp.log’ (see above)
- We then use a VB script to compare the last grep result to the current grep result
- if the current grep result has more lines, then we copy these new lines to a buffer (this step has been updated several times; appending, cp and mv have all caused me double-indexing issues)
- if the current grep result is the same as the last grep result, don’t do anything
- if the current grep has fewer lines than the last grep AND the first lines in the files are different^, then assume log file rollover on the server and start anew locally.
- If we have new lines (in the buffer or a whole new file ’cos of server log file rollover) then delete all the ‘access.log.*’ files locally and make a new rollover file called ‘access.log.datestamp’. This new file contains the new lines (either from the buffer or the whole new file). A sketch of this compare-and-rollover logic is shown after the command line below.
cscript "c:\backup\vbs scripts\rollover_log_file_date.vbs" grepped_temp.log access.log