Learning GREP
That looks like a weird word, GREP... it's not a word, though... it's an acronym. A-CROW-NIM.
GREP stands for: Global Regular Expression Print.
The reason this becomes important is that the Macintosh OS is built on top of Unix and has a Terminal application that allows me to run Unix commands. For those who are morbidly curious, the Unix-like core that the Mac runs on is Darwin, which draws on the same BSD family that FreeBSD comes from.
As a result of running a Unix system with a GUI (graphical user interface: it's windows, people) it is possible to get the ease of use that Mac is known for and the power that Unix is known for. With that said, dropping into the terminal (which I've been doing almost from day one on this Mac) is a satisfying way of locating files, SSH-ing into remote systems, and finding easier ways to accomplish tasks that... well... aren't always that easy to do.
Anyway, at work I have become the security person. The job is not hard, but it gets easier once you realize there is a significant need to parse a large text file (in excess of 75 megabytes a month) into digestible bits. Specifically, I know that when someone is trying a PHP hack on the site, the hack is going to have specific characteristics. Those characteristics allow me to scan the file (currently using the search function in a simple text editor) and then ban the IP addresses that attempt to hack the site.
The problem, though, is that this is time-consuming and causes eyestrain and headaches. Since I am dealing with random headaches anyway, forcing more of them into my life isn't exactly number one on my list of things to be a-doing. In a moment of realization, one night, I decided to look up what GREP does. I've seen people use it, but never really thought about the command because... well... there's never really been a need.
However, since my job is a combination of "staff workhorse" and "security specialist," I find myself wanting to expedite the workflow and get things done as quickly as possible so I have more time to do... nothing. Yup. You heard that correctly: I work quickly so I can laze about and do nothing. Or so I have time to do other things that are far more interesting to me, like writing, reading, and spending time with Erin.
Okay, so I looked up GREP and got the broad definition of what it can do, then opened up Terminal on my computer and (intentionally) created a large txt file that I could grep and parse. My ultimate goal with GREP is to be able to create a search string with two or three variables in it that will produce an output file with the lines (complete with IP addresses) that are suspected of being hacking attempts. So far I have been successful with a single search query that generates a text file with the results. I have also been able to get a numbered response, with GREP telling me how many lines match a particular search query. The outlook is proving to be interesting and... fun. Fun to learn.
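A rough sketch of that kind of search, for anyone who wants to try it. This is not the exact command used at work: the file names (access.log, suspect.txt), the log line format, and the two search strings are all made up for illustration. It assumes each log line starts with the visitor's IP address.

```shell
# Hypothetical sample log; the real file name and line format are not given here.
workdir=$(mktemp -d)
cat > "$workdir/access.log" <<'EOF'
203.0.113.7 - - "GET /index.php?page=../../etc/passwd HTTP/1.1" 404 512
198.51.100.2 - - "GET /about.html HTTP/1.1" 200 1024
203.0.113.9 - - "GET /index.php?cmd=base64_decode HTTP/1.1" 404 512
198.51.100.2 - - "GET /contact.html HTTP/1.1" 200 2048
EOF

# Each -e adds one search string; a line is kept if it matches any of them.
# -F treats the strings as plain text rather than regular expressions.
grep -F -e '../' -e 'base64_decode' "$workdir/access.log" > "$workdir/suspect.txt"

# -c prints the number of matching lines instead of the lines themselves.
suspect_count=$(grep -c -F -e '../' -e 'base64_decode' "$workdir/access.log")
echo "suspect lines: $suspect_count"

# Assumption: the first space-separated field is the IP address.
# sort -u lists each address once.
suspect_ips=$(cut -d' ' -f1 "$workdir/suspect.txt" | sort -u)
echo "$suspect_ips"
```

The > sends the matching lines into a new file instead of the screen, which is the "output file" part. Swapping the two made-up strings for whatever the real hack attempts have in common is the only piece that should need changing.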
Anyway, last night (inspiration happens at night for me and then I implement it during the day... go figure) I was wondering how to take a LOT of log files and parse them down to simple numbers that illustrate what is normal on the work website, and then how a new program the school and BYU-TV ran over the weekend affected the traffic on our website. My boss told me to use some "streaming server" log files he'd generated and then didn't want to stare at. Since I am willing to stare at log files, though not anxious to, I was given the job. The requirement: make it something that is easily understood.
The inspiration that came was that I knew we had a total of six different files we were concerned with, and that the logs recorded information on those six files and others. The total number of files streamed (all MP4s) over a two-day period is important, but the real question was the total for the six files that we wanted to stream AND the individual count for each of the six. This will make a nice little chart. Because the names of the files and their extensions are all known to me, running a GREP command told me, in like six seconds, how many times each file had been streamed.
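Something along these lines would produce those per-file numbers. It is only a sketch under assumptions: the log name (stream.log), its line format, and the file names clip1.mp4 and clip2.mp4 are invented, and it assumes each log line records one stream. The real list would hold all six names.

```shell
# Hypothetical streaming log; the real format is not shown in this post.
workdir=$(mktemp -d)
cat > "$workdir/stream.log" <<'EOF'
GET /media/clip1.mp4 200
GET /media/clip2.mp4 200
GET /media/clip1.mp4 200
GET /media/other.mp4 200
GET /media/clip1.mp4 200
EOF

total=0
# Made-up names; list every file of interest here.
for name in clip1.mp4 clip2.mp4; do
    # -c counts matching lines; -F matches the name as plain text.
    # The leading slash keeps clip1.mp4 from also matching myclip1.mp4.
    count=$(grep -c -F "/$name" "$workdir/stream.log")
    echo "$name: $count"
    total=$((total + count))
done
echo "total: $total"
```

One thing to keep in mind: grep -c counts lines, not individual hits, so if a server writes more than one log line per stream the numbers will run high.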
What this did, fortunately and unfortunately, was expedite something I thought would take the better part of the morning. Granted, now I need to create some kind of report that shows the numbers, both the ones I came up with yesterday and the ones I came up with today, and that will take me the rest of the morning (mostly because I second-guess myself a lot on reporting methods). The outcome, though, is that the website had a drastic increase in hits over the weekend. That number has decreased since then but remains above normal, and the family and parenting video the school is concerned with has proven to be extremely popular.
And learning GREP is fun.
John Hattaway | smokingpen | Alicia Grey | Clockwork Princess | Cassandra West
Real Heroes Fly