My manager asked me to investigate our website's status, so I needed to look at the nginx access log. After spending some time searching the internet, I found a few commands for parsing the access log.
The log format defined in the default nginx.conf (nginx version: nginx/1.10.1) is named “main”:
log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"';
Unfortunately, the default log format was not sufficient for me. I then configured nginx to work as a reverse proxy to manage web traffic load, and defined a custom log format named “upstream”:
log_format upstream '$http_x_forwarded_for - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$remote_addr" ' 'rt=$request_time ut=$upstream_response_time ' '[for $host via $upstream_addr] "$gzip_ratio"';
Here:
- $http_x_forwarded_for : Logs the real client IP (taken from the X-Forwarded-For header) instead of our nginx proxy IP.
- $remote_user : The user name supplied with HTTP Basic authentication. Most modern apps do not use it, so it will be blank for most apps.
- [$time_local] : The request timestamp in the server's time zone.
- “$request” : The full request line: the method (GET, POST, PUT, etc.), the URI with its arguments, and the protocol version.
- ‘$status $body_bytes_sent’ : The response status code generated by the server, followed by the number of body bytes sent to the client.
- “$http_referer” : The referring URL.
- rt=$request_time : The total time taken to process the request, in seconds.
- ut=$upstream_response_time : The time spent receiving the response from the upstream server, in seconds.
- $host : The host name of the request.
- $upstream_addr : The address (IP and port) of the upstream server that the request was passed to.
- “$gzip_ratio” : The achieved compression ratio.
Add the “upstream” parameter at the end of the access_log line to tell nginx to use the custom log format, as shown below:
access_log /var/log/nginx/vhost/looklinix.com_access.log upstream;
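To see how this format lines up with the awk field numbers used in the commands that follow, here is a minimal sketch. The log line is hypothetical (every value in it is made up) and the helper name show_fields is only for illustration. It assumes the X-Forwarded-For value is a single address with no spaces, since extra spaces would shift every field number.

```shell
# Hypothetical log line shaped like the "upstream" format (all values are made up).
sample_line='203.0.113.7 - - [12/Mar/2017:10:15:32 +0000] "GET /sample-page/ HTTP/1.1" 404 162 "-" "Mozilla/5.0" "10.0.0.5" rt=0.004 ut=0.003 [for www.looklinux.com via 10.0.0.21:8080] "-"'

# awk splits on whitespace by default, so the fields line up as:
#   $1 client IP, $4 and $5 timestamp, $6 method (with a leading quote),
#   $7 request path, $8 protocol, $9 status code, $10 body bytes sent
show_fields() {
    awk '{ printf "ip=%s method=%s path=%s status=%s bytes=%s\n", $1, substr($6, 2), $7, $9, $10 }'
}

printf '%s\n' "$sample_line" | show_fields
# prints: ip=203.0.113.7 method=GET path=/sample-page/ status=404 bytes=162
```

This is why the commands in this article read the status code from $9 and the request path from $7.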
Let's go ahead and parse the nginx access log from the command line.
List All HTTP Response Status Codes
# cd /var/log/nginx/
# cat looklinix.com_access.log | cut -d '"' -f3 | cut -d ' ' -f2 | sort | uniq -c | sort
You will get some output like below:
114 500
12277 304
123 416
1 504
1614 302
185148 200
1892 301
2188 404
2508 499
27 403
32 401
3 400
366 206
456 502
477 444
79 405
Using the awk command:
# awk '{print $9}' looklinix.com_access.log | sort | uniq -c | sort
You will get some output like below:
114 500
12277 304
123 416
1 504
1614 302
185148 200
1892 301
2188 404
2508 499
27 403
32 401
3 400
366 206
456 502
477 444
79 405
You can see that more than 2,000 requests returned a 404 (Not Found) response code.
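The counts above are not listed from largest to smallest. If you would rather see the most frequent status codes first, one option is to end the pipeline with sort -rn, which compares the counts as numbers and reverses the order. The sketch below wraps this in a helper called status_counts (a name chosen here for illustration) and feeds it a few made-up log lines instead of the real log.

```shell
# Count status codes (field 9) and print the most frequent first.
# Reads the files given as arguments, or standard input when there are none.
status_counts() {
    awk '{ print $9 }' "$@" | sort | uniq -c | sort -rn
}

# Made-up log lines, trimmed to the first ten fields, for demonstration only.
status_counts <<'EOF'
198.51.100.1 - - [12/Mar/2017:10:00:01 +0000] "GET / HTTP/1.1" 200 5120
198.51.100.2 - - [12/Mar/2017:10:00:02 +0000] "GET /missing HTTP/1.1" 404 162
198.51.100.3 - - [12/Mar/2017:10:00:03 +0000] "GET / HTTP/1.1" 200 5120
198.51.100.4 - - [12/Mar/2017:10:00:04 +0000] "GET /old HTTP/1.1" 301 178
198.51.100.5 - - [12/Mar/2017:10:00:05 +0000] "GET / HTTP/1.1" 200 5120
198.51.100.6 - - [12/Mar/2017:10:00:06 +0000] "GET /old HTTP/1.1" 301 178
EOF
```

On the real log you would call it as status_counts looklinix.com_access.log.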
List All Broken (404) Requests
Now find out which requests are broken and returning 404. We can list every requested page that returned 404 using the command below.
# awk '($9 ~ /404/)' looklinix.com_access.log | awk '{print $7}' | sort | uniq -c | sort -r
You will get some output like below:
9 /wp-content/uploads/sfn.php
9 /wp-content/plugins/woocommerce-product-options/includes/image-upload.php
9 /wp-content/plugins/revslider/temp/update_extract/sfn.php
9 /wp-content/plugins/revslider/temp/update_extract/revslider/db.php
9 /wp-content/plugins/revslider/sfn.php
9 /wp-content/plugins/Login-wall-etgFB/login_wall.php?login=cmd&z3=c2ZuLnBocA%3D%3D&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%3d
9 /wp-content/plugins/jquery-html5-file-upload/jquery-html5-file-upload.php
9 /wp-content/plugins/formcraft/file-upload/server/php/upload.php
9 /tiny_mce/plugins/tinybrowser/upload_file.php?folder=/&type=file&feid=&obfuscate=&sessidpass=
9 /sfn.php
9 /license.php
8 /wp-content/uploads/2017/01/mail-client.jpg&_nc_hash=AQCLYYiKAUaGBX1g
8 /wp-content/plugins/wptf-image-gallery/lib-mbox/ajax_load.php?url=/etc/passwd
8 /wp-content/plugins/wp-symposium/server/php/index.php
8 /wp-content/plugins/wp-mobile-detector/cache/db.php
8 /wp-content/plugins/wp-ecommerce-shop-styling/includes/download.php?filename=../../../../../../../../../etc/passwd
8 /wp-content/plugins/./simple-image-manipulator/controller/download.php?filepath=/etc/passwd
8 /wp-content/plugins/recent-backups/download-file.php?file_link=/etc/passwd
8 /wp-content/plugins/front-end-upload/destination.php
8 /wp-content/plugins/candidate-application-form/downloadpdffile.php?fileName=../../../../../../../../../../etc/passwd
8 /wp-content/cache/autoptimize/css/autoptimize_e039f9699b9008b4a87e6e80c5bf48b5.css
7 /questions/question/php-fpm-and-nginx-502-bad-gateway/
7 /?p=1602
7 /author/santosh-prasad/page/4/
6 /wp-content/uploads/2017/02/web-based-monitoring-tool.jpg&_nc_hash=AQC7GIulBxipvMss
6 /wp-content/themes/infocus/lib/scripts/dl-skin.php
6 /wp-content/plugins/simple-ads-manager/js/slider/tmpl.js
6 /pagead/gen_204?id=
6 /?p=1894
6 /?p=1756
6 /?p=1532
6 /mdocs-posts/?mdocs-img-preview=../../../wp-config.php
6 /easy-steps-to-upgrade-php-5-3-to-php-5-6-on-centos-6-x-and-rhel-6-x/http:%5C/%5C/www.looklinux.com%5C/wp-login.php?action=lostpassword
6 /category/uncategorized/
5 /wp-content/plugins/google-mp3-audio-player/direct_download.php?file=../../../wp-config.php
5 /wp-content/plugins/db-backup/download.php?file=../../../wp-config.php
5 /wp-content/cache/autoptimize/css/autoptimize_b2fab305691e9655969910635d0b8352.css
5 /sample-page/
5 /?p=1653
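The same listing can be produced with a single awk call that compares the status field exactly rather than with a pattern match. This is only a sketch: the helper name broken_paths and the sample lines are made up for illustration, and it assumes, as before, that the status code is field 9 and the request path is field 7.

```shell
# Print the request paths (field 7) whose status code (field 9) is exactly 404,
# counted and sorted with the most frequent path first.
# Reads the files given as arguments, or standard input when there are none.
broken_paths() {
    awk '$9 == 404 { print $7 }' "$@" | sort | uniq -c | sort -rn
}

# Made-up log lines, trimmed to the first ten fields, for demonstration only.
broken_paths <<'EOF'
198.51.100.1 - - [12/Mar/2017:10:00:01 +0000] "GET /sfn.php HTTP/1.1" 404 162
198.51.100.2 - - [12/Mar/2017:10:00:02 +0000] "GET /license.php HTTP/1.1" 404 162
198.51.100.3 - - [12/Mar/2017:10:00:03 +0000] "GET /sfn.php HTTP/1.1" 404 162
198.51.100.4 - - [12/Mar/2017:10:00:04 +0000] "GET /sample-page/ HTTP/1.1" 200 5120
EOF
```

Many of the 404 paths in the real output above look like automated probes for vulnerable plugins, so a listing like this is also a quick way to spot scanning activity.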
List All 301 Permanently Moved URLs
We can also list the top ten 301 permanently moved URLs using the command below.
# awk '($9 ~ /301/)' looklinix.com_access.log | awk '{print $7}' | sort | uniq -c | sort -r | head
You will get some output like below:
5 /looklinux/preview.php?title=vo7fw
8 /how-to-run-process-or-program-on-specific-cpu-cores-in-linux
8 /how-to-access-linux-terminal-using-chrome-web-browser
8 /easy-steps-to-clone-your-hard-drive-using-dd
8 /best-5-linux-open-source-text-editors
8 /awstats/awstats.pl?config=looklinux.com
7 /top-5-web-based-linux-monitoring-tools
1 /basic-mysql-commands-database-administrator
1 /awstats-log-analyzer-installation-and-configuration-on-centos-fedora-and-rhel-system/
1 /awstats-log-analyzer-installation-and-configuration-on-centos-fedora-and-rhel-system
1 /author/santosh-prasad/page/2/?ap_ajax_action=search_mentions&%23038;action=ap_ajax
We can also check which source IPs the requests for a particular URL are coming from. The example below does this for requests matching /survey/report/na.
# awk -F\" '($2 ~ "/survey/report/na"){print $1}' looklinix.com_access.log | awk '{print $1}' | sort | uniq -c | sort -r
You will get some output like below:
3 197.210.28.52,107.167.112.38
2 63.139.29.90
2 175.139.178.106
1 99.95.1.42
1 87.112.31.172
1 86.7.38.59
1 86.163.13.111
1 86.156.51.243
1 86.153.19.27
1 86.149.8.56
1 86.139.199.120
1 86.130.71.225
1 86.129.148.242
1 85.255.235.145
1 85.211.50.44
1 84.51.152.254
1 82.40.13.61
1 82.21.137.68
1 82.15.34.126
1 81.157.121.242
1 81.155.254.63
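Notice that the first entry holds two addresses separated by a comma. When a request passes through more than one proxy, X-Forwarded-For carries a list, and by convention the first entry is the original client. If you want to count by that first address only, a sketch along these lines may help. The helper name clients_for_path and the sample lines are hypothetical, and keep in mind that the header is supplied by the client and can be forged.

```shell
# Count requests for URLs matching a pattern, grouped by the first address
# in the X-Forwarded-For field. Usage: clients_for_path PATTERN [FILE...]
clients_for_path() {
    pattern=$1
    shift
    # Split each line on double quotes: $1 is everything before the request
    # line and $2 is the request line itself. The first word of $1 is the
    # X-Forwarded-For value; keep only the part before the first comma.
    awk -F'"' -v pat="$pattern" '
        $2 ~ pat {
            split($1, pre, " ")
            split(pre[1], hops, ",")
            print hops[1]
        }' "$@" | sort | uniq -c | sort -rn
}

# Made-up log lines, trimmed to the first ten fields, for demonstration only.
clients_for_path "/survey/report/na" <<'EOF'
203.0.113.7,10.0.0.5 - - [12/Mar/2017:10:00:01 +0000] "GET /survey/report/na HTTP/1.1" 404 162
203.0.113.7 - - [12/Mar/2017:10:00:02 +0000] "GET /survey/report/na HTTP/1.1" 404 162
198.51.100.9 - - [12/Mar/2017:10:00:03 +0000] "GET /survey/report/na HTTP/1.1" 404 162
198.51.100.9 - - [12/Mar/2017:10:00:04 +0000] "GET / HTTP/1.1" 200 5120
EOF
```

With the sample lines above, the two requests from 203.0.113.7 are counted together even though one of them arrived through an extra proxy.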
I hope this article helps you parse your Nginx logs. If you have any questions or problems, please leave a comment in the comment section.
Thanks:)