
    My Approach to Log Analysis: A Simple Web Log Analysis Script

    Posted by 铁匠 on 2017-02-15 06:23:23

    Preface

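    Long story short, here is how this started: for work I needed to analyze website logs. The server runs Windows, so these are IIS logs. I searched the web and GitHub for a tool and, surprisingly, found nothing, so it looked like I would have to build one myself.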

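    I roughly divide the analysis methods into three kinds:

    1. Time-based: group the request URLs by time period. From the number of URLs and the number of attacks in each period, you can roughly tell which periods show APT-style attacks and which show scanner behavior;

    2. Attacker-IP-based: a normal attack will always leave requests in the log (if you have a 0day, forget I said that, but ordinary probing will still show up =。=!), so you then analyze each IP in turn;

    3. Status-code-based: the status codes of the requests also give a rough idea of the behavior.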
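    As a rough illustration of the time-based and status-code-based views, the sketch below groups parsed requests by hour and by status code. The record layout [ip, time, method, url, status] follows the needlist that url.py builds; the sample records and the is_attack helper are made-up placeholders, not part of the original script.

```python
from collections import Counter, defaultdict

# Hypothetical sample records in the [ip, time, method, url, status] layout.
records = [
    ['10.0.0.1', '08:01:12', 'GET', '/index.asp', '200'],
    ['10.0.0.2', '08:15:40', 'GET', '/admin.bak', '404'],
    ['10.0.0.2', '08:15:41', 'GET', '/news.asp?id=1%20and%201=1', '500'],
    ['10.0.0.3', '09:30:05', 'POST', '/login.asp', '200'],
]

def is_attack(url):
    # Placeholder rule for illustration only; real rules would be WAF-style regexes.
    return '%20and%20' in url or url.endswith('.bak')

def summarize(records):
    """Count total and suspicious requests per hour, and requests per status code."""
    per_hour = defaultdict(lambda: {'total': 0, 'attacks': 0})
    per_status = Counter()
    for ip, time, method, url, status in records:
        hour = time.split(':')[0]
        per_hour[hour]['total'] += 1
        if is_attack(url):
            per_hour[hour]['attacks'] += 1
        per_status[status] += 1
    return dict(per_hour), per_status

per_hour, per_status = summarize(records)
for hour in sorted(per_hour):
    print(hour, per_hour[hour])
print(per_status.most_common())
```

    An hour with many requests but few rule hits looks more like a scanner, while a quiet hour with a few targeted hits deserves a closer manual look.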

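    The rules can be based on open-source WAF rules, or you can analyze scanners and write your own regexes. The open-source WAF rules are at

    https://github.com/loveshell/ngx_lua_waf/tree/master/wafconf.

    For scanner regexes, the database of https://github.com/smarttang/w3a_SOCD has the details, at

    https://github.com/smarttang/w3a_SOC/tree/master/db_sql.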

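    The regexes are inside the SQL statements there. I would like to make this tool more complete, but I have not been learning Python for long and the code is not very Pythonic yet, so I will build it up gradually. At the moment there are three modules: a log classification module named url.py, an attack analysis module attack.py, and an IP geolocation lookup module ipfind.py, plus a main function.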

    Log classification module: url.py

    from datetime import datetime

    dt = datetime.now()
    date = str(dt.date())

    loglist = []     # raw log lines (currently unused)
    iplist = []      # IP statistics
    urllist = []     # URL statistics
    needlist = []    # fields kept for analysis: [ip, time, method, url, status]
    errorlist = []   # lines that do not match the expected format

    # Runs at import time: main.py imports this module and reads url.needlist.
    rizhi = str(input('Enter the name of the log file to analyze: '))

    def find_log():
        print('>>>>>>> Parsing log')
        with open(rizhi, 'r', encoding='UTF-8', errors='ignore') as f:
            for i in f:
                if i[0] != '#':    # skip header lines
                    b = i.rstrip('\r\n').split(' ')
                    # Field positions depend on which fields the server logs.
                    # A line with too few fields goes to errorlist instead of crashing.
                    try:
                        needlist.append([b[10], b[1], b[5], b[6], b[15]])
                        iplist.append(b[10])
                        urllist.append(b[6])
                    except IndexError:
                        errorlist.append(i)
        print('>>>>>>> Log parsing finished')

    def count(iplist, urllist):    # count visits per IP and per URL
        print('>>>>>>> Counting URL and IP visits')
        ipdict, urldict = {}, {}
        for i in set(iplist):
            ipdict[i] = iplist.count(i)
        for i in set(urllist):
            urldict[i] = urllist.count(i)

        # Sort the (value, count) pairs by count, highest first.
        iplist = sorted(ipdict.items(), key=lambda d: d[1], reverse=True)
        urllist = sorted(urldict.items(), key=lambda d: d[1], reverse=True)
        print('>>>>> URL and IP counting finished.......')

        return [iplist, urllist]

    def save_count():
        print('>>>>>>> Saving analysis results')
        ipname = 'ip-' + date + '.txt'
        urlname = 'url-' + date + '.txt'
        with open(ipname, 'w', encoding='UTF-8') as f:
            for i in iplist:
                f.write(str(list(i)) + '\n')
        with open(urlname, 'w', encoding='UTF-8') as f:
            for i in urllist:
                f.write(str(list(i)) + '\n')
        print('>>>>>>> Analysis results saved')

    find_log()
    [iplist, urllist] = count(iplist, urllist)
    save_count()

    IIS logs and Apache logs seem much the same to me; you only need to change how each line is split.

    An IIS log looks roughly like this; reading it with Python's readlines and splitting each line is enough.
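    IIS writes W3C extended logs, which normally carry a #Fields: header naming the columns. The sketch below looks up the wanted columns by name instead of by fixed index, so the same code copes with different field selections. The field names are the standard W3C ones; the sample lines and the parse_w3c helper are invented for illustration and are not part of the original script.

```python
WANTED = ['c-ip', 'time', 'cs-method', 'cs-uri-stem', 'sc-status']

def parse_w3c(lines):
    """Yield [ip, time, method, url, status] using the #Fields: header."""
    positions = None
    for line in lines:
        line = line.strip()
        if line.startswith('#Fields:'):
            names = line[len('#Fields:'):].split()
            # Raises ValueError if a wanted field is not being logged.
            positions = [names.index(name) for name in WANTED]
            continue
        if not line or line.startswith('#') or positions is None:
            continue
        cols = line.split(' ')
        if len(cols) > max(positions):   # skip truncated lines
            yield [cols[p] for p in positions]

# Made-up sample input.
sample = [
    '#Software: Microsoft Internet Information Services 7.5',
    '#Fields: date time cs-method cs-uri-stem c-ip sc-status',
    '2017-02-14 08:01:12 GET /index.asp 10.0.0.1 200',
    'broken line',
]
rows = list(parse_w3c(sample))
print(rows)
```

    The rows come out in the same [ip, time, method, url, status] order that the rest of the script expects.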

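    In url.py I added a feature that sorts and writes out the visit counts per IP and per URL, so it is a bit slow =.= (I'm self-taught and don't know much about algorithms). Beyond that, it just puts the address, time, IP, and status code of each request into one list.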
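    Most of the slowness likely comes from calling list.count() once per distinct value, which rescans the whole list every time. One possible alternative, shown here only as a sketch, is collections.Counter from the standard library, which tallies everything in a single pass. The count_visits name and the sample lists are made up for the example.

```python
from collections import Counter

def count_visits(iplist, urllist):
    """Return (value, count) pairs for IPs and URLs, most frequent first."""
    ip_counts = Counter(iplist).most_common()
    url_counts = Counter(urllist).most_common()
    return ip_counts, url_counts

# Made-up sample data.
ips = ['10.0.0.1', '10.0.0.2', '10.0.0.1']
urls = ['/a', '/b', '/a', '/a']
ip_counts, url_counts = count_visits(ips, urls)
print(ip_counts)
print(url_counts)
```

    The result has the same shape as the sorted lists that count() returns, so save_count() could write it out unchanged.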

    Attack analysis module: attack.py

    import re

    sqllist, xsslist, senlist = [], [], []
    otherurl, xssip, sqlip, senip = [], [], [], []
    feifa = []    # IPs that sent requests with an unexpected method

    def find_attack(needlist):
        print('>>>>>>> Starting attack detection')
        # Detection rules: SQL injection / code execution, XSS, sensitive paths.
        sql = r'product.php|preg_\w+|execute|echo|print|print_r|var_dump|(fp)open|^eval$|file_get_contents|include|require|require_once|shell_exec|phpinfo|system|passthru|\(?:define|base64_decode\(|group\s+by.+\(|%20or%20|%20and%20|sleep|delay|nvarchar|exec|union|^select$|version|insert|information_schema|chr\(|concat|%bf|sleep\((\s*)(\d*)(\s*)\)|current|having|database'
        xss = r'alert|^script$|<|>|%3E|%3c|&#x3E|\u003c|\u003e|&#x'
        sen = r'\.{2,}|%2e{2,}|%252e{2,}|%uff0e{2,}0x2e{2,}|\./|\{FILE\}|%00+|json|\.shtml|\.pl|\.sh|\.do|\.action|zabbix|phpinfo|/var/|/opt/|/local/|/etc|/apache/|\.log|invest\b|\.xml|apple-touch-icon-152x152|\.zip|\.rar|\.asp\b|\.php|\.bak|\.tar\.gz|\bphpmyadmin\b|admin|\.exe|\.7z|\.zip|\battachments\b|\bupimg\b|uploadfiles|templets|template|data\b|forumdata|includes|cache|jmxinvokerservlet|vhost|bbs|host|wwwroot|\bsite\b|root|hytop|flashfxp|bak|old|mdb|sql|backup|^java$|class'

        for i in needlist:    # i = [ip, time, method, url, status]
            if i[2] not in ('POST', 'HEAD', 'GET'):
                feifa.append(i[0])
                continue
            # Check SQL injection first, then XSS, then sensitive paths.
            response = re.findall(sql, i[3], re.I)
            if response:
                sqllist.append(i)
                sqlip.append(i[0])
                print(response)
                print('SQL injection attack detected')
                print(i)
                continue
            responsexss = re.findall(xss, i[3], re.I)
            if responsexss:
                xsslist.append(i)
                xssip.append(i[0])
                print(responsexss)
                print('XSS attack detected')
                print(i)
                continue
            responsesen = re.findall(sen, i[3], re.I)
            if responsesen:
                senlist.append(i)
                senip.append(i[0])
                print(responsesen)
                print('Sensitive directory scan detected')
                print(i)
            else:
                otherurl.append(i)
        print('Illegal requests: ' + str(len(feifa)) + ' requests from ' + str(len(set(feifa))) + ' IPs')
        print('>>>>>>> Attack detection finished')
        return [xssip, sqlip, senip, sqllist, xsslist, senlist, otherurl]

    This one is much simpler. The regexes used for the analysis are not very complete, and many of them are tailored to my own company's situation, so experts, please go easy on me. When the scan finishes, the function returns the IPs and URLs.

    IP geolocation lookup module: ipfind.py

    ipfind.py looks up the geographic location of an IP.

    import re
    import urllib.request

    def url_open(ip):
        url = 'http://www.ip138.com/ips138.asp?ip=' + ip
        response = urllib.request.urlopen(url)
        html = response.read().decode('gb2312', errors='ignore')
        return html


    def find_ip(html):
        # '本站数据' ("this site's data") is the label the page puts before the location.
        p = re.compile(r'本站数据.{20,}</li>', re.I)
        response = p.findall(html)
        if not response:    # nothing matched, e.g. the page layout changed
            return 'unknown'
        parts = re.split(r'</li><li>', response[-1])
        if len(parts) < 3:
            return 'unknown'
        # The slices strip the labels and the trailing </li> around each value.
        ipaddrs = str(parts[0][5:]) + ',' + str(parts[1][6:]) + ',' + str(parts[2][6:-5])
        return ipaddrs


    def find_ipaddrs(ip):
        html = url_open(ip)
        ipaddrs = find_ip(html)

        print(ip + ' : ' + ipaddrs)

    This one is simple: I wrote it like a crawler against the ip138 web page. (I could not find an API for it, and my attempts to register with Baidu failed several times; if you have access to an API, use that instead.)

    Main function: main.py

    import url       # importing url prompts for the log file and parses it
    import attack
    import ipfind

    needlist = url.needlist

    [xssip, sqlip, senip, sqllist, xsslist, senlist, otherurl] = attack.find_attack(needlist)
    # De-duplicate the attacking IPs of each category.
    xssip = list(set(xssip))
    sqlip = list(set(sqlip))
    senip = list(set(senip))

    print('>>>>>>> Detected ' + str(len(xsslist)) + ' XSS attacks from ' + str(len(xssip)) + ' IPs in total')
    print(xssip)
    print('>>>>>>> Detected ' + str(len(sqllist)) + ' SQL attacks from ' + str(len(sqlip)) + ' IPs in total')
    print(sqlip)
    print('>>>>>>> Detected ' + str(len(senlist)) + ' sensitive directory scans from ' + str(len(senip)) + ' IPs in total')
    print(senip)
    iplist = list(set(xssip + sqlip + senip))
    print(len(iplist))

    print('Looking up IP geolocation')
    for i in iplist:
        ipfind.find_ipaddrs(str(i))

    To analyze a log, just put the log file in the same directory as main.py.

    Summary

    That covers the script in broad strokes. Now for its shortcomings and how I actually do the analysis.

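    Practical shortcomings: I have been reading logs for about three months now. The most serious problem first: the data of POST requests is invisible. The log itself does not record the request body, and I have no idea what logs will look like once HTTPS is widespread; if you have the ability, it is best to integrate the analysis with a WAF. Next, like a WAF, the detection relies on regexes and keywords, many of which can be bypassed. Still, as long as the attacker has not deleted the logs, an attack is bound to leave traces, so you can start from a detected IP and look at the specific URLs it attacked. Unknown threats are different: with a 0day, for example, the WAF is useless and log analysis shows nothing, so you can only rely on incident response and server alerts.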

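    Many attack types are still missing. Later I plan to move the attack-type checks into functions; chaining if/else works with a few types but gets messy once there are many. I also want to collect and check User-Agent values (although most scanners can change their User-Agent).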
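    One way to keep this manageable as the number of attack types grows is a rule table walked in a loop, so adding a type means adding one entry rather than another if/else level. This is only a sketch: RULES and classify are hypothetical names, and the patterns are heavily shortened stand-ins for the real rules.

```python
import re

# Hypothetical, shortened rules; checked in order, first match wins.
RULES = [
    ('sql', re.compile(r'union|information_schema|%20and%20', re.I)),
    ('xss', re.compile(r'alert|<|>|%3c|%3e', re.I)),
    ('sensitive', re.compile(r'\.bak|\.zip|phpmyadmin', re.I)),
]

def classify(url):
    """Return the name of the first rule that matches, or 'other'."""
    for name, pattern in RULES:
        if pattern.search(url):
            return name
    return 'other'

# Made-up sample URLs.
for u in ['/news.asp?id=1%20union%20select', '/q?s=%3Cscript%3E',
          '/site.bak', '/index.asp']:
    print(u, '->', classify(u))
```

    The list order preserves the priority of the nested checks in attack.py (SQL first, then XSS, then sensitive paths). A User-Agent check could follow the same pattern with its own table applied to the User-Agent field.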

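    Analysis in practice: I run the script once and then go through the results IP by IP, which is more convenient. What is missing here is detection of automated traffic, so I wrote a simple separate tool for that. It is easy to implement: put all log entries into a list by time, URL, and IP, and count the entries that share the same time and the same IP. Mine detects SMS bombing, and I will keep improving it. If I get good enough, I would like to combine it with Django and give it a graphical interface; a script is still just a script, and it never sounds very impressive.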
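    The author's detection tool is not shown, so the following is only a guess at the idea described above: count requests that share the same IP and the same timestamp, and report the combinations that reach a threshold. The find_bursts name, the threshold of 3, the /sms/send path, and the sample records are all assumptions made for this sketch.

```python
from collections import Counter

def find_bursts(records, threshold=3):
    """Return {(ip, time): hits} for IPs with at least `threshold` requests in one second."""
    hits = Counter((ip, time) for ip, time, method, url, status in records)
    return {key: n for key, n in hits.items() if n >= threshold}

# Made-up records in the [ip, time, method, url, status] layout.
records = [
    ['10.0.0.9', '08:15:41', 'POST', '/sms/send', '200'],
    ['10.0.0.9', '08:15:41', 'POST', '/sms/send', '200'],
    ['10.0.0.9', '08:15:41', 'POST', '/sms/send', '200'],
    ['10.0.0.1', '08:15:41', 'GET', '/index.asp', '200'],
]
bursts = find_bursts(records)
print(bursts)
```

    To focus on one endpoint, such as an SMS interface, the URL could be added to the key or the records filtered by URL before counting.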

    The results look like this.

    The specific rules still need refinement.


