
[Apache] Blocking Bad Bots (Denying Access by User-Agent)

Server
    💡 How to block bad bots on Apache Web Server
    The Apache conf directory and file layout here consists of multiple virtual hosts, with ProxyPass configured as a reverse proxy in front of Tomcat.

     

    Bad bot list

    I picked up a bad bot list that circulates on the internet. Download it from the bad bot list link in the References section 🙌

    Create the bad bot list at a suitable path that the configuration file can read. Here it is created at /etc/apache2/sites-available/extra/bad_bot.conf.

     

    Contents of /etc/apache2/sites-available/extra/bad_bot.conf
    
    # SetEnvIfNoCase User-Agent ^$ bad_bot
    SetEnvIfNoCase User-Agent "^MJ12bot" bad_bot
    SetEnvIfNoCase User-Agent "^MJ12bot/v1.4.5" bad_bot
     
     
    # Bad bots...
    # Notes go on their own lines: Apache does not allow comments after a directive.
    # 181203
    SetEnvIfNoCase User-Agent "SemrushBot" bad_bot
    SetEnvIfNoCase User-Agent "SemrushBot-SA" bad_bot
    # 181210
    SetEnvIfNoCase User-Agent "DomainCrawler" bad_bot
    ...
    The file extension does not have to be .conf.
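Before reloading Apache, it can help to see which User-Agent strings the list would actually catch. The following is only a rough sketch: `list_patterns` and `ua_matches` are hypothetical helper names, the parsing assumes each rule is written as `SetEnvIfNoCase User-Agent "pattern" bad_bot` with the pattern in double quotes, and `grep -E` only approximates the regex engine Apache uses, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a bad_bot.conf-style file.

# Print the quoted pattern of every active (non-commented) rule.
list_patterns() {
    sed -n 's/^[[:space:]]*SetEnvIfNoCase[[:space:]]\{1,\}User-Agent[[:space:]]\{1,\}"\([^"]*\)".*/\1/p' "$1"
}

# Return 0 if the User-Agent ($2) matches any pattern in the file ($1).
ua_matches() {
    patterns=$(list_patterns "$1")
    # An empty pattern list would match everything, so bail out early.
    [ -n "$patterns" ] || return 1
    # grep treats a newline-separated pattern list as alternatives;
    # -i mirrors the case-insensitivity of SetEnvIfNoCase.
    printf '%s\n' "$2" | grep -Eiq -e "$patterns"
}

# Example:
# ua_matches /etc/apache2/sites-available/extra/bad_bot.conf "SemrushBot" && echo "would be blocked"
```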

     

    Blocking everything in one place

    With VirtualHost in use, the configuration was split across several conf files. Rather than configuring User-Agent blocking in every VirtualHost, this section shows how to configure it once in the top-level configuration file.

    I use /etc/apache2/sites-available/000-default.conf, which exists by default in an Apache installation, as the top-level configuration file.

    Contents of /etc/apache2/sites-available/000-default.conf
    
    <Location />
        # The bad bot list created earlier
        Include /etc/apache2/sites-available/extra/bad_bot.conf
        # Apache 2.2-style access control; on Apache 2.4 these directives
        # are provided by mod_access_compat.
        Order Allow,Deny
        Allow from all
        Deny from env=bad_bot
    </Location>

     

    How to check that a specific User-Agent is blocked

    $ curl -I -A "SlackBot" http://www.test.com
    $ curl -H "User-Agent: SemrushBot" https://www.test.com

     

    Output

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html><head>
    <title>403 Forbidden</title>
    </head><body>
    <h1>Forbidden</h1>
    <p>You don't have permission to access this resource.</p>
    <hr>
    <address>Apache/2.4.41 (Ubuntu) Server at www.test.com Port 443</address>
    </body></html>
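To check several User-Agents in one go, printing only the status code is easier to read than the full error page. This is a minimal sketch, not part of the original setup: `status_for_ua` and `verdict` are hypothetical helper names, and `www.test.com` is the same placeholder host as in the commands above.

```shell
#!/bin/sh
# Print only the HTTP status code returned for one User-Agent.
# -s hides progress output, -o /dev/null discards the body,
# -A sets the User-Agent, -w prints the status code after the transfer.
status_for_ua() {
    curl -s -o /dev/null -A "$2" -w '%{http_code}' "$1"
}

# Turn a status code into a short verdict; 403 means the bot rule fired.
verdict() {
    case "$1" in
        403) echo "blocked" ;;
        000) echo "no response" ;;   # curl reports 000 when no HTTP response arrived
        *)   echo "allowed ($1)" ;;
    esac
}

# Example:
# for ua in "SemrushBot" "DomainCrawler" "Mozilla/5.0"; do
#     printf '%-15s %s\n' "$ua" "$(verdict "$(status_for_ua https://www.test.com "$ua")")"
# done
```

An ordinary browser User-Agent is worth including in the loop so that an overly broad pattern shows up immediately.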

     

    If the server returns HTTP status 403 like this, the block is working. The request still reaches Apache and is recorded in the access log, but because it is rejected as unauthorized, the traffic for loading page resources is cut off. This noticeably reduces the unnecessary, excessive traffic caused by bad bots! 👍👍👍👍

     

     

    References

    How to set up Apache bad_bot.conf: https://hoing.io/archives/398

    Configuring with Location in an Apache conf file: https://gromet.tistory.com/463

    Bad bot list: https://hoing.io/storage/1/9890057497.txt

     

     
