Blocking Image Hotlinking, Leeching and Evil Sploggers with IIS Url Rewrite
I recently discovered that a blog called (seriously) "Google Chrome Browser" was reblogging my site. (It of course has NO relationship to Google or the lovely folks on the Chrome team.)
This is a splog or "spam blog." It's less of a blog and more of a 'suck your feed in and reblog it.' Basically every post is duplicated or sucked in via RSS from somewhere else. I get this many times a week and have for years.
However, this particular site started showing up ahead of mine in searches and that's not cool.
Worse yet, they have almost 25k followers on Twitter. I've asked them a few times to stop doing this, but this time I got tired of it.
They're even 'hotlinking' my images, which means that all my PNGs are still hosted on my site. When you visit their site, the text is from my RSS but I pay for the image bandwidth. The irony of this is thick. Not to mention my copyright notice is intact on their site. ;)
When an image is requested by a page on another domain, the browser populates the Referer HTTP header (exposed to the server as HTTP_REFERER) with the address of the page that embeds the image. That means when my web server gets a request for 'foo.png' from the Google Chrome Browser blog I can see the page that asked for that image.
For example:
Request URL: http://www.hanselman.com/blog/content/binary/Windows-Live-Writer/How-to-run-a-Virtual-Conference-for-10_E53C/image_5.png
Request Method: GET
Referer: http://google-chrome-browser.com/penny-pinching-cloud-how-run-two-day-virtual-conference-10
Because the Referer distinguishes this GET request from one made by my own pages, I can do something about it. This brings up a few important things to remember in general about the web that I feel a lot of programmers forget about:
- The Internet is not a black box.
- You can do something about it.
That said, I want to detect these requests and serve a different image.
If I were using Apache and had an .htaccess file, I might do this:
RewriteEngine On
# match the Referer of each evil spammer; every condition but the last gets [OR]
RewriteCond %{HTTP_REFERER} ^https?://(?:www\.)?computersblogsexample\.info [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(?:www\.)?google-chrome-browser\.com [NC]
# don't rewrite the replacement image itself
RewriteCond %{REQUEST_URI} !/images/splog\.png$
RewriteRule \.(?:gif|jpg|png)$ /images/splog.png [NC,L]
Since I'm using IIS, I'll do similar rewrites in my web.config. I could do a whitelist where I only allow hotlinking from a few places, or a blacklist where I only block a few folks. Here's a blacklist.
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Blacklist block" stopProcessing="true">
        <!-- only look at requests for image files -->
        <match url="\.(?:jpg|jpeg|png|gif|bmp)$" />
        <conditions>
          <!-- capture the referring host as {C:1} -->
          <add input="{HTTP_REFERER}" pattern="^https?://(.+?)/.*$" />
          <!-- look that host up in the map; unknown hosts get "allow" -->
          <add input="{DomainsBlackList:{C:1}}" pattern="^block$" />
          <!-- never redirect the replacement image itself -->
          <add input="{REQUEST_FILENAME}" pattern="splog\.png$" negate="true" />
        </conditions>
        <action type="Redirect" url="http://www.hanselman.com/images/splog.png" appendQueryString="false" redirectType="Temporary" />
      </rule>
    </rules>
    <rewriteMaps>
      <rewriteMap name="DomainsBlackList" defaultValue="allow">
        <add key="google-chrome-browser.com" value="block" />
        <add key="www.verybadguy.com" value="block" />
        <add key="www.superbadguy.com" value="block" />
      </rewriteMap>
    </rewriteMaps>
  </rewrite>
</system.webServer>
I could have just made a single rule and put this bad domain in it, but it would have only worked for one domain. Instead, my buddy Ruslan suggested that I make a rewrite map and refer to it from the rule. This way I can add more domains to block as the evil spreads.
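For comparison, here's roughly what that single-rule version might have looked like. This is a sketch, not something I deployed: the rule name is made up, and the domain is baked right into the Referer pattern, so every new splogger would mean another copy of the rule.

```xml
<!-- Sketch only: a one-domain version of the rule, with no rewrite map. -->
<!-- The rule name is hypothetical; it would sit inside <rules> in place of "Blacklist block". -->
<rule name="Block one splog" stopProcessing="true">
  <match url="\.(?:jpg|jpeg|png|gif|bmp)$" />
  <conditions>
    <!-- the bad domain is hard-coded in the pattern -->
    <add input="{HTTP_REFERER}" pattern="^https?://(?:www\.)?google-chrome-browser\.com/" />
    <!-- still skip the replacement image to avoid a redirect loop -->
    <add input="{REQUEST_FILENAME}" pattern="splog\.png$" negate="true" />
  </conditions>
  <action type="Redirect" url="http://www.hanselman.com/images/splog.png" appendQueryString="false" redirectType="Temporary" />
</rule>
```

With the map, the rule itself never changes and blocking another domain is a one-line addition.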
It was important to exclude the splog.png file that I am going to redirect the bad guy to, otherwise I'll get into a redirect loop where I redirect requests for the splog.png back to itself!
The result is effective. If you visit their site, I'll issue an HTTP 307 (Temporary Redirect) and then you'll see my splog.png image everywhere that they've hotlinked my image.
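Serving a replacement image isn't the only option. If you'd rather refuse the request outright, the same conditions could feed a different action. A minimal sketch, assuming the DomainsBlackList map above is in place; the rule name and the status text are placeholders of mine, and this is not the rule I'm using above:

```xml
<!-- Sketch only: refuse blacklisted hotlink requests instead of redirecting them. -->
<rule name="Blacklist refuse" stopProcessing="true">
  <match url="\.(?:jpg|jpeg|png|gif|bmp)$" />
  <conditions>
    <add input="{HTTP_REFERER}" pattern="^https?://(.+?)/.*$" />
    <add input="{DomainsBlackList:{C:1}}" pattern="^block$" />
  </conditions>
  <!-- answer with a 403; no splog.png exclusion is needed because nothing is redirected -->
  <action type="CustomResponse" statusCode="403" statusReason="Forbidden" statusDescription="Hotlinking is not allowed" />
</rule>
```

The splog would just show broken images in that case, which saves a round trip but loses the chance to tell their readers where the content came from.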
If you wanted to change the blacklist to a whitelist, you'd reverse the values of allow and block in the rewrite map and list the domains you trust (including your own) instead of the bad guys:
<rewriteMaps>
  <!-- the map keeps its name so the rule above still refers to it -->
  <rewriteMap name="DomainsBlackList" defaultValue="block">
    <add key="www.hanselman.com" value="allow" />
    <!-- placeholder for any other site you trust -->
    <add key="www.goodguy.example" value="allow" />
  </rewriteMap>
</rewriteMaps>
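One thing to keep in mind with either list: a request that arrives with no Referer at all never matches the first condition (the pattern needs an http:// or https:// prefix), so the rule is skipped and the image is served normally. That's usually what you want, since feed readers, privacy tools and people opening an image directly often send no Referer. If you did want to cover that case, a separate rule could do it. This is a hedged sketch with a made-up rule name, not part of my config:

```xml
<!-- Sketch only: treat requests with a missing or empty Referer like blocked ones. -->
<rule name="Empty referer block" stopProcessing="true">
  <match url="\.(?:jpg|jpeg|png|gif|bmp)$" />
  <conditions>
    <!-- matches only when the Referer is missing or empty -->
    <add input="{HTTP_REFERER}" pattern="^$" />
    <!-- skip the replacement image to avoid a redirect loop -->
    <add input="{REQUEST_FILENAME}" pattern="splog\.png$" negate="true" />
  </conditions>
  <action type="Redirect" url="http://www.hanselman.com/images/splog.png" appendQueryString="false" redirectType="Temporary" />
</rule>
```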
Nice, simple and clean. I don't plan on playing "whac-a-mole" with sploggers as it's a losing game, but I will bring down the ban-hammer on particularly obnoxious examples of content theft, especially when they mess with my Google Juice.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
But I would not have been so nice in that image with your message.
Tell them what you really think!! LOL
--Ron
I had the same problem several years ago, but I used an HttpModule with a wildcard (IIS6).
"Block that..." doesn't punish the offender in any way, shape, or form - and runs the risk of ruining the experience of people who actually visit your site.
Here's the issue. Anyone can link to your content. There is no way to prevent that. But you can display a message saying "hey, don't do this."
But -- the thing is, they also can go to no-ip and come out with a new DNS. On Blogger or similar platforms, they can sign up for a new blog in minutes.
So I'd go with a white list, instead of a black list.
Maybe you can convert a few of the honest folks who stumble on the pirated material if you make it easy for them to find you?
Needless to say the racing organization was very embarrassed, but he almost got arrested on hacking charges (in the US, he most likely would have)!
http://jurassicpark.wikia.com/wiki/File:YouDidn'tSayTheMagicWord.gif
So I searched for:
"ya buddy"
I got three hits - all yours ;).. and your web.config is working.
I am wondering, would it also make sense to add a delay when serving the "splog.png" image to make the browsing experience on these spam blog sites even more sucky?
Back before the days of image-sharing sites like Imgur, I discovered that a user of mine was uploading images to my forum and then using it as a host for the images in other, much higher traffic forums. This was eating a good chunk of my bandwidth and resources.
My solution was a similar trick to serve up a replacement image for requests that originated from one of those forums (as a blacklist), and since I happened to have a picture of the user, the replacement image was the goatse picture censored with his face, as well as a note about how it's not nice to steal someone else's bandwidth.
Meanwhile, users of my own forum (and elsewhere) still got to see his original images.
In my case above, the person doing the hotlinking was somewhat known to his audience, since it was a forum environment. My use of a semi-evil image was more of a vindictive move to tarnish his reputation on those forums as punishment for using my server as an image host without my consent (to the point where legitimate requests to my server were seeing noticeable delays).
Your message is clear, but do you think the owner of google-chrome-browser.com will be bothered by, mind, or even be aware of this kind of stuff?
What are you going to do then? I am thinking legal action.
We could serve such files through a handler or module, check the Referer in the handler, and respond with the right image.
Is there any simpler process than this?
Thanks,
Pavan
I think Scott is one of the few remaining people who are not obsessed with copyright issues. We go crazy when anyone copies a single word from us and start screaming "take legal action, call a lawyer," even when no real harm is done to us.
Nothing here requires legal action. In the end, this website is marketing his work, since they are not publishing it under another name, so why would he bother? I am usually happy when other websites publish my blog.
But I wouldn't be so kind w.r.t. the image... Some HD offensive porn is appropriate in this case.
Thank you for the post. I was able to use it for my e-commerce class regarding search engine spam and splogging. I recalled your effective methods for combating such methods. Great posts, keep up the great work.
Thanks,
Dave A.
Nice. Congrats. One question.
I have a case where I don't want to serve my files from my website.
Let's say I have this:
media.domain.com
I want to be able to serve only to domain.com (the Referer would be domain.com).
Since I am requesting the file from media.domain.com, it doesn't have a Referer, so that situation bypasses your code.
I changed the pattern to (^$|^https?://(.+?)/.*$) so it matches whether the Referer is empty or has something in it.
The thing is that C:1 is then either null or empty.
I added a key="" value="block" entry to my blacklist, but it doesn't seem to pick it up.
How could I solve this issue?
Thanks