Scott Hanselman

Blocking Image Hotlinking, Leeching and Evil Sploggers with IIS Url Rewrite

June 03, 2013 Posted in IIS

I recently discovered that a blog called (seriously) "Google Chrome Browser" was reblogging my site. (It of course has NO relationship to Google or the lovely folks on the Chrome team.)

This is a splog or "spam blog." It's less of a blog and more of a 'suck your feed in and reblog it' operation. Basically every post is duplicated or sucked in via RSS from somewhere else. I get this many times a week and have for years.

However, this particular site started showing up ahead of mine in searches and that's not cool.

You evil bastards.

Worse yet, they have almost 25k followers on Twitter. I've asked them a few times to stop doing this, but this time I got tired of it.

They're even 'hotlinking' my images, which means that all my PNGs are still hosted on my site. When you visit their site, the text is from my RSS but I pay for the image bandwidth. The irony of this is thick. Not to mention my copyright notice is intact on their site. ;)

When an image is linked to from another domain, the browser populates the Referer request header (which the server exposes as the HTTP_REFERER variable) with the address of the page that links to the image. That means when my web server gets a request for 'foo.png' from the Google Chrome Browser blog I can see the page that asked for that image.

For example:

Request URL:http://www.hanselman.com/blog/content/binary/Windows-Live-Writer/How-to-run-a-Virtual-Conference-for-10_E53C/image_5.png
Request Method:GET
Referer:http://google-chrome-browser.com/penny-pinching-cloud-how-run-two-day-virtual-conference-10
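As a rough illustration of what the server can learn from that header, here is a small Python sketch. It is not anything IIS runs; the function names `referring_host` and `is_hotlink` and the `own_hosts` default are assumptions made up for this example.

```python
from urllib.parse import urlsplit


def referring_host(referer):
    """Return the host name found in a Referer value, or None if there is none."""
    if not referer:
        return None
    # urlsplit(...).hostname is already lowercased; it is None when no host is present.
    return urlsplit(referer).hostname


def is_hotlink(referer, own_hosts=("www.hanselman.com",)):
    """True when the request names a referring page on some other host.

    own_hosts is an assumption of this sketch: the hosts that count as "my site".
    A missing Referer is treated as "not a hotlink" because nothing can be inferred.
    """
    host = referring_host(referer)
    return host is not None and host not in own_hosts
```

Fed the Referer from the request above, `referring_host` returns the splog's host name, and `is_hotlink` reports that the image was requested from a page on someone else's site.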

Because the Referer differentiates this GET request from an ordinary one, I can do something about it. This brings up a few important things to remember in general about the web that I feel a lot of programmers forget about:

That said, I want to detect these requests and serve a different image.

If I was using Apache and had an .htaccess file, I might do this:

RewriteEngine On

# Add one RewriteCond per evil spammer, chaining all but the last with [OR]
RewriteCond %{HTTP_REFERER} ^https?://(?:www\.)?computersblogsexample\.info [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://(?:www\.)?google-chrome-browser\.com [NC]

# Don't rewrite the replacement image itself
RewriteCond %{REQUEST_URI} !/images/splog\.png$ [NC]
RewriteRule \.(?:gif|jpg|png)$ /images/splog.png [NC,L]

Since I'm using IIS, I'll do similar rewrites in my web.config. I could do a whitelist where I only allow hotlinking from a few places, or a blacklist where I only block a few folks. Here's a blacklist.

<system.webServer>
  <rewrite>
    <rules>
      <rule name="Blacklist block" stopProcessing="true">
        <!-- Only look at requests for image files -->
        <match url="\.(?:jpg|jpeg|png|gif|bmp)$" />
        <conditions>
          <!-- Capture the referring host as {C:1} -->
          <add input="{HTTP_REFERER}" pattern="^https?://(.+?)/.*$" />
          <!-- Look that host up in the rewrite map below -->
          <add input="{DomainsBlackList:{C:1}}" pattern="^block$" />
          <!-- Never redirect the replacement image itself -->
          <add input="{REQUEST_FILENAME}" pattern="splog\.png" negate="true" />
        </conditions>
        <action type="Redirect" url="http://www.hanselman.com/images/splog.png" appendQueryString="false" redirectType="Temporary"/>
      </rule>
    </rules>
    <rewriteMaps>
      <rewriteMap name="DomainsBlackList" defaultValue="allow">
        <add key="google-chrome-browser.com" value="block" />
        <add key="www.verybadguy.com" value="block" />
        <add key="www.superbadguy.com" value="block" />
      </rewriteMap>
    </rewriteMaps>
  </rewrite>
</system.webServer>

I could have just made a single rule and put this bad domain in it, but that would have only worked for one domain. Instead, my buddy Ruslan suggested that I make a rewrite map and refer to it from the rule. This way I can add more domains to block as the evil spreads.

It was important to exclude the splog.png file that I am going to redirect the bad guy to, otherwise I'll get into a redirect loop where I redirect requests for the splog.png back to itself!
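Read as ordinary code, the rule comes down to an extension check, a Referer match, a map lookup, and the splog.png exclusion. The Python below is a rough model of that logic for illustration only, not a description of how IIS evaluates rules internally. The function name `evaluate` is made up for this sketch, it checks the URL path where the real rule checks `{REQUEST_FILENAME}`, and lowercasing the host before the lookup is an assumption of the sketch.

```python
import re

SPLOG_IMAGE = "http://www.hanselman.com/images/splog.png"

# Mirrors the rewrite map: listed hosts map to "block", anything else gets the default.
DOMAINS_BLACKLIST = {
    "google-chrome-browser.com": "block",
    "www.verybadguy.com": "block",
    "www.superbadguy.com": "block",
}

IMAGE_URL = re.compile(r"(?:jpg|jpeg|png|gif|bmp)$", re.IGNORECASE)
REFERER = re.compile(r"^https?://(.+?)/.*$", re.IGNORECASE)
SPLOG_FILE = re.compile(r"splog\.png", re.IGNORECASE)


def evaluate(path, referer, default="allow"):
    """Return the redirect target, or None when the request is served normally."""
    if not IMAGE_URL.search(path):
        return None  # not an image request, the rule does not apply
    match = REFERER.match(referer or "")
    if match is None:
        return None  # empty or unparseable Referer, first condition fails
    host = match.group(1).lower()  # lowercasing is an assumption of this sketch
    if DOMAINS_BLACKLIST.get(host, default) != "block":
        return None  # host is not on the blacklist
    if SPLOG_FILE.search(path):
        return None  # without this check the redirect would chase itself
    return SPLOG_IMAGE
```

The last check is the one that matters for the loop: a page on a blocked site that follows the redirect asks for splog.png with the same Referer, so without the exclusion that request would be redirected again.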

The result is effective. If you visit their site, I'll issue an HTTP 307 (Temporary Redirect) and then you'll see my splog.png image everywhere that they've hotlinked my image.
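For anyone who wants to check a rule like this without visiting the offending site, one option is to send a request with a forged Referer and look at the status code. The following is a minimal Python sketch using only the standard library. The names `NoRedirect`, `build_request` and `probe` are hypothetical, and the sketch assumes the server is reachable and the rule is currently active.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx response stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError for the 3xx


def build_request(image_url, referer):
    """Build a GET request that claims to come from the given referring page."""
    return urllib.request.Request(image_url, headers={"Referer": referer})


def probe(image_url, referer, timeout=10):
    """Send one request and return (status code, Location header or None)."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(image_url, referer), timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, a 307 arrives here as an HTTPError.
        return err.code, err.headers.get("Location")
```

Calling `probe` with an image URL on the protected site and a referring page on a blacklisted domain should come back with a 307 and the splog.png location while the rule is in place. The same call with a Referer from a domain that is not on the list should return the image normally.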

Not cool, splogger, not cool.

If you wanted to change the blacklist to a whitelist, you'd reverse the values of allow and block in the rewrite map and list the domains you trust instead of the ones you're blocking:

<rewriteMaps>
  <rewriteMap name="DomainsBlackList" defaultValue="block">
    <!-- Keep the name the rule refers to; list every host allowed to embed your images -->
    <add key="www.hanselman.com" value="allow" />
  </rewriteMap>
</rewriteMaps>
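The difference between the two modes is just which value the map falls back to for a host it doesn't know. Here is a small Python sketch of that lookup, with a hypothetical `blocked` helper and `example.org` standing in for an unknown site. The whitelist contents are an assumption of the sketch: a whitelist holds the hosts that may embed the images, your own site included.

```python
import re

# Same Referer pattern as the first condition of the rule.
REFERER = re.compile(r"^https?://(.+?)/.*$", re.IGNORECASE)


def blocked(referer, entries, default):
    """Return True when the map lookup for the referring host yields "block"."""
    match = REFERER.match(referer or "")
    if match is None:
        # An empty or malformed Referer never matches the first condition,
        # so the rule does not fire in either mode.
        return False
    return entries.get(match.group(1).lower(), default) == "block"


# Blacklist: unknown hosts fall through to "allow".
blacklist = {"google-chrome-browser.com": "block"}

# Whitelist: unknown hosts fall through to "block".
whitelist = {"www.hanselman.com": "allow"}
```

With the blacklist and a default of "allow", a request referred by `example.org` goes through. With the whitelist and a default of "block", the same request is redirected, while requests with no Referer at all still pass because the rule's first condition never matches them.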

Nice, simple and clean. I don't plan on playing "whac a mole" with sploggers as it's a losing game, but I will bring down the ban-hammer on particularly obnoxious examples of content theft, especially when they mess with my Google Juice.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

June 03, 2013 23:30
Brilliant - It'd be great if we had much more of this, and just had 1 source of truth for a lot of information. I hate going to other sites just to find a badly formatted copy of stackoverflow.com and the such. Keep up the good work!
Ian
June 03, 2013 23:38
That's great. If only you could insert a link to your blog through the image. Content takes time to create and it's too bad that the owner of that site is only concerned with how many hits they get.
June 03, 2013 23:38
This is the sort of thing that search engines should be doing more to address. Ranking and relevance are among the main reasons we rely on search engines, and this is a case where they are clearly getting it wrong.
June 03, 2013 23:41
Now that is the way to show them Scott.

But I would not have been so nice in that image with your message.

Tell them what you really think!! LOL

--Ron
June 03, 2013 23:48
Amazing use of wits!
June 03, 2013 23:50
Yeeaah Boyeeee! Well done Scott. Hope this deters them once and for all.
KR
June 03, 2013 23:54
What you need is a totally mental, trance inducing animated gif. Featuring cats
June 04, 2013 0:00
Ahh! Thanks for this Scott! I've had years of people embedding my flash files on their sites and stealing my bandwidth (let alone copyright infringement), so I'll put this to good use with swf too.
June 04, 2013 0:10
Nice tip Scott.

I had the same problem several years ago, but I used an HttpModule with a wildcard (IIS6).
June 04, 2013 0:17
Nicely done! However it doesn't change the fact that visitors will use your bandwidth to download the splog.png image... perhaps you should host it on imgur.com instead ;)
June 04, 2013 0:18
If you are taking the time to url-rewrite your images for this site, why not just take the time to block that site from reading your RSS feed?
June 04, 2013 0:18
Well done, well done.
Although, you should put some funny animated gif
June 04, 2013 0:20
Checkmate! Who's in control now? Haha
June 04, 2013 0:21
Awesome article.
Thanks for the write-up, Scott.
June 04, 2013 0:44
@Davin

"Block that..." doesn't punish the offender in any way, shape, or form - and runs the risk of ruining the experience of people who actually visit your site.
K
June 04, 2013 1:00
IMO, keep a white list and a black list.

Here's the issue. Anyone can link to your content. There is no way to prevent that. But you can display a message saying "hey, don't do this."

But -- the thing is, they also can go to no-ip and come out with a new DNS. On Blogger or similar platforms, they can sign up for a new blog in minutes.

So I'd go with a white list, instead of a black list.
June 04, 2013 1:08
We used to do that, but found that many browsers block the referrer being transmitted. Also, many web bots/spiders don't send one either. Be careful.
June 04, 2013 1:45
Dave - There are HUNDREDS of obscure Google Reader clones and last time I did that I ended up opening up LOTS domains and angering many.
June 04, 2013 1:50
You should update your substitute image to include your *real* blog's URL.

Maybe you can convert a few of the honest folks who stumble on the pirated material if you make it easy for them to find you?
June 04, 2013 1:51
I saw a guy online that changed all the pictures to ones he could hotlink off porn sites. It was a site about high-end race cars and a very official French racing site was hotlinking his images that he took with his own camera that he was selling on his site.

Needless to say the racing organization was very embarrassed, but he almost got arrested on hacking charges (in the US, he most likely would have)!
June 04, 2013 2:35
Google should provide a well-known image link you could insert instead of your custom message. Then when Google spiders the site, they could detect their image and know that a splogger lives there.
June 04, 2013 3:03
Heh - a few years back, one site I knew had the same problems - he started feeding the old image (pre site sale) from goat.se (NOT safe for work)
June 04, 2013 4:25
White list is a bad idea. Referer can be disabled by browser settings (by privacy-minded people), most proxies strip it, and https:// won't pass it on to http:// links.
June 04, 2013 5:01
You should find someone with the Newman animated graphic from Jurassic Park movie and redirect the requests there. Then they won't be hijacking your bandwidth too :)

http://jurassicpark.wikia.com/wiki/File:YouDidn'tSayTheMagicWord.gif
Jay
June 04, 2013 7:16
Great job! Yeah, this is a horrible problem and one that unfortunately will continue on. Luckily if you're on your own server you can take control over this. More blog services (weblogs.asp.net hint hint) need the ability to enable users to do this.
June 04, 2013 8:03
No animated gif?
June 04, 2013 10:49
very nice article scott.
June 04, 2013 11:17
Another fun idea: https://github.com/kitcambridge/evil.js + https://www.owasp.org/index.php/Script_in_IMG_tags
June 04, 2013 11:22
Sadly, blocked by all modern browsers. Boo.
June 04, 2013 13:42
I wanted to see for myself, went to the site found lots of guff on there, but no Hanselman post..

So i searched for:

"ya buddy"

I got three hits - all yours ;).. and your web.config is working.
June 04, 2013 14:25
Nice - I'm sure you (the real you) blogged about something like this a while ago with regard to IIS Url rewrite rules, or maybe it was just the canonical domain ones etc? I'm glad you found a good way to stop the spam anyhow - damn MOFOS
June 04, 2013 14:54
Brilliant!

I am wondering, would it also make sense to cause a delay when serving the "splog.png" image to make browsing experience on these spam blog sites even more sucky?

June 04, 2013 14:59
Thanks for this great post, Scott. Learning a lot about Internet, the way it works and how to stay safe.
June 04, 2013 16:50
As always, thanks for taking the time and extra effort not to punish those of us using aggregators to read your stuff. With the imminent demise of Google Reader, many of us have had to move to alternates, making your task much harder, unfortunately.
June 04, 2013 18:42
Everyone suggesting evil images are forgetting that the people who see the images aren't the people stealing the content.
June 04, 2013 18:54
Have you sent a takedown notice to their ISP, or is that more work than it's worth?
June 04, 2013 19:12
I hope they will reblog this post...

For the irony !!
Xas
June 04, 2013 20:33
waiting for the irony to unfold when they re-blog this post
June 04, 2013 22:01
I've done something similar in the past, although my replacement image wasn't so nice.

Back before the days of image-sharing sites like Imgur, I discovered that a user of mine was uploading images to my forum and then using it as a host for the images in other, much higher traffic forums. This was eating a good chunk of my bandwidth and resources.

My solution was a similar trick to serve up a replacement image for requests that originated from one of those forums (as a blacklist), and since I happened to have a picture of the user, the replacement image was the goatse picture censored with his face, as well as a note about how it's not nice to steal someone else's bandwidth.

Meanwhile, users of my own forum (and elsewhere) still got to see his original images.
June 04, 2013 22:16
Nick: I agree with you on why not to use evil images.

In my case above, the person doing the hotlinking was somewhat known to his audience, since it was a forum environment. My use of a semi-evil image was more of a vindictive move to tarnish his reputation on those forums as punishment for using my server as an image host without my consent (to the point where legitimate requests to my server were seeing noticeable delays).
June 04, 2013 22:29
Scott Nice trick to trick the spammers
June 04, 2013 22:49
Of course you could also try contacting namecheap support, where the domain is registered or hostnine where the site's dns is pointing or "A Small Orange" where the site seems to be hosted. Maybe one of them would be interested in the abuse case.
June 04, 2013 22:54
Nice article...
Your message is clear but Do you think owner of that google chrome browser com will bother/mind/aware with this kind of stuff.
June 04, 2013 22:55
You're too nice, Scott :3
June 05, 2013 0:15
Mike C: I agree your situation is different, even though your solution is the same. :)
June 05, 2013 0:24
Not only an informative article, but a great example of pique-induced ingenuity!
June 05, 2013 3:29
All is nice and dandy, Scott, until they stop hotlinking the images and copy them locally.

What are you going to do then? I am thinking legal actions.
June 05, 2013 7:07
Good article. Empty http_referer still gets through, but this is as good as it gets. As another reader pointed out a white list might be more effective in that it automatically rejects every one other than empty and your domain.
June 05, 2013 8:37
Hahaha! That's a real Nose-Break! I loved your approach Scott Hanselman. great one. Hope the Sploggers or Snailers are gonna get a damn who're faking to get traffic. Great Job! I appreciate! :-)
June 05, 2013 11:30
Scott as usual interesting article....So my query here is you were making use of rewrite and i think this feature is present only in IIS7...So in case of previous versions one way of handling these type of things is

Serving such files using handlers or module and in the handler we can check for Referer and serve with right image.

Do we have any simple process other than this.

Thanks,
Pavan
June 05, 2013 21:57
@Andrei Rinea:
I think Scott is one of the few people remaining who is not obsessed with copyright issues. We go crazy when anyone copies a single word from us and start screaming "take legal action, call a lawyer", even if no real harm is being done to us.
Nothing here requires legal action. In the end this website is marketing his work, since they are not publishing it under another name, so why would he bother? I am usually happy when other websites publish my blog.
Sam
June 05, 2013 22:59
It is interesting that they appeared higher than you in search results because you have the rel author tag set in your google+ link so you would have hoped that google would have taken some notice of that!
June 06, 2013 0:20
Nicely done!!!
June 06, 2013 14:24
Brilliant!

But I wouldn't be so kind w.r.t. the image... Some HD offensive porn is appropriate in this case.
June 06, 2013 17:43
What about inserting a link to your feed articles, I've seen other people do it and it would prevent them from messing with your "Google juice", while it wouldn't be annoying for people reading your RSS.
June 06, 2013 20:14
Hope they'll reblog this article as well! :)
June 07, 2013 20:37
Scott, it's been a few days now. You should have some metrics. Time to post an update with the results of your efforts!
June 13, 2013 6:42
It wouldn't be such an issue if the search engines policed this themselves. But they won't, because they are the ones making money on the ad space....
October 01, 2013 2:39
Scott,

Thank you for the post. I was able to use it for my e-commerce class regarding search engine spam and splogging. I recalled your effective methods for combating such methods. Great posts, keep up the great work.

Thanks,

Dave A.
November 25, 2013 8:44
Scott,

Nice. Congrats. One question.

I have a case where I don't want to serve my files from my website.

Lets say I have this.

media.domain.com

I want to be able to serve only domain.com (Referer would be domain.com).

Since I am requesting the file from media.domain.com it doesn't have a Referer. So, that situation bypass your code.

I added (^$|^https?://(.+?)/.*$) whether is empty or has something in the referer.

the thing is that C:1 is either null or empty.

I added to my black list a key="" value="block" but it doesn't seem to get it.

How could I solve this issue?

Thanks


Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.