How SparkFun (and 800.com) and small commerce or startup websites can scale
Somewhere in late 1997, early 1998 (as I recall) I was working at a place called 800.com. Me and three guys (Joe Tillotson, Javan Smith and Patrick Cauldwell) from our consulting firm were helping create the storefront, shopping cart and business systems. We did it all in COM and Classic ASP. I did most of the front-end ASP, HTML, cut graphics, did scale, browser compat, the whole thing. Javan ended up working there full time for some years as I recall.
It was deep in the beginning of the first boom. We were sleeping with our servers. The place had no offices yet, and we were sitting on the stairs of the Tyee Group in Portland. On March 1, 2002, 800.com sold to Circuit City and disappeared. I still say they could have been like Amazon, but it just didn't work out.
Anyway, the CEO Greg Drew and his folks had this crazy idea for a promotion. It was simple, 3 DVDs for $1. That's it. Free shipping and everything. It was insane.
In the same vein of crazy promotions, SparkFun Electronics had "Free Day" today. It was also a brilliant promotional idea. They offered $100 per person, until they hit $100,000. Of course, that's only 1000 people but you don't say that part. ;) Every geek on the planet had their finger on the "Buy" button and word spread like wildfire on Twitter. The whole thing was over in 1h 44m 50s, according to their website. I spent an hour trying to check out my cart, and only succeeded in loading a page twice. Bummer.
With 800.com, our 3 DVDs for $1 promotion lasted longer, a few days as I recall, and word travelled much slower, on USENet and Email. Barbarians. ;)
Here are some of the things we did to optimize the 800.com site for this massive traffic surge in that pre-cloud era. It worked for us. Perhaps SparkFun did, or could have done, similar things.
Think about the Ratio of Reads to Writes
When people visit your site, are they mostly reading data, like browsing a product catalog? Or are they mostly writing data, like putting something in a shopping cart and trying to check out?
There are basically four types of data on a web site: resource or catalog data, infrequently-used reference data, user session data, and clickstream or activity-oriented data. Each is accessed in a different way, and a crazy promotion can dramatically change the way your site behaves.
When 800.com started our promotion, our site dramatically switched from 95% of people browsing and 5% of people checking out, to 95% of people checking out and only 5% browsing. When the site's fundamental ratio changed we had to change the site. We hadn't designed for this!
For the short term we started by turning the most frequently visited product catalog pages into static pages. Don't pooh-pooh the power of the static page. For small companies with just a few web servers there's little faster than serving a static page.
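To make the static-page idea concrete, here's a minimal sketch in Python (not what we ran in 1997, which was Classic ASP). It renders each product once and writes the result to disk, so the web server only has to hand out files. The product fields (`sku`, `name`, `price`), the HTML template and the function names are all assumptions made up for illustration.

```python
import html
from pathlib import Path


def render_product_page(product: dict) -> str:
    """Build a complete HTML page for one product (hypothetical template)."""
    name = html.escape(product["name"])  # escape so names like "A & B" stay valid HTML
    price = f"${product['price']:.2f}"
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{name}</title></head>\n"
        f"<body><h1>{name}</h1><p>{price}</p></body></html>\n"
    )


def write_static_pages(products: list[dict], out_dir: Path) -> list[Path]:
    """Render every product once and write it to out_dir/<sku>.html."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for product in products:
        path = out_dir / f"{product['sku']}.html"
        path.write_text(render_product_page(product), encoding="utf-8")
        written.append(path)
    return written
```

You would re-run something like `write_static_pages` whenever the catalog changes, and point the web server at the output directory.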
Cache Everything (Resources and Reference Data)
Memory is exceedingly cheap, and you really can't have too much. Additionally, if you're a small company with a small site, chances are your product catalog isn't that large. I would be surprised if most medium-sized catalogs couldn't be kept in memory. The data in a product catalog also doesn't usually change very often, so it can certainly be cached for hours at a time, if not longer. After our short-term static files solution, we moved to just caching the entire catalog in memory. When all you need to do is show products and categories, caching is your friend.
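Here's a rough Python sketch of the "whole catalog in memory" idea, assuming the catalog fits in a single dictionary keyed by SKU. The class name, the `loader` callback and the one-hour default are my assumptions for illustration, not the original 800.com code.

```python
import time
from typing import Callable, Dict, Optional


class CatalogCache:
    """Keep the entire catalog in memory and reload it only when it is too old."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, dict]],  # hypothetical: reads the catalog from the database
        max_age_seconds: float = 3600.0,        # assumed refresh interval: one hour
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._max_age = max_age_seconds
        self._clock = clock
        self._catalog: Dict[str, dict] = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            self._catalog = self._loader()  # one database round trip for the whole catalog
            self._loaded_at = now

    def get(self, sku: str) -> Optional[dict]:
        """Return the cached product for sku, or None if it is not in the catalog."""
        self._refresh_if_stale()
        return self._catalog.get(sku)
```

Every page view between refreshes is served straight from memory. A real multi-threaded server would probably want a lock around the reload.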
Move Images Somewhere Else (Make it someone else's job)
We learned early on that the web servers doing the page work really didn't need to be worrying about serving images, so we put those on http://images.800.com. This might seem obvious now in 2010, but it was pretty cool thinking in 1997. This allowed us to put all of our product images on a separate server farm and scale them out differently.
Today a lot of people put their images up on S3 at Amazon or other cloud services. For image heavy sites like product catalogs, moving them not only saves you time, effort and hardware, but it can also save bandwidth.
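On the page-generation side, moving images elsewhere mostly means building every image URL against a separate host. A small Python sketch follows. The host name comes from this post, but the `products/<sku>/<size>.jpg` path layout and the function name are assumptions for illustration.

```python
from urllib.parse import quote, urljoin

# Host taken from the post; swap in an S3 bucket or CDN host as needed.
IMAGE_BASE_URL = "http://images.800.com/"


def product_image_url(sku: str, size: str = "large", base_url: str = IMAGE_BASE_URL) -> str:
    """Return the URL of a product image on the dedicated image host.

    The path layout is hypothetical. SKU and size are percent-encoded so
    that odd characters cannot break the URL.
    """
    path = f"products/{quote(sku, safe='')}/{quote(size, safe='')}.jpg"
    return urljoin(base_url, path)
```

Because the pages only ever call one helper, moving the images to another farm or a cloud service later is a one-line change to `IMAGE_BASE_URL`.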
Partition Your Responsibilities (Break it up)
Even though you might have a single web server, or you might have six, think about partitioning responsibility early on, with separate mini-sites for things like images, profiles, web services, product catalogs, shopping cart, and check-out. If your site's characteristics change quickly and you have to scale out and add new web servers, you can even create separate web farms for these mini-sites, dedicated to just checking out or just images. This can be as easy as having a separate virtual directory, and treating it as an application.
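Once the mini-sites live under their own path prefixes, sending each one to its own pool of servers is simple. Here's a hedged Python sketch of that routing decision, the kind of thing a load balancer would normally do for you. The prefixes, server names and the round-robin policy are all invented for illustration.

```python
import itertools
from typing import Dict, List, Optional


class PrefixRouter:
    """Pick a server for a request path, using one server pool per mini-site."""

    def __init__(self, pools: Dict[str, List[str]], default_pool: List[str]) -> None:
        self._pools = pools            # path prefix -> servers for that mini-site
        self._default = default_pool   # everything that matches no prefix
        self._cursors: Dict[Optional[str], itertools.cycle] = {}

    def pool_name_for(self, path: str) -> Optional[str]:
        """Return the longest matching prefix, or None for the default pool."""
        matches = [prefix for prefix in self._pools if path.startswith(prefix)]
        return max(matches, key=len) if matches else None

    def pick_server(self, path: str) -> str:
        """Round-robin over the servers in the pool that owns this path."""
        name = self.pool_name_for(path)
        servers = self._pools[name] if name is not None else self._default
        cursor = self._cursors.setdefault(name, itertools.cycle(servers))
        return next(cursor)


# Hypothetical layout: checkout gets the most servers during a promotion.
router = PrefixRouter(
    pools={
        "/images/": ["img1", "img2"],
        "/cart/": ["cart1"],
        "/checkout/": ["checkout1", "checkout2", "checkout3"],
    },
    default_pool=["web1", "web2"],
)
```

When the read/write ratio flips, you add servers to the `/checkout/` pool and leave the rest of the site alone.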
Embrace Stale Data (Realtime is a Lie)
One of the things we had at 800.com that was kind of revolutionary at that time was an AS/400 inventory system that we hooked up directly to the web. You could see a real time inventory of any product even to the point where you could hit refresh and see products being sold. Remember that this was 1997. In this case, it was silly of us to plug the AS/400 directly into the web server, so eventually we built an intermediate database. But the real root issue was that we realized what "realtime" meant to our business. The president of the company said he wanted realtime inventory, so we assumed he meant it. But that didn't mean we needed to go back to the inventory system every second. In fact, when pressed, we learned that realtime to our president could be within 5 or 10 minutes.
Updating data every 10 minutes is infinitely easier than updating it in real time, as you might guess. If you've got realtime data on an otherwise mostly static page, ask yourself: how stale can my data be?
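Once the business agrees on how stale is acceptable, the code can be tiny. Below is a Python sketch of a decorator that reuses an inventory answer for up to 10 minutes per SKU. The decorator name, the fake backend function and its return value are assumptions for illustration, not our AS/400 integration.

```python
import functools
import time
from typing import Callable

TEN_MINUTES = 600  # seconds; the staleness our president said he could live with


def allow_stale(max_age_seconds: float, clock: Callable[[], float] = time.monotonic):
    """Decorator: reuse a result for up to max_age_seconds per set of arguments."""

    def decorator(func):
        cache = {}  # args -> (fetched_at, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < max_age_seconds:
                return hit[1]          # fresh enough: skip the backend entirely
            value = func(*args)        # too old or never fetched: ask the backend
            cache[args] = (now, value)
            return value

        return wrapper

    return decorator


def query_inventory_backend(sku: str) -> int:
    """Stand-in for the slow inventory system (hypothetical); always reports 42 units."""
    return 42


@allow_stale(TEN_MINUTES)
def units_in_stock(sku: str) -> int:
    return query_inventory_backend(sku)
```

With this in place the inventory system sees at most one query per SKU every 10 minutes, no matter how many people hit refresh.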
Thanks for indulging me on this trip down memory lane. Certainly, a lot of this stuff is obvious in 2010, and not just obvious but required in large enterprise systems, but these basic principles still apply today for small businesses running relatively small web sites on, say, fewer than 10 servers.
Congratulations to SparkFun for their successful promotion. Even though I was unable to visit their site for two hours, the buzz they generated on Twitter is no doubt invaluable.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
We actually had a telnet store before a website (yes they did close this eventually). The first website started up a separate process to handle each request! (this was no lightweight process either, the stripped executable was over a meg!). One of the founders created what we'd call today a scripting language that was used to add dynamic content to the HTML.
We did things like making high traffic pages static, making the application server re-entrant so a single process invocation could serve several requests, switching from a record oriented B+Tree database to Oracle so we could run multiple web-servers against the same back-end, serving images from different servers than content, serving audio samples from different servers and many other tricks I'm sure I've forgotten.
Of necessity inventory data was decoupled from the website since I believe we used to periodically FTP it from our drop shipper. Ditto for the shopping cart since, in the very beginning, they used to manually fax the orders to the drop shipper!
Like so many things a lot of the optimizations ended up bringing us back to first principles - a loosely coupled architecture that let you scale out at each layer.
Scott, I think you might want to proofread your cache paragraph; I think the third sentence says the opposite of what you intended. Same with the section on the AS400: I think you were seeing products sold, not profits, but then maybe it was profits that were being sold at the cheap 800.com prices ;)