Introducing Windows Azure WebJobs
I'm currently running 16 web sites on Windows Azure. I have a few Virtual Machines, but I prefer to run things using "platform as a service" where I don't have to sweat the underlying Virtual Machine. That means, while I know I can run a Virtual Machine and put "cron" jobs on it, I'm less likely to because I don't want to mess with VMs or Worker Roles.
There are a few ways to run stuff on Azure. First, there's IAAS (Infrastructure as a Service), which is VMs. Then there's Cloud Applications (Cloud Services), where you can run anything in an Azure-managed VM. It's still a VM, but you have a lot of choice and can run Worker Roles and background stuff. However, there's a lot of ceremony if you just want to run your small "job" either on a regular basis or via a trigger.
Looking at this differently, platform as a service is like having your hotel room fixed up daily, while VMs are more like managing a house yourself.
As someone who likes to torch a hotel room as much as the next person, this is why I like Azure Web Sites (PAAS). You just deploy, and it's done. The VM is invisible and the site is always up.
However, there's not yet been a good solution under web sites for doing regular jobs and batch work in the background. Now Azure Web Sites support a thing called "Azure WebJobs" to solve this problem simply.
Scaling a Command Line application with Azure WebJobs
When I want to do something simple - like resize some images - I'll either write a script or a small .NET application. Things do get complex though when you want to take something simple and do it n times. Scaling a command line app to the cloud often involves a lot of yak shaving.
Let's say I want to take this function that works fine at the command line and run it in the cloud at scale.
    public static void SquishNewlyUploadedPNGs(Stream input, Stream output)
    {
        var quantizer = new WuQuantizer();
        using (var bitmap = new Bitmap(input))
        {
            using (var quantized = quantizer.QuantizeImage(bitmap))
            {
                quantized.Save(output, ImageFormat.Png);
            }
        }
    }
WebJobs aim to make developing, running, and scaling this easier. They are built into Azure Web Sites and run in the same VM as your Web Sites.
Here are some typical scenarios that would be great for the Windows Azure WebJobs SDK:
- Image processing or other CPU-intensive work.
- Queue processing.
- RSS aggregation.
- File maintenance, such as aggregating or cleaning up log files.
- Other long-running tasks that you want to run in a background thread, such as sending emails.
WebJobs are invoked in two different ways: either they are triggered or they run continuously. Triggered jobs run on a schedule or when some event happens, and continuous jobs basically run a while loop.
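To make the "basically run a while loop" idea concrete, here's a minimal sketch of what a continuous job could look like as a plain console app that doesn't use the WebJobs SDK at all. The names ContinuousJobSketch, RunLoop, doWork, and shouldStop are made up for illustration and aren't part of any Azure API.

```csharp
using System;
using System.Threading;

public static class ContinuousJobSketch
{
    // Hypothetical helper: calls doWork over and over, pausing between passes,
    // until shouldStop returns true. Returns the number of passes completed.
    public static int RunLoop(Action doWork, Func<bool> shouldStop, TimeSpan pause)
    {
        int passes = 0;
        while (!shouldStop())
        {
            doWork();
            passes++;
            Thread.Sleep(pause);
        }
        return passes;
    }

    // A console app's Main could call it like this to poll forever:
    //   ContinuousJobSketch.RunLoop(
    //       () => Console.WriteLine("Checking for work..."),
    //       () => false,
    //       TimeSpan.FromSeconds(30));
}
```

Passing the stop condition in as a delegate is just a convenience so the loop can be exercised without running forever. A real continuous job would typically loop until the host shuts it down.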
WebJobs are deployed by copying them to the right place in the file-system (or using a designated API which will do the same). The following file types are accepted as runnable scripts that can be used as a job:
- .exe - .NET assemblies compiled with the WebJobs SDK
- .cmd, .bat, .exe (using windows cmd)
- .sh (using bash)
- .php (using php)
- .py (using python)
- .js (using node)
After you deploy your WebJobs from the portal, you can start and stop jobs, delete them, upload jobs as ZIP files, etc. You've got full control.
A good thing to point out, though, is that Azure WebJobs are more than just scheduled scripts. You can also create WebJobs as .NET projects written in C# or whatever.
Making a WebJob out of a command line app with the Windows Azure WebJobs SDK
WebJobs can effectively take some command line C# application with a function and turn it into a scalable WebJob. I spoke about this over the last few years in presentations when it was codenamed "SimpleBatch." This lets you write a simple console app to, say, resize an image, then move it up to the cloud and resize millions. Jobs can be triggered by the appearance of new items on an Azure Queue, or by new binary Blobs showing up in Azure Storage.
NOTE: You don't have to use the WebJobs SDK with the WebJobs feature of Windows Azure Web Sites. As noted earlier, the WebJobs feature enables you to upload and run any executable or script, whether or not it uses the WebJobs SDK framework.
I wanted to make a Web Job that would losslessly squish PNGs as I upload them to Azure storage. When new PNGs show up, the job should automatically run on these new PNGs. This is easy as a Command Line app using the nQuant open source library as in the code above.
Now I'll add the WebJobs SDK NuGet package (it's prerelease) and Microsoft.WindowsAzure.Jobs namespace, then add [BlobInput] and [BlobOutput] attributes, then start the JobHost() from Main. That's it.
    using Microsoft.WindowsAzure.Jobs;
    using nQuant;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // The host finds the attributed functions in this assembly
                // and calls them as new blobs arrive.
                JobHost host = new JobHost();
                host.RunAndBlock();
            }

            // Called for each new blob in the "input" container; whatever is
            // written to the output stream lands in "output" under the same name.
            public static void SquishNewlyUploadedPNGs(
                [BlobInput("input/{name}")] Stream input,
                [BlobOutput("output/{name}")] Stream output)
            {
                var quantizer = new WuQuantizer();
                using (var bitmap = new Bitmap(input))
                {
                    using (var quantized = quantizer.QuantizeImage(bitmap))
                    {
                        quantized.Save(output, ImageFormat.Png);
                    }
                }
            }
        }
    }
CONTEXT: Let's just step back and process this for a second. All I had to do was spin up the JobHost and set up a few attributes. Minimal ceremony for maximum results. My console app is now processing information from Azure blob storage without ever referencing the Azure Blob Storage API!
The function is automatically called when a new blob (in my case, a new PNG) shows up in the input container in storage and the Stream parameters are automatically "bound" (like Model Binding) for me by the WebJobs SDK.
To deploy, I zip up my app and upload it from the WebJobs section of my existing Azure Website in the Portal.
Here it is in the Portal.
I'm setting mine to continuous, but it can also run on a detailed schedule:
I need my WebJob to be told about what Azure Storage account it's going to use, so from my Azure Web Site under the Connection Strings section I set up two strings, one for the AzureJobsRuntime (for logging) and one for AzureJobsData (what I'm accessing).
For what I'm doing they are the same. The connection strings look like this:
DefaultEndpointsProtocol=https;AccountName=hanselstorage;AccountKey=3exLzmagickey
The key here came from Manage Access Keys in my storage account.
In my "Hanselstorage" Storage account I made two containers, input and output. You can name yours whatever. You can also process from Queues, etc.
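If you're curious what's inside one of those connection strings, it's just semicolon-separated Name=Value pairs. Here's a small sketch that pulls one apart. StorageConnectionStringSketch and Parse are hypothetical names of my own, not the Azure Storage library's parser, and the sketch assumes every segment has the Name=Value shape shown above.

```csharp
using System;
using System.Collections.Generic;

public static class StorageConnectionStringSketch
{
    // Hypothetical helper: splits "Name=Value;Name=Value" into a dictionary.
    public static Dictionary<string, string> Parse(string connectionString)
    {
        var settings = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var pairs = connectionString.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var pair in pairs)
        {
            // Split on the FIRST '=' only, because account keys are base64
            // and can end in '=' padding characters themselves.
            int equals = pair.IndexOf('=');
            if (equals <= 0)
            {
                // Deliberately don't echo the segment: it might contain the key.
                throw new FormatException("Expected a Name=Value pair.");
            }
            string name = pair.Substring(0, equals).Trim();
            string value = pair.Substring(equals + 1).Trim();
            settings[name] = value;
        }
        return settings;
    }
}
```

Calling Parse on the example string above gives you "hanselstorage" under AccountName, which is a handy way to log which account a job is pointed at without writing the key anywhere.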
Now, going back to the code, look at the parameters to the Attributes I'm using:
    public static void SquishNewlyUploadedPNGs(
        [BlobInput("input/{name}")] Stream input,
        [BlobOutput("output/{name}")] Stream output)
There are the strings "input" and "output" pointing to specific containers in my Storage account. Again, the actual storage account (Hanselstorage) is part of the connection string. That lets you reuse WebJobs in multiple sites, just by changing the connection strings.
There is a link to get to the Azure Web Jobs Dashboard to the right of your job, but the format for the URL to access is this: https://YOURSITE.scm.azurewebsites.net/azurejobs. You'll need to enter the same credentials you've used for Azure deployment.
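The {name} placeholder is what ties the two attributes together: whatever the blob is called in input is the name it gets in output. I haven't looked at how the SDK does this internally, so treat the following as a sketch of the idea only. BlobPathPatternSketch and MapPath are hypothetical names, and this is not the SDK's actual binding code.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class BlobPathPatternSketch
{
    // Hypothetical helper: if blobPath matches inputPattern, fills the captured
    // {placeholders} into outputPattern. Returns null when there is no match.
    public static string MapPath(string inputPattern, string outputPattern, string blobPath)
    {
        // Regex.Split keeps captured groups, so "input/{name}" becomes
        // ["input/", "name", ""]: literal text at even indexes,
        // placeholder names at odd indexes.
        string[] parts = Regex.Split(inputPattern, @"\{(\w+)\}");
        var regex = new StringBuilder("^");
        for (int i = 0; i < parts.Length; i++)
        {
            regex.Append(i % 2 == 0
                ? Regex.Escape(parts[i])          // literal text
                : "(?<" + parts[i] + ">.+)");     // named capture for the placeholder
        }
        regex.Append("$");

        Match match = Regex.Match(blobPath, regex.ToString());
        if (!match.Success)
        {
            return null;
        }

        // Swap each {placeholder} in the output pattern for its captured value.
        return Regex.Replace(outputPattern, @"\{(\w+)\}",
            m => match.Groups[m.Groups[1].Value].Value);
    }
}
```

With the patterns from my attributes, a blob at input/scottha.png maps to output/scottha.png, and a blob in some other container doesn't match at all.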
Once you've uploaded your job, you'll see the registered function(s) here:
I've installed the Azure SDK and can access my storage live within Visual Studio. You can also try 3rd party apps like Cloudberry Explorer. Here I've uploaded a file called scottha.png into the input container.
After a few minutes the SDK will process the new blob (Queues are faster, but blobs are checked every 10 minutes), the job will run and either succeed or fail. If your app throws an exception you'll actually see it in the Invocation Details part of the site.
Here's a successful one. You can see it worked (it squished) because of the number of input bytes and the number of output bytes.
You can see the full output of what happens in a WebJob within this Dashboard, or check the log files directly via FTP. For me, I can explore my output container in Azure Storage and download or use the now-squished images. Again, this can be used for any large job whether it be processing images, OCR, log file analysis, SQL server cleanup, whatever you can think of.
Azure WebJobs is in preview, so there will be bugs, changing documentation and updates to the SDK but the general idea is there and it's solid. I think you'll dig it.
Related Links
- Azure Friday - 10 minute tutorials that help you learn Azure. Also on iTunes!
- WebJobs Getting Started Tutorial
- WebJobs Samples on CodePlex
- Links on how to use Azure WebJobs
- Azure WebJobs on Windows Azure
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
A couple of questions. How does this work with auto scaling? If I'm running on two instances, will my job only run on one of those instances, or will I need to plan for my jobs to scale with the sites? Also, if my job is CPU intensive and I'm using CPU auto scale, will my job trigger the scale out?
For triggered: only a single instance of a run can run at a given time, so when triggered the WebJob will run on one of your instances (randomly).
Keep up the great work!
This seems much better ... but ... the deployment process is terrible. I need to upload a zip every time I want to release? Yuck!
This would be perfect if there was some way to trigger deployments from Git, or at least through Visual Studio.
Can you elaborate on 'you can deploy using Kudu'? I can see a commit where WebJobs support was added to Kudu (https://github.com/projectkudu/kudu/commit/59561993c49d5935ce91ca062d800a6e57b1ad5f) but am not sure how to use it.
Connection string shows the Account Key!!
Excellent article. I have been doing job-based processing for one of my big projects in an ASP.NET web site using timers and threads.
But that was not a convenient or traceable solution.
With the invention of this tool, it can help a lot in designing queue-based and job-based processing scripts.
Gr8.
Thanks for sharing.
Just a question, there is no info about pricing, it will be part of Web Sites pricing model or something new?
Two questions though:
1) how will this scale? Does it start multiple instances and balance the work?
2) will there be VCS access like for websites, so one can push updates?
Btw, do you know why a website can't listen to all request? I have seen that requests with a different 'Host' header get blocked and never reach the website :-/ this is sadly the reason why I have to use CloudServices for stuff, where I don't know all domains in advance.
Is it possible to access an Azure SQL Database from a webjob?
In my website I need to aggregate some data (read the purchase orders table and fill in the stats table)every 24 hours: do you think a webjob could be appropriate or would it be better to use the Scheduler service?
Thanks
* No need to pay extra for this feature, note that for continuous WebJobs there is an important feature called "always on" which is only for Standard Website, this will make sure your Website and WebJob are always up (won't be the case for free/shared websites), so you can experiment using free but for full usage you need standard.
* You can access from a WebJob the same things as from a Website, if you use .NET application you can even access your app settings and connection strings as in your Website (for other types you can use environment variables to access those).
Timm - Regarding scale see my previous comment regarding your issue, you can start a thread on the windows azure forum.
http://blog.amitapple.com/post/74215124623/deploy-azure-webjobs
http://blog.amitapple.com/post/73574681678/git-deploy-console-app
I just want to reliably run some background activity in my on-premise ASP.NET Web App without complicating my deployment beyond Web Deploy or relying upon hacks that are susceptible to app pool recycling.
This feature should have just been called "Jobs" and not be tied at all to Web Sites. This could have just been an update to Worker Roles instead of introducing a whole new redundant thing.
This is great Scott as I currently use a cron job to periodically call a web api to dump a work request on a ServiceBus Q which is then picked up by the Worker Role to 'do stuff'. Azure really is a great toolkit and it keeps getting better!
Sayed Ibrahim Hashimi | @SayedIHashimi
Microsoft.WindowsAzure.Jobs.Host is another Nuget package that is needed.
As stated already, keep up the great work. WebJobs are a great and simple add-on for long-running/triggered tasks that complement Web Sites, without the overhead of defining/managing a full Worker Role.
I'm trying to use webjobs to hit a specific url on my associated azure website per instance. Someone mentioned above that Continuous web jobs will fire on every instance you have if you scaled out. On each instance could I have the web job somehow (via webclient?) hit a url on that instance only?
Regards,
Matt
Any quick way to detect that "another" instance is running and simply just sit there do nothing.
Having the option to select "single instance only" on WebJobs would be great.
Great job!
Status changed to Running
[03/26/2014 23:41:58 > 8fa263: ERR ] 'IncidentPoll.exe' is not recognized as an internal or external command,
[03/26/2014 23:41:58 > 8fa263: ERR ] operable program or batch file.
Any ideas?
Moreover, if someone implements support for Azure Queue and Table Storage in HangFire, there will be a much simpler alternative to WebJobs for Windows Azure.