Cloud-Controlled Remote Pan Tilt Zoom Camera API for a Logitech BCC950 Camera with Azure and SignalR

October 23, 2012 Comment on this post [33] Posted in Hardware | Lync | Open Source | Remote Work

The Solution...er, the Problem

I have found my camera and built my solution. The Logitech BCC950 Conference Cam is the best balance between cost and quality and it's got Pan Tilt and (digital) Zoom functionality. The Zoom is less interesting to me than the motorized Pan Tilt.

Let's think about the constraints.

- A Logitech BCC950 PTZ camera is installed on a Windows machine in my office in Seattle.
- I'm anywhere. I'm usually in Portland but could be in a hotel.
- I may or may not be VPN'ed into work. This means I want to be able to communicate with the camera across networks, traverse NATs and generally not worry about being able to connect.
- I want to be able to control the camera in a number of ways, Web API, whatever, but ideally with cool buttons that are (or look) integrated with my corporate instant messaging system.

There are three interesting parts here, then.

Can I even control the camera's PTZ functions programmatically?
Can I relay messages across networks to the camera?
Can I make a slick client interface easily?

Let's figure them out one at a time.

Can I even control the camera's PTZ functions programmatically?

I looked all over and googled my brains out trying to find an API to talk to the Logitech camera. I emailed the Logitech folks and they told me that the camera would respond to DirectShow APIs. This means I can control the camera without any drivers!

MSDN showed me PROPSETID_VIDCAP_CAMERACONTROL which has an enumeration that includes things like KSPROPERTY_CAMERACONTROL_PAN, KSPROPERTY_CAMERACONTROL_TILT and KSPROPERTY_CAMERACONTROL_ZOOM.

This led me to this seven-year-old DirectShow .NET library that wraps the hardest parts of the DirectShow COM API. There's a little utility called GraphEdt.exe (GraphEdit) that you can get in the Windows SDK that lets you look at all the DirectShow-y things and devices and filters on your system.

GraphEdit

This utility let me control the camera's Zoom but Pan and Tilt were grayed out! Why?

GraphEdit showing Pan and Tilt grayed out

Turns out that this Logitech Camera supports only relative Pan and Tilt, not absolute. Whatever code creates this Properties dialog was never updated to support relative pan and tilt, but the API supports it via KSPROPERTY_CAMERACONTROL_PAN_RELATIVE!

That means I can send a start message quickly followed by a stop message to pan. It's not super exact, but it should work.

Here's the C# code for my MoveInternal() method. Note the scandalous Thread.Sleep call.

private void MoveInternal(KSProperties.CameraControlFeature axis, int value)
{
    // Create and prepare data structures
    var control = new KSProperties.KSPROPERTY_CAMERACONTROL_S();
    int controlSize = Marshal.SizeOf(control);
    int instSize = Marshal.SizeOf(control.Instance);

    IntPtr controlData = IntPtr.Zero;
    IntPtr instData = IntPtr.Zero;
    try
    {
        controlData = Marshal.AllocCoTaskMem(controlSize);
        instData = Marshal.AllocCoTaskMem(instSize);

        control.Instance.Value = value;

        //TODO: Fix for Absolute
        control.Instance.Flags = (int)CameraControlFlags.Relative;

        // fDeleteOld is false: the buffers are freshly allocated and hold nothing to release
        Marshal.StructureToPtr(control, controlData, false);
        Marshal.StructureToPtr(control.Instance, instData, false);
        // Start moving. The returned HRESULT is ignored here.
        _ksPropertySet.Set(PROPSETID_VIDCAP_CAMERACONTROL, (int)axis,
           instData, instSize, controlData, controlSize);

        //TODO: It's a DC motor, no better way?
        Thread.Sleep(20);

        control.Instance.Value = 0; //STOP!
        control.Instance.Flags = (int)CameraControlFlags.Relative;

        Marshal.StructureToPtr(control, controlData, false);
        Marshal.StructureToPtr(control.Instance, instData, false);
        _ksPropertySet.Set(PROPSETID_VIDCAP_CAMERACONTROL, (int)axis,
           instData, instSize, controlData, controlSize);
    }
    finally
    {
        // Free the unmanaged buffers even if one of the calls above throws
        if (controlData != IntPtr.Zero) { Marshal.FreeCoTaskMem(controlData); }
        if (instData != IntPtr.Zero) { Marshal.FreeCoTaskMem(instData); }
    }
}
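The start-then-stop pattern can be pulled out of the interop details and tried without a camera. This is a minimal sketch, not code from the PTZDevice library: MotorPulse, Pulse and the setSpeed delegate are hypothetical names, and the delegate stands in for whatever actually sends the value to the device. The one thing it adds is a finally block, so the stop value is sent even if something goes wrong while the motor is running.

```csharp
using System;
using System.Threading;

public static class MotorPulse
{
    // Sends a non-zero speed, waits briefly, then always sends 0 to stop.
    // setSpeed is a hypothetical stand-in for the real device call.
    public static void Pulse(Action<int> setSpeed, int direction, int holdMilliseconds)
    {
        if (setSpeed == null) throw new ArgumentNullException("setSpeed");
        try
        {
            setSpeed(direction);             // start the motor
            Thread.Sleep(holdMilliseconds);  // let it run for a moment
        }
        finally
        {
            setSpeed(0);                     // STOP, even if the code above threw
        }
    }
}
```

Passing a delegate that records its arguments into a list, such as `MotorPulse.Pulse(sent.Add, -1, 20)`, leaves the list holding the direction followed by 0.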

All the code for this PTZDevice wrapper is here. Once that library was working, creating a little console app to move the camera around with a keyboard was trivial.

var p = PTZDevice.GetDevice(ConfigurationManager.AppSettings["DeviceName"], PTZType.Relative);
while (true)
{
    ConsoleKeyInfo info = Console.ReadKey();
    if (info.Key == ConsoleKey.LeftArrow)
    {
        p.Move(-1, 0);
    }
    else if (info.Key == ConsoleKey.RightArrow)
    {
        p.Move(1, 0);
    }
    else if (info.Key == ConsoleKey.UpArrow)
    {
        p.Move(0, 1);
    }
    else if (info.Key == ConsoleKey.DownArrow)
    {
        p.Move(0, -1);
    }
    else if (info.Key == ConsoleKey.Home)
    {
        p.Zoom(1);
    }
    else if (info.Key == ConsoleKey.End)
    {
        p.Zoom(-1);
    }
}
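If the if/else chain grows, the same key handling can be written as a lookup table. This is only a sketch of that alternative: PtzCommand and KeyMap are hypothetical names that are not part of the PTZDevice library, and the key assignments simply copy the ones in the loop above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type describing one keypress worth of movement.
public struct PtzCommand
{
    public readonly int Pan;
    public readonly int Tilt;
    public readonly int Zoom;

    public PtzCommand(int pan, int tilt, int zoom)
    {
        Pan = pan;
        Tilt = tilt;
        Zoom = zoom;
    }
}

public static class KeyMap
{
    // Same key assignments as the console loop: arrows pan and tilt, Home/End zoom.
    private static readonly Dictionary<ConsoleKey, PtzCommand> Map =
        new Dictionary<ConsoleKey, PtzCommand>
        {
            { ConsoleKey.LeftArrow,  new PtzCommand(-1, 0, 0) },
            { ConsoleKey.RightArrow, new PtzCommand(1, 0, 0) },
            { ConsoleKey.UpArrow,    new PtzCommand(0, 1, 0) },
            { ConsoleKey.DownArrow,  new PtzCommand(0, -1, 0) },
            { ConsoleKey.Home,       new PtzCommand(0, 0, 1) },
            { ConsoleKey.End,        new PtzCommand(0, 0, -1) }
        };

    // Returns false for keys that have no camera action.
    public static bool TryGetCommand(ConsoleKey key, out PtzCommand command)
    {
        return Map.TryGetValue(key, out command);
    }
}
```

Inside the loop, a successful lookup would call p.Zoom(command.Zoom) when Zoom is non-zero and p.Move(command.Pan, command.Tilt) otherwise.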

Also easy was a simple WebAPI. (I put the name of the camera to look for in a config file in both these cases.)

[HttpPost]
public void Move(int x, int y)
{
    var p = PTZDevice.GetDevice(ConfigurationManager.AppSettings["DeviceName"], PTZType.Relative);
    p.Move(x,y);
}

[HttpPost]
public void Zoom(int value)
{
    var p = PTZDevice.GetDevice(ConfigurationManager.AppSettings["DeviceName"], PTZType.Relative);
    p.Zoom(value);
}
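Neither snippet shows the config file itself. Here is a minimal sketch of the appSettings entry that ConfigurationManager.AppSettings["DeviceName"] would read, assuming a standard App.config or Web.config. The value shown is an assumption; use whatever name your system reports for the camera.

```xml
<configuration>
  <appSettings>
    <!-- Assumed device name: replace with the name your system shows for the camera -->
    <add key="DeviceName" value="BCC950 ConferenceCam" />
  </appSettings>
</configuration>
```

Note the key and value attributes: ConfigurationManager.AppSettings looks entries up by key, so an element without them will not be found.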

At this point I've got the camera moving LOCALLY. Next, I mail it to Damian (my office buddy) in Seattle and he hooks it up to my office computer. But I need something to control it running on THAT machine...and talking to what?

Can I relay messages across networks to the camera?

Here's the architecture. Since I can't talk point to point via TCP between wherever I am and wherever the camera is, I need a relay. I could use a Service Bus Relay which would be great for something larger but I wanted to see if I could make something even simpler. I'd like to use HTTP since it's, well, it's HTTP.

A Diagram showing my laptop talks via SignalR through Azure to the camera in Seattle

Since Azure lets me have 10 free websites and automatically supports SSL via a wildcard cert for sites at the *.azurewebsites.net domain, it was perfect for what I needed. I want to use SSL because it's the best way to make sure my traffic isn't affected by corporate proxy servers.

There are three parts. Let's start in the middle. What's the Relay look like? I'm going to use SignalR because it will let me not only call methods easily and asynchronously but, more importantly, it will abstract away the connection details from me. I'm looking to relay messages over a pseudo-persistent connection.

So what's the code look like for a complex relay system like this? ;)

using System;
using SignalR.Hubs;

namespace PTZSignalRRelay
{
    public class RelayHub : Hub
    {
        public void Move(int x, int y, string groupName)
        {
            Clients[groupName].Move(x, y); // broadcast to every connection in the named group
        }

        public void Zoom(int value, string groupName)
        {
            Clients[groupName].Zoom(value);
        }

        public void JoinRelay(string groupName)
        {
            Groups.Add(Context.ConnectionId, groupName);
        }
    }
}

Crazy, eh? That's it. Clients call JoinRelay with a name. The name is the name of the computer with the camera attached. (More on this later.) This means that this single relay can handle effectively any number of clients. When a client calls the Relay with a message and group name, the relay then broadcasts to clients that have that group name.
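To make the group behavior concrete, here is a small in-memory model of it. This is not SignalR and not the hub above. RelayModel and Recipients are hypothetical names, and the model only tracks who would receive a message sent to a group.

```csharp
using System;
using System.Collections.Generic;

// In-memory model of the relay's group semantics; it does no networking.
public class RelayModel
{
    private readonly Dictionary<string, HashSet<string>> _groups =
        new Dictionary<string, HashSet<string>>(StringComparer.Ordinal);

    // Mirrors JoinRelay: remember that this connection belongs to the group.
    public void JoinRelay(string connectionId, string groupName)
    {
        HashSet<string> members;
        if (!_groups.TryGetValue(groupName, out members))
        {
            members = new HashSet<string>();
            _groups[groupName] = members;
        }
        members.Add(connectionId);
    }

    // Returns the connections that would receive a Move or Zoom sent to the group.
    public List<string> Recipients(string groupName)
    {
        HashSet<string> members;
        return _groups.TryGetValue(groupName, out members)
            ? new List<string>(members)
            : new List<string>();
    }
}
```

In this model, a listener and a buttons app that both join the same group are both recipients of a broadcast to it. Only the listener registers Move and Zoom handlers, so only it acts on the message.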

Can I make a slick client interface easily?

I created a super basic WPF app that's just a transparent window with buttons. In fact, the background isn't white or black, it's transparent. It's a SolidColorBrush that is all but invisible. It's not totally transparent or I wouldn't be able to grab it with the mouse!

<SolidColorBrush x:Key="NotQuiteTransparent" Color="#01000000"></SolidColorBrush>

The buttons use the .NET SignalR library and call it like this.

HubConnection connection = null;
IHubProxy proxy = null;
string remoteGroup;
string url;

private void MainWindow_MouseDown(object sender, MouseButtonEventArgs e)
{
    if (e.ChangedButton == MouseButton.Left)
        this.DragMove();
}

private async void MoveClick(object sender, RoutedEventArgs e)
{
    var ui = sender as Control;
    Point p = Point.Parse(ui.Tag.ToString());
    await proxy.Invoke("Move", p.X, p.Y, remoteGroup);
}

private async void ZoomClick(object sender, RoutedEventArgs e)
{
    var ui = sender as Control;
    int z = int.Parse(ui.Tag.ToString());
    await proxy.Invoke("Zoom", z, remoteGroup);
}

private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    url = ConfigurationManager.AppSettings["relayServerUrl"];
    remoteGroup = ConfigurationManager.AppSettings["remoteGroup"];
    connection = new HubConnection(url);
    proxy = connection.CreateProxy("RelayHub");
    await connection.Start();
    await proxy.Invoke("JoinRelay", remoteGroup);
}

The client app just needs to know the name of the computer with the camera it wants to control. That's the "GroupName" or in this case, from the client side, the "RemoteGroup." Then it knows the Relay Server URL, like https://foofooserver.azurewebsites.net. The .NET client uses async and await to make the calls non-blocking so the UI remains responsive.
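As a sketch, the two settings the buttons app reads could look like this in its App.config. The key names come from the code above and the URL is the example from the text, but the group name here is a made-up placeholder for the camera machine's name.

```xml
<configuration>
  <appSettings>
    <add key="relayServerUrl" value="https://foofooserver.azurewebsites.net" />
    <!-- Placeholder: the machine name of the computer with the camera attached -->
    <add key="remoteGroup" value="OFFICE-PC" />
  </appSettings>
</configuration>
```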

Here's a bunch of traffic going through the Relay while I was testing it this afternoon, as seen by the Azure Dashboard.

Traffic as shown in a graph on the Azure Dashboard

The client calls the Relay and the Relay broadcasts to connected clients. The Remote Camera Listener responds to the calls. We get the machine name, join the relay and set up two methods that will respond to Move and Zoom.

The only hard thing we ran into (Thanks David Fowler!) was that the calls to the DirectShow API actually have to be on a UI thread rather than a background thread, so we have to get the current SynchronizationContext and post our messages with it. This results in a little indirection but it's not too hard to read. Note the comments.

private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
    var deviceName = ConfigurationManager.AppSettings["DeviceName"];
    device = PTZDevice.GetDevice(deviceName, PTZType.Relative);

    url = ConfigurationManager.AppSettings["relayServerUrl"];
    remoteGroup = Environment.MachineName; //They have to hardcode the group, but for us it's our machine name
    connection = new HubConnection(url);
    proxy = connection.CreateProxy("RelayHub");

    //Can't do this here because DirectShow has to be on the UI thread!
    // This would cause an obscure COM casting error with no clue what's up. So, um, ya.
    //proxy.On<int, int>("Move",(x,y) => device.Move(x, y));
    //proxy.On<int>("Zoom", (z) => device.Zoom(z));

    magic = SynchronizationContext.Current;

    proxy.On<int, int>("Move", (x, y) => {
        //Toss this over the fence from this background thread to the UI thread
        magic.Post((_) => {
            Log(String.Format("Move({0},{1})", x,y));
            device.Move(x, y);
        }, null);
    });

    proxy.On<int>("Zoom", (z) => {
        magic.Post((_) =>
        {
            Log(String.Format("Zoom({0})", z));
            device.Zoom(z);
        }, null);
    });

    try {
        await connection.Start();
        Log("After connection.Start()");
        await proxy.Invoke("JoinRelay", remoteGroup);
        Log("After JoinRelay");
    }
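    catch (Exception pants) {
        // Only a WebException carries an HTTP response body worth logging
        var foo = pants.GetBaseException() as WebException;
        if (foo != null && foo.Response != null) {
            using (var r = new StreamReader(foo.Response.GetResponseStream())) {
                string yousuck = r.ReadToEnd();
                Log(yousuck);
            }
        } else {
            Log(pants.ToString());
        }
        throw;
    }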
}
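The SynchronizationContext hand-off can be tried in isolation, without WPF or a camera. In this sketch PumpContext is a hypothetical single-threaded context standing in for the UI thread, and PostDemo checks that a callback posted from a worker thread really runs on the thread that pumps the queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Minimal single-threaded context standing in for the WPF UI thread (an assumption for this demo).
public sealed class PumpContext : SynchronizationContext
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();

    // Queue the callback instead of running it on the caller's thread.
    public override void Post(SendOrPostCallback d, object state)
    {
        _queue.Add(() => d(state));
    }

    // Runs queued callbacks on the calling thread until Complete() is called.
    public void Run()
    {
        foreach (var work in _queue.GetConsumingEnumerable())
        {
            work();
        }
    }

    public void Complete()
    {
        _queue.CompleteAdding();
    }
}

public static class PostDemo
{
    // Returns true when a callback posted from a worker thread ran on the pump thread.
    public static bool CallbackRanOnPumpThread()
    {
        var context = new PumpContext();
        int pumpThreadId = Thread.CurrentThread.ManagedThreadId;
        int callbackThreadId = -1;

        var worker = new Thread(() =>
        {
            // Toss the work over the fence, as the listener does with magic.Post
            context.Post(_ =>
            {
                callbackThreadId = Thread.CurrentThread.ManagedThreadId;
                context.Complete();
            }, null);
        });
        worker.Start();

        context.Run();   // blocks until the posted callback calls Complete()
        worker.Join();
        return callbackThreadId == pumpThreadId;
    }
}
```

In the listener, SynchronizationContext.Current plays the role of PumpContext and WPF's own message loop plays the role of Run().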

It All Works Together

Now I've got all the parts. Buttons that call a Relay that then call back - through NAT and networks - to the Remote Camera Listener which uses the Camera library to move it.

It's ALIVE and it's awesome

It works like a champ. And, because the buttons are transparent, I can put them over the Lync window and pretend it's all integrated.

TODO: I'm hoping that someone who knows more about Windows Internals will volunteer to create some code that will automatically move the buttons as the Lync Window moves and position them over the video window in the corner. Ahem.

You can set this up yourself, but I haven't gotten around to making an installer or anything. If you have a Logitech BCC950 you are welcome to use my Relay until it costs me something. There's a preliminary download up here so you'd only need the Listener on one side and the Buttons on the other. No drivers are needed since we're using DirectShow itself.

This was great fun, and more importantly, I use this PanTiltZoom System every day and it makes my life better. The best part was that I was able to do the whole thing in C#. From client UI to cloud-based relay to device control to COM wrapper, it was all C#. It makes me feel very empowered as a .NET developer to be able to make systems like this with a minimal amount of code.


About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Comment on this post [33]


October 23, 2012 13:08

Nice stuff. I have used ::SetWindowsHookEx(WH_CALLWNDPROCRET,...) to inject windows into Excel. You may be able to do the same to control the location of your buttons. I had to do the message filtering in C++ as marshalling all messages (50+ per/sec at times) into C# was CPU intensive.

Andy van Stokkum

October 23, 2012 16:51

Cool, but I am not sure how this helps you communicate with others if you are not in the office. I have thought of scenarios, but can't figure out what actually happens in the real world. For example, the camera at your desk, in the screen shots there is no one there so who are you communicating with...and why don't they just use their own camera? What does the pan, zoom, tilt really offer?

Joe C

October 23, 2012 18:03

Joe,

You're more than likely right about that. My guess is that is a conference room of some degree. Personally, I thought the use would probably be in someone remotely monitoring their house for those that are into that sort of thing (I'd probably benefit from it purely because my short term memory is horrid and I find myself driving back to the driveway when I question whether I locked the door after driving 40 ft away).

All in all, as a web-only guy this is incredibly neat to me. I wonder how long before someone does it on Windows 8 (take that, Android@Home).

Scott, you unintentionally answered a question I was thinking about this morning (I think. I believe it is answered, anyway). My knowledge of SignalR is non-existent (currently), but I was actually wondering if SignalR and async competed with each other or complemented each other. Judging from what I got from this, they don't appear to really overlap but you could use them together so the complementing is closer, right?

Robert

October 23, 2012 21:55

Joe - not a conference room, a shared working space with a dozen whiteboards, although any large conference room would also find this useful. No good wide angle lenses exist for webcams so you can't see more than 3-4 people in a conference room filled with 15 to 20. When you add in whiteboards the only way you can SEE everyone and everything is by turning your head. Or a camera.

Scott Hanselman

October 23, 2012 22:01

Robert - yes, unrelated but complementary. It's easier to work with asynchronous protocols using language features that hide asynchronous complexities.

Scott Hanselman

October 24, 2012 4:10

amazing .... :-)

salem albadawi

October 24, 2012 15:56

Great~

Ray

October 25, 2012 1:03

Scott, now imagine having an App with the camera's feed where you tilt the image by swiping the screen with your finger. Nice work!

Carlos

October 26, 2012 10:45

Awesome work Scott and Co.

Any way that one can use Skype instead of Lync, as our business does not have a Lync Server?

Specifically, how would one replace the Lync references in the UI Suppressed or Non-Suppressed Auto Answer App with Skype references?

Jacques Coertze

October 27, 2012 1:12

Great stuff!
I have one question though. Why is it necessary for your machine (the one with the WPF app) to join the relay? I would have thought it's enough just to call the zoom/tilt methods with the right group name? SignalR then only sends the command to the computer with the camera.
Sorry if I'm asking something obvious but it currently doesn't make much sense to me :)

Bine

October 27, 2012 1:32

Only three days out from a percent coincidence! :o)
http://dilbert.com/strips/comic/2012-10-26/

Richard

October 28, 2012 19:05

AWESOME :D

Sebastian Huppmann

October 29, 2012 12:52

Mind blowing!!

lisaflorence

November 02, 2012 4:31

No good wide angle lenses exist for webcams so you can't see more than 3-4 people in a conference room filled with 15 to 20.

Hi Scott, I am also a remote worker in Melbourne, Australia and had found similar issues being part of meetings. For a long time we had a person in the conference room responsible for manually aiming a webcam, which wasn't ideal. Earlier this year though I found the Genius WideCam 1050 webcam which has a 120 degree wide angle lens. It has made attending meetings much better. We sit it at the end of the table and it's like sitting at the table. I can see everyone.

The main drawback is that everything in the image is much smaller. This has two consequences. First, as you pointed out, being able to see whiteboards is a problem. We get around this by using oovoo for the video and adding a laptop with another webcam just on the whiteboard.
The second is that you can't see as much detail in people's faces, but I think seeing everyone outweighs this. I'd like to try the F100 Full HD version to see if that improves things, but it doesn't seem to be available in Australia.

Also, if you find that you're using GraphEdt.exe much, like I do, then you might be interested in monogram graphstudio which is an open source version of the GraphEdit tool with heaps of extra handy features added.

Nate

Nate

November 05, 2012 13:49

Your window position tracking code: Gist 4016340. Works on my machine (if start a video call it shows and tracks the window, if I end the call it hides).

I would have preferred to do it with hooks (push vs. poll), but you need to delve into C++ to do that and I didn't feel like it. So the code uses P/Invoke to figure out if the window we want is active (once a second) and once it finds one, it figures out where it is (every 100ms). I went and did this for Lync, at least as far as my version is involved. If you need to hack it yourself for your version grab Window Detective (drag the "Pick Window" toolbar button to the Lync window) and figure out what gets updated when a video session is active. For my version the *last* window (vis-à-vis control) with the class "CtrlControlSink" that contains a "LCC_VideoParent" is shown (WindowStyles.Visible). Hopefully my P/Invoke wrapping efforts make it obvious.

Jonathan Dickinson

November 13, 2012 13:20

I cannot run the preliminary download or compile the source. I suspect that is because I am running Windows 7 64 bit. What are you running?

But I have a bumblebee question. When I look at the UVC 1.1 specs and descriptors for the BCC950 there seems to be a mismatch concerning pan relative and tilt relative. Only the latter is checked in the BCC950 descriptors. So my question is how does your bumblebee fly?

archn

November 13, 2012 21:17

Cool post!

+1 for the Genius WideCam 1050. We have several of them throughout our offices and they work great. As Nate said, they are more for the feeling of "being there" than reading whiteboards, but they do improve the experience quite a lot. One thing we don't like about them is the integrated microphone, so we actually use the CadAudio U7 microphone for sound.

Adrian Hara

November 13, 2012 21:40

archn - Can you give me more details on how you are trying to compile it? It compiles straight on Windows 7 and 8, 64bit. Make sure you read the readme, you'll likely need to get the GPL'ed library and put it in the lib folder.

Scott Hanselman

November 14, 2012 14:36

Great work! But you have forgotten kinect!

Think about using the windows kinect sdk. With audio and skeleton recognition you can determine the speaker and where he sits and then move the camera to him automagically.

Or you can implement person buttons in your remote control and just click on the person you want to see.

I'm sure one day you will present us with an update of this project :-)

MikeH

November 20, 2012 23:12

Great work Scott!

Any chance you could release a compiled version of the basic applet that does the PTZ for us non-coder type remote workers?

Max

November 25, 2012 4:22

Nice article. The only thing is, for me, I'm looking for a camera with optical zoom in addition to pan and tilt functionality.

Will this solution work with any PTZ camera or does it have to have some specific compatibility requirements?

sam s

November 28, 2012 15:45

Hello Scott, I downloaded the source and tried to compile it with Visual Studio 2008. I notice it requires .NET Framework 4.0 or later, right? I got a compile error on HasFlag, so I searched the web for a workaround for earlier framework versions and finally changed it to return ((supported.GetType() != KSPropertySupport.Set.GetType()) && (supported.GetType() != KSPropertySupport.Get.GetType())); After that it compiled. When I tested it on a Windows 7 PC, it threw NotSupportedException "This camera doesn't appear to support Relative Pan and Tilt". I use the same webcam, a Logitech BCC950. I also downloaded your compiled version, and it works. Can you help me please? Thanks

Michael

November 29, 2012 0:53

Michael - There is a compiled version at the Downloads tab and yes it needs .NET4.

Scott Hanselman

November 29, 2012 11:38

I tested your compiled version on another Windows 7 PC, and it also throws NotSupportedException "This camera doesn't appear to support Relative Pan and Tilt". Do you know what is missing or what else is required for it to work besides the webcam?

Michael

December 04, 2012 18:34

Scott, thanks for another great tool in the virtual presence arsenal. We just implemented it for our remote employee and it works great! We're still trying to get your auto-answer working with Lync 2013 so we're currently using Skype. It works really well that you implemented the control as an overlay so that it doesn't matter what video chat program is being used.

Thanks again!

Shawn

Shawn Riesterer

December 04, 2012 18:58

Heck yeah, this is awesome. Might tweak it to add in a "hold" ability on the buttons, and maybe some presets to focus on different areas quickly, but this is a great start! Thanks again!

Brandon Martinez

September 26, 2013 5:38

Great to see you have done this. I remain surprised that remote control of these cameras in video conferencing systems is not considered a must-have standard feature, rather than something we have to hack together.

If you do decide to improve it, may I suggest making it so that it doesn't need a remote client, but can run from a browser, relaying actual https to the relay server or the camera. That way people could control from all platforms with no install. I realize the camera needs to be in windows to use your code base and the APIs you found.

Brad Templeton

September 26, 2013 22:41

Brad - Yes, I've already done that in the same codebase. You can call it from a browser or with JavaScript. Example code here.

Scott Hanselman

November 06, 2013 21:23

Dear Scott,

I'm playing around with my BCC950 and your c# code, but I'm stuck on the xml.config file (for the PTZControl program) where you describe the Device to be connected to.

Just
<appSettings>
<add DeviceName = "BCC950 ConferenceCam" />
</appSettings>

is not working. You have a clue?

Thanks & Cheers Enrico

Enrico

November 07, 2013 0:19

Hi Scott,

I'm running into an unusual problem... I have a BCC950 but none of my code, your code or Logitech's PTZ example code seems to work. For example, your PTZDeviceConsole.exe gives this error:

C:\Users\Ian\Desktop\PanTiltZoomSystemv0.9\PanTiltZoomSystemv0.9\ConsoleTester>P
TZDeviceConsole.exe

Unhandled Exception: System.NotSupportedException: This camera doesn't appear to
support Relative Pan and Tilt
at PTZ.PTZDevice..ctor(String name, PTZType type)
at PTZ.PTZDevice.GetDevice(String name, PTZType type)
at PTZDeviceConsole.Program.Main(String[] args)

Any clue what might be causing this error? Essentially what I've found is that my system recognizes the camera, but any attempt to get() or set() camera properties fails.

For Example:(SupportFor(KSProperties.CameraControlFeature.KSPROPERTY_CAMERACONTROL_PAN_RELATIVE)

I'm starting to think it's something firmware related... though I haven't found any tools for flashing the firmware.

Ian

November 07, 2013 20:12

I'm testing the console application and it only responds to my left and right arrow keys. It does not respond to UP/DOWN at all.

Any clues?

Brian

November 08, 2013 18:31

in my case, the camera is just responding to zoom.

Enrico

November 13, 2013 21:53

Hi Scott, is there a way this could work independently of Lync? I use Vidyo conferencing software and would love for the people on the other side to be able to control my camera (Logitech BCC950) remotely. Would there be a way to have it work independently of Lync? Or perhaps use some components of Lync so as to be able to use it on Vidyo?

It would be really good to have that, as the lack of controls is what is preventing me from buying the BCC950. Thanks!

Luis
