Tuesday 28 February 2012

Gesture Controls for Natural User Interfaces

I've been working on a few projects and have returned to my Windows Media Player Natural User Interface. The newest version is very much a learning experience, and while it is still fairly untested, there is much I can share. I thought I would start with my thoughts on hand gestures for the media player.

My first thought was to greatly limit the controls available to hand gestures, to keep things as simple as possible. I decided on play/pause, fast forward, rewind, and volume up and down. To turn on the "virtual remote", the user would raise their hand to about shoulder height and press forward. Once the remote is activated, the coloured button is the selected one, and the user can press forward to activate its control. To dismiss the remote, the hand is lowered.

Originally I decided to control the remote much like a mouse. The hand would essentially be a pointer: you'd hover over a button and press to select. This quickly proved difficult. A major issue was that people tended to push with their elbow, causing the hand to drop so that the button below (or no button at all) was selected. I quickly decided to abandon the "mouse-like" controls.

I've named the method we use in the newer version "bump" style selection. A user selects a button by bumping their hand in whichever direction it lies, which removes the possibility of selecting the wrong button while pressing.
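
Roughly, the bump detection can be sketched like this (an illustration of the idea rather than the project's exact code; the limit and smoothing factors are placeholders that would need tuning):

        //smoothed frame-to-frame deltas for the right hand
        float x_avg = 0, y_avg = 0, z_avg = 0;
        float x_last = 0, y_last = 0, z_last = 0;
        const float bump_limit = 0.04f;   //metres per frame, picked for illustration

        //returns which way the hand just bumped, or "" for no bump
        string CheckBump(float x, float y, float z)
        {
            //exponentially smooth the deltas so one noisy frame can't fire a bump
            x_avg = x_avg * 0.7f + (x - x_last) * 0.3f;
            y_avg = y_avg * 0.7f + (y - y_last) * 0.3f;
            z_avg = z_avg * 0.7f + (z - z_last) * 0.3f;
            x_last = x; y_last = y; z_last = z;

            //the Kinect's z value shrinks as the hand moves toward the sensor,
            //so a strongly negative smoothed delta is a forward press
            if (z_avg < -bump_limit) return "FORWARD";
            if (x_avg > bump_limit) return "RIGHT";
            if (x_avg < -bump_limit) return "LEFT";
            if (y_avg > bump_limit) return "UP";
            if (y_avg < -bump_limit) return "DOWN";
            return "";
        }

Because a bump is a sharp movement in one direction, whichever axis crosses the limit first wins, and simply hovering near a button can't select anything.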

With more testing we noticed a few more issues. Lowering the hand would always select volume down, people wanted to control the volume more easily, and there was excessive motion involved in selecting fast forward or rewind.



I decided that no one would select fast forward or rewind without wanting to press it. Previously, to fast forward, a user had to select fast forward, press fast forward, select play, and press play. It seemed easier to have selecting fast forward immediately activate it and move the selection back to play. Now a user just bumps their hand to the right to fast forward, then presses forward to resume playing, cutting the number of steps in half.
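
As a sketch of the handling (still illustrative, reusing the CheckBump helper above and the axWindowsMediaPlayer control from the earlier post's code):

        string selected_button = "PLAY";

        void HandleBump(string bump)
        {
            switch (bump)
            {
                case "RIGHT":
                    //selecting fast forward starts it immediately and leaves the
                    //selection sitting on play, so a single press resumes playback
                    axWindowsMediaPlayer.Ctlcontrols.fastForward();
                    selected_button = "PLAY";
                    break;
                case "LEFT":
                    axWindowsMediaPlayer.Ctlcontrols.fastReverse();
                    selected_button = "PLAY";
                    break;
                case "FORWARD":
                    if (selected_button == "PLAY")
                        axWindowsMediaPlayer.Ctlcontrols.play();
                    break;
            }
        }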

Finally, I made the volume controls a slider: a user bumps up, moves their hand left and right to set the level, and bumps down to finish.
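
In the same sketchy style (the mapping range here is made up and would need tuning):

        bool volume_slider_active = false;

        void UpdateVolumeSlider(string bump, float hand_x)
        {
            if (bump == "UP") volume_slider_active = true;      //bump up to grab the slider
            if (bump == "DOWN") volume_slider_active = false;   //bump down to let go

            if (volume_slider_active)
            {
                //map roughly half a metre of hand travel onto WMP's 0-100 volume range
                int volume = (int)((hand_x + 0.25f) * 200);
                if (volume < 0) volume = 0;
                if (volume > 100) volume = 100;
                axWindowsMediaPlayer.settings.volume = volume;
            }
        }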

These controls are incredibly slick and easy to use. The newest issue is that people don't like keeping their hand raised while using the controls, and will constantly lower it while deciding which button to use. But that is a problem for another time.

(The newest version has also been ported over to WPF for a slicker design.)

Anyone interested in the code for the remote can see it here.

Friday 22 July 2011

Windows Media Player NUI

**WARNING** If you would like to try out this software, you may need to change this line:
            //set WMP source
            axWindowsMediaPlayer.URL = @"C:\Users\Public\Videos\Sample Videos\Wildlife.wmv";

That is where a default windows video is located on my computer, but may not be on yours. You'll need to change it to a file on your computer. 
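If you'd rather not hard-code a path at all, something like this in the form's load handler lets you pick the video instead (a small sketch using the standard OpenFileDialog; it isn't part of the original download):

            using (var dialog = new System.Windows.Forms.OpenFileDialog())
            {
                dialog.Filter = "Video files|*.wmv;*.avi|All files|*.*";
                if (dialog.ShowDialog() == System.Windows.Forms.DialogResult.OK)
                {
                    axWindowsMediaPlayer.URL = dialog.FileName;
                }
            }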

Demo 1
Demo 2/Tutorial

I've been interested in Natural User Interfaces for some time and often frequent websites such as NUI Group. With the release of the Kinect SDK I decided to give making a Media Player NUI a shot, and released a demo a few weeks ago. Since then Mark and I have made some changes, so I thought I'd do an update, show some code, and let others download and try it. We intend to make another demo soon showing the new features. This code requires having the Kinect NUI and Audio SDKs operational, as well as the Windows Media Player SDK.

We started with a Windows Forms Application in Microsoft Visual C# 2010 Express, put a media player control and a timer on the form, and followed all the setup required for the Kinect.
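
For reference, the Kinect side of that setup boils down to something like this (a minimal sketch assuming the beta SDK's Microsoft.Research.Kinect.Nui namespace, which the rest of this post uses; Form1_Load is just the designer's default load handler name):

        Runtime nui;

        private void Form1_Load(object sender, EventArgs e)
        {
            nui = new Runtime();
            //skeletal tracking is all this project needs from the sensor
            nui.Initialize(RuntimeOptions.UseSkeletalTracking);
            nui.SkeletonFrameReady += new EventHandler<SkeletonFrameReadyEventArgs>(nui_SkeletonFrameReady);
        }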

The meat of the code is here.

        const int STOP_TIMER = 10;
        int timer = 10;

        void nui_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            //reset watchdog timer that stops the video from playing
            timer = STOP_TIMER;
        }

        private void timer1_Tick(object sender, EventArgs e)
        {
            //count down; once no skeleton has been seen for a while,
            //re-enable voice commands and pause the video
            timer--;
            if (timer < 0)
            {
                voice_enabled = true;
                axWindowsMediaPlayer.Ctlcontrols.pause();
            }
        }

We declared a variable that constantly counts down using a timer. The variable is reset whenever the Kinect fires the nui_SkeletonFrameReady event, which is whenever it thinks it sees a skeleton. The counter will only reach 0 when the Kinect is tracking nothing, at which point we enable voice commands and pause the media player.

Essentially, what we wanted was that if someone had to run off to answer the door, get the phone, or save a life, the video they were watching would pause for them without them needing to find a remote. This implementation currently pauses once no one is watching the video, but it could be adapted to pause whenever anyone leaves.
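
For the curious, the "pause if anyone leaves" variant only needs a small change, something along these lines (a sketch rather than code from the download; in practice you'd want to run it through the same watchdog countdown so a brief tracking glitch doesn't pause the video):

        int last_tracked_count = 0;

        //called from nui_SkeletonFrameReady with e.SkeletonFrame
        void CheckForLeavers(SkeletonFrame skeletonFrame)
        {
            int tracked_count = 0;
            foreach (SkeletonData data in skeletonFrame.Skeletons)
            {
                if (data.TrackingState == SkeletonTrackingState.Tracked)
                    tracked_count++;
            }

            //fewer tracked skeletons than last frame means someone just left
            if (tracked_count < last_tracked_count)
                axWindowsMediaPlayer.Ctlcontrols.pause();

            last_tracked_count = tracked_count;
        }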


Another feature we created was a simple, hand-controlled on-screen remote. There are no preference controls for it currently; it is set to what I thought was comfortable. We tried multiple methods for creating an on-screen remote, such as having the mouse follow the right hand, but found other methods easier.


        public float z_last = 0;
        public float z_avg = 0;
        public float delta_z = 0;
        public float z_current = 0;

        void nui_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
        {
            SkeletonFrame skeletonFrame = e.SkeletonFrame;
            foreach (SkeletonData data in skeletonFrame.Skeletons)
            {
                if (data.TrackingState == SkeletonTrackingState.Tracked)
                {
                    z_current = data.Joints[JointID.WristRight].Position.Z;
                    delta_z = z_current - z_last;
                    if (data.Joints[JointID.WristRight].Position.Y > data.Joints[JointID.Spine].Position.Y)
                    {
                        z_avg = (float)((z_avg * 0.7) + (delta_z * 0.3));
                    }
                    else
                    {
                        z_avg = 0;
                        //Part of the bump version of the remote
                        pictureBox1.Visible = false;
                        remote_enabled = false;
                    }
                    z_last = z_current;
                    .
                    .
                    .
We actually track the x, y, and z movements of the right wrist, but I thought I'd only explain one axis. We keep track of the wrist's current position and last position, and find its change in position (delta). If the wrist is above the spine, we then calculate a running average of its movement. Doing this, we can check whether its movement exceeded some limit, which we use for the various remote controls.
                    if (z_avg < -remote_enabled_limit && !remote_enabled)
                    {
                        remote_enabled = true;
                        remote_enabled_timer = -50;
                        remote_option = "PLAY";
                    }

Here we check whether the average z movement (forward and backward) exceeds the limit that enables the remote; movement toward the sensor is in the negative z direction. Essentially, holding your hand at shoulder level and pushing forward with reasonable speed will bring up an on-screen remote, whose buttons can then be controlled with x, y, and z gestures.


The remote has volume up, volume down, fast forward, rewind, play, and pause. By bumping up or down you can select the volume controls.

To change the volume, you smoothly press forward and the volume changes rapidly; you do not need to constantly shake your hand forward and back, though you may if you like.
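
As a rough illustration of the smooth press (a guess at the shape of it rather than the project's actual code; the "VOLUME UP" option name and the 0.35 m reach are assumptions): while the volume button is selected and the wrist is held well in front of the shoulder, the volume gets nudged a little every skeleton frame.

                    //inside nui_SkeletonFrameReady, after remote_option has been set
                    float reach = data.Joints[JointID.ShoulderRight].Position.Z
                                - data.Joints[JointID.WristRight].Position.Z;
                    if (remote_option == "VOLUME UP" && reach > 0.35f)
                    {
                        int volume = axWindowsMediaPlayer.settings.volume + 2;
                        axWindowsMediaPlayer.settings.volume = volume > 100 ? 100 : volume;
                    }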

Fast forward and rewind are executed automatically when the hand bumps right or left. I assume that if you are selecting fast forward or rewind, it is because you would like to use them, not because you want to hover your hand over the button.

Play and pause are briefly disabled (for less than a second) so users don't accidentally trigger them right after other buttons. They can be used by pressing forward on the button with a reasonable amount of movement.
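
The lock-out itself is just a small countdown, something like this (the names and the 20-frame length are illustrative):

        int play_lockout = 0;   //counts down once per skeleton frame

        //called every skeleton frame
        void TickLockout()
        {
            if (play_lockout > 0) play_lockout--;
        }

        //called when any button other than play/pause fires
        void ArmLockout()
        {
            play_lockout = 20;   //roughly two thirds of a second at 30 fps
        }

        //called when the user presses forward on play/pause
        void PressPlayPause()
        {
            if (play_lockout > 0) return;   //ignore presses right after another button
            axWindowsMediaPlayer.Ctlcontrols.pause();   //or play(), depending on state
        }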

The remote disappears once the hand is below the spine... I should probably change that to waist as it sometimes disappears when I select volume down.


The program also allows voice commands. We've done nothing special with the voice commands; we learned how to do them straight from the Kinect demo. Current commands include "play", "rewind", "pause", "stop", "fast forward", "volume up", "volume down", "up", "down", "mute", "fullscreen", "maximize", "remote".
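
For anyone curious what "nothing special" means, the grammar setup from the Kinect speech sample boils down to roughly this (condensed; picking the Kinect recognizer and feeding it the microphone array stream with SetInputToAudioStream and RecognizeAsync are left out here):

        //using Microsoft.Speech.Recognition;
        void SetUpSpeech()
        {
            var commands = new Choices("play", "pause", "stop", "rewind", "fast forward",
                                       "volume up", "volume down", "up", "down",
                                       "mute", "fullscreen", "maximize", "remote");

            var engine = new SpeechRecognitionEngine();
            engine.LoadGrammar(new Grammar(new GrammarBuilder(commands)));
            engine.SpeechRecognized += (s, args) =>
            {
                //args.Result.Text holds the spoken command, e.g. "pause"
            };
        }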

Voice commands are not enabled while the video is playing unless a gesture enables them.
Mark and I found gestures rather difficult to write (which may become our next adventure into the Kinect world) and have written only one gesture this complicated.
                    //compare joints :(

                    //check for hands below head and above shoulders
                    if ((data.Joints[JointID.HandLeft].Position.Y < data.Joints[JointID.Head].Position.Y) && (data.Joints[JointID.HandLeft].Position.Y > data.Joints[JointID.ShoulderCenter].Position.Y))
                    {
                        if ((data.Joints[JointID.HandRight].Position.Y < data.Joints[JointID.Head].Position.Y) && (data.Joints[JointID.HandRight].Position.Y > data.Joints[JointID.ShoulderCenter].Position.Y))
                        {
                            //check hands are bent inwards
                            if ((data.Joints[JointID.HandLeft].Position.X > data.Joints[JointID.ElbowLeft].Position.X) && (data.Joints[JointID.HandRight].Position.X < data.Joints[JointID.ElbowRight].Position.X))
                            {
                                //check elbows are below shoulders
                                if ((data.Joints[JointID.ElbowLeft].Position.Y < data.Joints[JointID.ShoulderCenter].Position.Y) && (data.Joints[JointID.ElbowRight].Position.Y < data.Joints[JointID.ShoulderCenter].Position.Y))
                                {
                                    //Yay we did it!
                                    voice_enabled = true;
                                    voice_enabled_timer = 0;

                                }
                            }
                        }
                    }
This is the messy code we wrote to check, essentially, that the user has put a hand on both sides of their mouth, similar to what one would do when shouting over a distance. This enables voice commands while the video is playing, though only for a short time. While the video is not playing, however, voice commands can be used without this gesture.


Known Issues:
There were errors on closing the program. I have not seen them in some time and kept putting off fixing them. I've done nothing to fix them, so I assume they are still there.
Fullscreen hides the remote, but leaves it operating.
Threading makes things difficult to control; it's a feature, not a bug.
Voice commands are not as top notch as one may hope.

Required Software:
Kinect Audio SDK
Kinect NUI SDK
Windows Media Player SDK
Our code

Sunday 3 July 2011

Xbox Kinect SDK

So the Xbox Kinect SDK was released a little while back over at Microsoft Research and is quite interesting. I've been playing around with it and learning some C# at the same time.

The SDK provides audio, depth, and skeletal tracking, among other things, and comes with a plethora of demos. Simply put, one can have two actively tracked skeletons with joints labelled with X, Y, and Z axis values, as well as values indicating how reliable each joint is, parameters for smoothing the animation, and much more.

For my first project, a friend of mine (Mark Arnott) and I messed around just testing each data field and putting the values on the screen. We managed to make it understand some gestures, such as pushing a hand towards the display, and enlarging, shrinking, and moving a picture using the positions of and distance between our hands.

I've thought for some time that a NUI would be great for a media player. I've downloaded the WMP SDK, and Mark and I have managed to create some NUI controls for WMP.

Exiting the frame will cause the media player to pause.
Saying "Rewind" when paused will cause the media player to rewind.
Saying "Play" when paused will cause the media player to play.

We are still considering other controls that would be good, such as proper gestures for volume, and maybe a control to rewind 3 seconds and resume. The main idea is that when the phone rings, or you want to go to the bathroom, or anything else comes up, it is mildly annoying to find the remote, push the pause button, and then continue with what you wanted to do. Also, when no one is watching the TV or a movie, does it really need to keep going?

Check out our demo here.

Friday 11 March 2011

Side Projects

It makes complete sense, but I've only recently discovered the importance of side projects, so I thought I would share some thoughts on them for fellow students.

When you go looking for a co-op, internship, or job, you may be asked about things you've worked on. Although school projects demonstrate how well you can perform at assignments, some companies are equally interested in how well you perform at things you are interested in. Working on your own projects is great study, and a great way to practice and hone your skills. If your interest is making add-ons for Firefox, or reading your dog's brain waves, give it a shot; it can't hurt.

I was lucky to have a genuine interest in side projects, and if you want to do well in the computer science world, or any field really, you must develop a genuine interest in your field. In your eyes, side projects can become really fun work that shows off your skills. In your employer's eyes, they are a clear representation of your interests and of how you can be effectively utilized within a company.

Friday 4 March 2011

Xbox Kinect

I've always been very interested in Natural User Interfaces. I've frequented the NUI Group, and have studied various ways people have created and used NUIs.

Recently I purchased a Kinect, and for anyone interested in playing around with it before Microsoft releases its drivers and SDK, there is a great tutorial over at Brekel for setting things up with temporary drivers. Anyway, I've been very interested in following people's Kinect projects as well as Microsoft's development. I've talked to professors about doing a course on NUIs, and it appears they may try it next semester.

I've been planning to work on an interesting but simple project with the Kinect, so I thought I'd outline an idea I have here...

I think this is a great idea, and if it's not being worked on already, I'm sure it will be soon. I want to create a NUI for a media player. I've chosen a media player because the controls are quite simple, and also because it is something people don't necessarily want to use an interface for. Let me paint you a picture.

You are watching TV; there is a knock at the door, the phone rings, or you've gotta pee, so you get up and deal with it. Because you got up and walked away, your TV pauses its program. You return, wave your hand, and it starts again, rewinding 3 seconds.

You didn't need to find the pause button, and you didn't need to rewind when you got back; you just got up, left, returned, and kept watching.

I intend to do quite a bit of research on what people think is "Natural". I'd also like to include American Sign Language.

Controls would include:
  • Play
  • Pause
  • Stop
  • Fast Forward
  • Rewind
  • Skip Forward
  • Skip Back
  • Volume
I would possibly include a menu and navigation system, as well as file navigation, but for the groundwork it would be tailored to the simple media controls.

NUIs definitely have quite a future ahead of them, and I intend to get on the wagon early. I highly recommend that any student with even the slightest interest in creating a NUI pick up a Kinect, especially once Microsoft releases its drivers/SDK this spring.

Have fun!

Wednesday 23 February 2011

Starting a website portfolio

As a computer science student looking for an internship, I decided I needed to create a website. My friends urged me to create a very simple website to get my name out there and then expand on it as time went on. While I agree with them and would give the same advice to others, I knew I would prefer to start with something visually and technically interesting to me. My website is located at www.mathieu-lessard.com.

I've been working with jQuery a bunch lately and would highly recommend it to anyone who would like to make their website a little more interesting. My website is some JavaScript, images, CSS, and one HTML file. Anyone who knows how to take the code is welcome to it. However, it doesn't come with documentation on how to add and remove pages...


So, what should a portfolio website contain? I think there are key areas that need to be addressed on your website.

1. Who are you?
On your website portfolio, tell us who you are. Are you married? Do you like to cook?
Prospective employers want to know about you, so let them know who you are. Make sure every area of your portfolio radiates the type of person you are.

2. Your experiences.
Have you gone to school? What did you study? Do you have any projects you've worked on? What were they?
Your portfolio should show what you are capable of. A great way of showing what you are capable of is showing what you have done. Show projects that you loved working on. I say "loved" because if you show projects you've done but despised, you are advertising yourself for a position you don't want.

3. Endorsements/References.
If you have comments from people about your work, try and display them. If you've received a thank you card for volunteer work, show it. If you've won a competition, we want to see it!

4. What do you do?
This is a very important area. Employers want to know what you do. Do you work on techie projects on the weekend? Do you contribute to some open source software? Do you edit Wikipedia?
It's very important to talk about what you've done and what you've studied, but a great example of what you are learning is what you are doing. Try and have some project that you work on for fun. For me, I've been very interested in Natural User Interfaces and will eventually post some stuff about that.


Best of luck! Remember, after you've made a website portfolio, use it! Get feedback on it, fix it, use it again. Good luck finding those positions!