CV4HCI Documentation
May 2, 2006
CV4HCI is a Computer Vision library adapted to the needs of Human-Computer Interaction applications. Currently, it provides simple C++ and Java
interfaces to OpenCV's CAMShift and Foreground Detection algorithms. It was built for the benefit of the students taking the Human-Computer
Interaction course given by Professor Jeremy Cooperstock at McGill University:
http://cim.mcgill.ca/~jer/courses/hci/.
Getting Started
CV4HCI has been tested under Linux (Fedora Core 4) and Windows 2000/XP. You will, however, need a few freely downloadable software
tools that might not already be installed on your system:
*** VERY IMPORTANT: The current distribution of OpenCV (0.9.7 beta 5) does *NOT* work
with GCC 4.0. You will have to edit the configure script and manually change "-O3" to "-O3 -fno-strict-aliasing" before
configuring and compiling. GCC will give *NO* warning whatsoever that something is wrong (it is a bug in
GCC), but the result will *NOT* work, so you have been warned. Also, if you are running under Windows, you will
need a Unix-like environment such as MinGW or Cygwin.
The difference between them is that Cygwin provides a complete Unix-like environment, whereas MinGW provides only the
GNU tools (GCC, Make, GDB, etc.) necessary to develop software under Windows. It is possible to use Microsoft Visual C++ or other
compilers and runtime libraries, but the included Makefile was not written for them; you will have to create the appropriate
Makefiles or project files.
After installing the necessary tools, edit the Makefile of CV4HCI. At the top, provide the paths to swig, javac, jar,
javadoc, doxygen, and the Java include directories if they are not in your system search path. Under Windows, you will also
have to provide the path to the root directory of OpenCV, and you might have to define the variable WIN32 in the Makefile or
on the command line ("make WIN32=1") in the event that the Windows autodetection procedure fails. That's it. You can then run
"make" (note: the executable is named "mingw32-make" under MinGW), and everything will be built. Once built, the following
files should be in your directory:
- libCV4HCI.so (Linux) or CV4HCI.dll (Windows): The C++ library containing the code from CAMShiftTracker,
ForegroundDetector, and SimpleManeuveringFilter. The documentation for these classes is found within the
Doxygen generated documentation.
- CV4HCI.jar: The Java library containing the classes and resource files from ca/mcgill/cim/sre/cv4hci/*.
The documentation for these classes is found within the Javadoc generated documentation.
The main method called on execution ("java -jar CV4HCI.jar") is found in
SampleControlFrame.
SWIG is used to provide Java with access to classes and functions from the C++ library, but it does not work the other way around.
- C++Sample (Linux) or C++Sample.exe (Windows): Compiled executable of C++Sample.cpp, dependent on
libCV4HCI.so or CV4HCI.dll. It is a sample file demonstrating how to use OpenCV and the CV4HCI library in C++.
There is no installation procedure yet, so simply place these files where appropriate. In Linux, you will need to add the current
directory to the library search path ("export LD_LIBRARY_PATH=." under bash) so that CV4HCI.jar and C++Sample can
"find" libCV4HCI.so there.
A quick description of the main classes and their purposes:
- CAMShiftTracker (C++): CAMShift is a color-based object tracker. This class
provides an easier, higher-level interface than the one in OpenCV.
- ForegroundDetector (C++): This segments an assumed foreground object from
a mostly static background. As above, this wraps functionality from OpenCV. If only one foreground object is detected,
its position, size, and angle are also computed automatically. Otherwise, the returned information reflects the average
position, size, and angle of the multiple objects.
- SimpleManeuveringFilter (C++):
A simple Kalman maneuvering filter to smooth out measurement noise easily and reasonably well.
- CAMShiftPanel (Java):
A View/Controller panel for CAMShiftTracker with all the widgets to control and visualize the state of an instance.
- ForegroundDetectorPanel (Java):
A View/Controller panel for ForegroundDetector with all the widgets to control and visualize the state of an instance.
- VideoCapture (Java): VideoCapture is a wrapper class for the
HighGUI capture API of OpenCV. With it, you can easily capture
images from your Webcam or from video files. Note: In Windows, it is limited to what Video for Windows can do, so it will
not work with DCAM and DV cameras, among others. In Linux, it can only access cameras through V4L or libdc1394.
- Brick,
BricksFrame,
TransformedShape (Java):
A sample application of what can be done with CAMShiftTracker. Implements a mini Bricks-like
drawing board. See Bricks, a Graspable User Interface, by George Fitzmaurice:
http://www.dgp.toronto.edu/%7Egf/Research/Graspable%20UI/GraspableResearch.htm
Usage
This library is intended to be used at the API level, but to understand how the computer vision software works, run
the SampleControlFrame:
In this frame, you will see two CAMShiftPanels and one ForegroundDetectorPanel. You can start video capture by selecting the
Control->Run command from the menu bar. If you're lucky, it will start capturing from your camera. If not, read the Technical
Issues section for advice on how to get your camera working. It is also a good idea to manually and correctly adjust your
camera's settings. Although automatically adjusted cameras should perform adequately, it is always better to control their
settings manually, if possible. Again, refer to the Technical Issues section for solutions.
From a CAMShiftPanel, you can select a region of the image using your mouse. After releasing the mouse button, the color
pixels found within this selection are analyzed and placed in a hue (from the HSV colorspace) histogram that you can visualize
by switching the View to Histogram. The tracker will then attempt to track an object of this color. If Draw
Ellipse is checked, you will see an ellipse around the tracked object. Your camera might not be set up just right and
might not segment the colors well (see Technical Issues for solutions on how to adjust your camera's settings), or your object
might not be colorful enough. Ambient lighting might also be too low for your camera: Turn on more lights. It is also
sometimes possible to cope with such situations by adjusting the parameters. First switch the View to
BackProject, adjust the following parameters, and observe the effects:
- Vmin: The minimum V value (from the HSV colorspace) a pixel needs to have to be considered. It is wise to use bright
and fluorescent objects and set Vmin to a high value.
- Vmax: The maximum V value (from the HSV colorspace) a pixel can have to be considered. Low values can help in the case of
"bleached out" colors.
- Smin: The minimum Saturation value (from the HSV colorspace) a pixel needs to have to be considered. High values might help
segment out pastel colors.
- Min Area: The minimum area (adding up pixel values from the "BackProject") a tracked object has to have. If
this minimum is not reached, the tracker will try to latch on to something (else) in the image.
Once good results are achieved, you can start Mini Bricks from the Toys menu and see what kind of
interaction this system has to offer. First, set the Pick-Up Area and Put-Down Area to
values that reflect the readings in the status bar on top when you bring the object closer to the camera
and when you move it farther back. Picking up the brick is then equivalent to a mouse click. Now move the brick cursor on top of
one of the shapes. Pick up the brick, and you will see the brick make a copy of that shape and stay attached to it. You can move
it around and rotate it. Once a desired position is reached, you can pick up the brick once more, and it will let go of the
shape. Also, if you have both bricks attached to the same shape, the shape can be scaled.
The ForegroundDetector is similar in results to the CAMShiftTracker, but it assumes there is only one object in front of the
camera, while the rest is static background. It needs a few seconds of static background to properly initialize after a
Reset. You can then place an object in front of the camera to see the effect. The default parameters are OK; if you
want to modify them, be aware that they are a bit complicated and not detailed here. For more information, see the related paper on
this subject: Liyuan Li, Weimin Huang, Irene Y.H. Gu, and Qi Tian, "Foreground Object Detection from Videos Containing
Complex Background", ACM MM2003, http://portal.acm.org/citation.cfm?id=957017.
Once you understand the underlying computer vision concepts, you can start to look at the various samples (C++Sample.cpp,
Brick.java, BricksFrame.java, TransformedShape.java, and SampleControlFrame.java) and start to build your own applications
around the rest of the classes in the same manner.
Technical Issues
Problems generally fall into one of two categories: either you cannot get your camera to capture at all, or you have problems with its settings.
Capturing Issues
In Windows, there is not much that can be done. If your camera works in other applications, the problem is that
HighGUI's capture API uses Video for Windows, which is quite limited. One would need to use DirectShow to properly
capture images from almost all cameras that have Windows drivers. CVCAM (part of OpenCV) uses DirectShow, but it is quite buggy;
try it at your own risk. Either way, to understand how to capture images using DirectShow, see the AMCap sample of the DirectShow
SDK.
In Linux, see if you can get a V4L driver for your camera (Google is your friend). If you have a DCAM camera, install
libdc1394: http://sourceforge.net/projects/libdc1394/. In both
cases, make sure you have permission to read and write the device files: /dev/video0 for V4L and /dev/video1394/0 for
libdc1394. If you have a DV, analog, or USB camera that does not have a V4L driver, you will need to write your own
code to capture from such devices.
Camera Settings Issues
In Windows, with a good camera driver, properties can usually be accessed and corresponding dialog boxes can be opened
from DirectShow. One can easily open these dialog boxes within AMCap, a sample video capture application, part of the
DirectShow SDK.
In Linux, for DCAM cameras, Coriander
http://damien.douxchamps.net/ieee1394/coriander/
can be used to adjust camera settings. There is no standard interface for V4L, but the most popular Webcams use the PWC
driver, and settings for those Webcams can be modified using setpwc
http://www.vanheusden.com/setpwc/.
There are other drivers and applications out there to explore... as usual, Google is your friend. Good luck. After all, we
are at the frontier of Human-Computer Interaction technologies, and video cameras are not commodities in software yet. ;)
Provided under the BSD license found in the file LICENSE.
CV4HCI - Copyright (C) 2006 Samuel Audet <saudet@cim.mcgill.ca>
The author was funded by a postgraduate scholarship from NSERC. This funding is gratefully acknowledged.