Documentation

1 Introduction

The CSIRO Face Analysis SDK contains a number of useful components that can extract and utilise the geometry of the face found in video. The SDK includes a real time non-rigid face tracker and an expression transfer module that can animate an avatar using the expression of a user.

The software development kit (SDK) consists of a collection of command line programs that cater for the common use cases and an application programming interface (API) to accommodate third party applications.

2 Components

2.1 Non-rigid Face Registration

The current implementation fits a deformable 3D model to pixels using an improved version of the Deformable Model Fitting by Regularized Landmark Mean-Shift algorithm. For each successful detection, this algorithm returns 66 2D image landmarks, their corresponding positions in 3D and the pose of the head. The implementation also includes a failure detection component in order to improve robustness.

2.2 Expression Transfer

The expression transfer component is capable of transferring the shape and appearance of an individual to an avatar. The algorithm performs this transfer using a semantic mapping in order to preserve the geometric identity of the avatar. This strategy results in more visually appealing animations than those produced using a direct geometric transfer, i.e. one in which the avatar's shape is made identical to that of the user.

The only information required to initialise the semantic mapping is a sample of the user displaying a neutral expression. This sample can be easily obtained at run-time using the non-rigid face tracker component.

3 Building and Installation

The SDK requires the following software to be installed in order to build and execute:

  • OpenCV version 2.4 or above (see section 8, OpenCV Build Options, for recommended settings).
  • CMake version 2.8 or above.
  • FFMPEG version 1.0.0 or above.
  • Bash
  • Qt version 4.7 or above (only required if building the GUI)

Building the software requires some familiarity with the Unix command line. Instructions for Microsoft platforms will be provided in a future release of the SDK.

The first step to building the SDK is to download the source code from the CI2CV website. The source code is provided as an archive and can be extracted using the following command

tar zxvf csiro-face-analysis-sdk.tar.gz

The build process for the SDK requires knowing the paths to certain libraries, programs and header files. Discovering this information is performed by CMake using the following commands.

cd csiro-face-analysis-sdk
mkdir build
cd build
cmake [options] ..

The SDK includes a demonstration program of the face tracker and expression transfer components. This program is not built by default because it is not a critical component of the SDK, and leaving it out avoids having to install the Qt GUI framework. If you wish to build this component, you must specify the option -DWITH_GUI=yes when invoking cmake above.

The default values used by the SDK should be sufficient for most systems. If you experience difficulties, however, there are a number of [options] to cmake that aid the configuration process. Valid [options] are

-DOpenCV_PREFIX=/opencv/prefix
Installation prefix for OpenCV.
-DFFMPEG=/path/to/ffmpeg
The path to the ffmpeg executable.
-DBASH=/path/to/bash
The path to bash. (Important on systems where /bin/sh is not Bash, e.g. FreeBSD.)

When CMake has successfully configured the project, issue make.

make

Once the build is completed, all command line programs are stored in the build/bin/ directory and all shared libraries are stored in the build/lib/ directory. The command line programs are executable from within the build directory. There is no need to perform make install!

The directory in which the executables are built can be added to your search path with the following command (BASH only)

export PATH=$PATH:/prefix/csiro-face-analysis-sdk/build/bin/

4 Programs

4.1 Non-Rigid Face Registration

This section outlines the non-rigid face registration program face-fit. This program can perform fitting on a single image, a sequence of images or video.

An important detail of the fitting algorithm is that it relies on a frontal face detector to initialize the non-rigid fitting component. Once initialized, it falls back to the frontal face detector only when the fitting algorithm has failed to accurately perform non-rigid registration.

The following command executes the fitting algorithm on a single image and visualises the results.

face-fit <image>

The resulting landmarks can be saved to file by specifying an output pathname as a command line argument.

face-fit <image> <output-landmarks>

The next command performs tracking over a sequence of images.

face-fit --lists <image-lists> [landmarks-list]

The argument <image-lists> is a file containing a list of image pathnames, one pathname per line. The argument [landmarks-list] is a file containing a list of pathnames to save the landmarks to. If [landmarks-list] is not specified, then the fitting results are displayed on the screen. Users should be aware that only successful registrations are saved to file.

The --video switch enables face-fit to perform fitting on a video.

face-fit --video <video> [landmarks-template-string]

The argument <video> is the pathname to the video. If [landmarks-template-string] is not specified, then the tracking is displayed to the screen. If [landmarks-template-string] is specified, then it is used as the template (or format) argument to sprintf(3) in order to synthesise a landmark pathname based on the frame number.

For example, the following command will write a landmarks file at frames/frame000001.pts for frame 1 of video, frames/frame000002.pts for frame 2, and so on for each frame in video.

face-fit --video video frames/frame%06d.pts

Like the other modes, landmarks are only written if tracking was successful.
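
The expansion of such a template can be sketched with snprintf from the C standard library. The helper name landmark_pathname is hypothetical and not part of the SDK, and the sketch assumes a template with exactly one integer conversion such as %06d.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the SDK): expands a printf-style
// template containing a single integer conversion into a pathname.
std::string landmark_pathname(const char *format, int frame_number)
{
  // The first call only measures the length of the expanded text.
  const int length = std::snprintf(NULL, 0, format, frame_number);
  if (length < 0)
    return std::string();

  // The second call writes the text, including the terminating null.
  std::string result(static_cast<std::size_t>(length) + 1, '\0');
  std::snprintf(&result[0], result.size(), format, frame_number);
  result.resize(static_cast<std::size_t>(length));
  return result;
}
```

Called as landmark_pathname("frames/frame%06d.pts", 1), this yields frames/frame000001.pts, which matches the naming described above.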

Further options of the face-fit program are described in its usage text.

$ face-fit --help

4.2 Expression Transfer

This section illustrates the expression-transfer program, which is a front end to the SDK's expression transfer API.

The command line arguments accepted by the expression-transfer program are

expression-transfer [options] \
                    <calibration-image> <calibration-landmarks> \
                    <image-argument> <landmarks-argument> <output-argument>

The arguments <calibration-image> and <calibration-landmarks> represent the data needed to calibrate the semantic mapping between the individual and the chosen avatar. The calibration data must be an exemplar of the individual displaying a neutral expression.

The arguments <image-argument> and <landmarks-argument> represent the expression to be transferred to the avatar and the argument <output-argument> specifies where to save the rendered avatar. How this information is interpreted changes depending on the mode of the expression-transfer program.

The default mode is to synthesize a single image of an avatar and save it to file.

expression-transfer calibration.png calibration.pts \
                    input.png input.pts output.png

If you specify the switch --lists, the arguments <image-argument>, <landmarks-argument> and <output-argument> now correspond to lists of pathnames.

The avatar used in the above examples is the default avatar delivered with the SDK. Other avatars can be selected using the options --index and --model.

expression-transfer [--model <model-pathname>] [--index <index>] ...

Viewing or choosing an avatar for the expression-transfer program can be performed using the program display-avatar.

display-avatar [model-pathname]

If [model-pathname] is not specified, then the default model pathname is used.

You can change avatars by pressing the a and d keys on the keyboard. The left and right arrow keys can be used as well. When the avatar changes, a number will be printed to the console. This number can be used as the argument to the --index option for the expression-transfer program.

4.3 Creating New Avatars

The program create-avatar-model is used to create a new model file that can be used by the Face Analysis SDK.

create-avatar-model [options] <output-model-pathname> \
                    <avatar-image> <avatar-annotation> [eyes-annotation]

The argument <output-model-pathname> is the location of the new avatar model containing the avatar defined by the arguments <avatar-image>, <avatar-annotation> and [eyes-annotation].

The following diagram displays the landmarks that should be stored in the <avatar-annotation> pathname using the points file format.

[Figure: avatar-annotation.png]

The [eyes-annotation] argument provides the ability to draw the eyes of the avatar with the same gaze as the user. This pathname should contain the following annotations using the points file format.

[Figure: avatar-eyes-annotation.png]

It is safe to not specify the [eyes-annotation] argument for cases where the avatars are wearing glasses.

The create-avatar-model program provides a --list switch to allow the creation of a model file containing more than one avatar. In --list mode, the arguments <avatar-image>, <avatar-annotation> and [eyes-annotation] are files containing lists of pathnames.

4.4 Demonstration Program

A GUI application called demo-application is included with the software. It demonstrates the tracker and expression transfer components simultaneously.

[Figure: demo-application.png]

The camera used can be changed using the --camera-index command line option

demo-application --camera-index <index>

where the argument <index> selects the camera to use. The order of the cameras is determined by OpenCV and the first camera has an index of 0.

On OS X, a drag-and-drop installer for the application can be built by issuing the following in the <build> directory.

cpack -G DragNDrop -DWITH_GUI=yes -DCPACK_BUNDLE_NAME=DemoApplication

The above command can only be executed after the SDK has been built.

5 Application Programming Interface

This section outlines how to integrate the CSIRO SDK into third party applications.

5.1 Non-Rigid Face Registration

The non-rigid registration algorithm can be used in third party C++ applications by including the following header and using the FACETRACKER namespace.

#include <tracker/FaceTracker.hpp>

The tracking interface is provided by the abstract base class FaceTracker. A new instance is created with the LoadFaceTracker function

FaceTracker *LoadFaceTracker();

This function returns NULL if the face tracker cannot be loaded.

Alongside the FaceTracker instance are its parameters. Face tracker parameters are represented by the opaque data type FaceTrackerParams of which a new instance can be obtained with the function LoadFaceTrackerParams.

FaceTrackerParams *LoadFaceTrackerParams();

This function returns NULL if the tracker parameters cannot be loaded.

The methods implemented by the FaceTracker class are as follows

typedef std::vector<cv::Point_<double> > PointVector;

class FaceTracker
{
public:
  virtual int NewFrame(const cv::Mat_<uint8_t> &image, FaceTrackerParams *params) = 0;
  virtual PointVector getShape() const = 0;
  virtual void Reset() = 0;
};

The method NewFrame performs tracking on a grayscale image using the tracking parameters params. Its return value is an integer value between 0 and 10 (inclusive) or one of the constants

FaceTracker::TRACKER_FAILED
The tracker has failed to accurately perform registration.
FaceTracker::TRACKER_FACE_OUT_OF_FRAME
The tracker has failed as the face is partially outside the image frame.

A value between 0 and 10 represents the health of the tracker. A value of 10 indicates that the quality of the tracking is very good, and a value of 0 indicates that the tracking quality is poor.

When the tracking quality is poor or the tracker has failed, an application must reset the tracker using the Reset method.
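
The handling of the NewFrame return value described above can be sketched as follows. The function track_or_reset, the min_health threshold and the StubTracker type are hypothetical and not part of the SDK. StubTracker exists only so that the sketch compiles without the SDK, and the values of its two constants are invented. What counts as "poor" quality is left to the caller, since the SDK documentation does not state a threshold.

```cpp
// Stand-in for the SDK tracker so that the sketch compiles on its own.
// The two constant values are invented for this example.
struct StubTracker {
  enum { TRACKER_FAILED = -1, TRACKER_FACE_OUT_OF_FRAME = -2 };
  int next_result;  // value the next NewFrame call will return
  int reset_count;  // number of times Reset has been called
  int NewFrame(const int &, void *) { return next_result; }
  void Reset() { ++reset_count; }
};

// Tracks one frame. Resets the tracker when NewFrame reports a failure
// or a health value below min_health (a caller-chosen threshold).
// Returns true when the frame was tracked with acceptable quality.
template <typename Tracker, typename Image, typename Params>
bool track_or_reset(Tracker &tracker, const Image &image,
                    Params *params, int min_health)
{
  const int result = tracker.NewFrame(image, params);
  const bool failed = result == Tracker::TRACKER_FAILED ||
                      result == Tracker::TRACKER_FACE_OUT_OF_FRAME;
  if (failed || result < min_health) {
    tracker.Reset();
    return false;
  }
  return true;
}
```

With the SDK, the same function template would presumably be called with the FaceTracker instance, a grayscale frame and the FaceTrackerParams pointer in place of the stub arguments.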

5.2 Expression Transfer

The expression transfer algorithm can be used in C++ applications by including the following header and using the AVATAR namespace.

#include "avatar/Avatar.hpp"

The interface used to perform expression transfer is provided by the class Avatar.

typedef cv::Mat_<cv::Vec<uint8_t,3> > BGRImage;
typedef std::vector<cv::Point_<double> > PointVector;

class Avatar
{
public:
  // Expression Transfer
  virtual void Initialise(const BGRImage &im, const PointVector &shape, void* params=NULL)=0;
  virtual int Animate(BGRImage &draw, const BGRImage &image, const PointVector &shape, void* params=NULL)=0;

  // Selecting the avatar.
  virtual int numberOfAvatars() = 0;
  virtual void setAvatar(int index) = 0;
};

An instance of the Avatar class can be created with the function LoadAvatar().

Avatar *LoadAvatar();
Avatar *LoadAvatar(const char *avatar_collection_pathname);

The method Animate renders an avatar displaying the expression found in image with the corresponding shape. The resulting rendering is stored in the matrix draw.

Prior to calling Animate, an avatar must have been chosen using setAvatar. With the avatar chosen, the Avatar instance must be initialised using Initialise. Initialisation requires an image and shape corresponding to the neutral expression of the individual that is being used to animate the avatar. This procedure must be followed every time the avatar is changed.
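
The required order of calls can be captured in a small helper, sketched below. The names change_avatar and StubAvatar are hypothetical and not part of the SDK. StubAvatar only records which methods were called so that the sketch compiles without the SDK or OpenCV.

```cpp
#include <string>

// Selects an avatar and immediately re-initialises it with the image and
// shape of the user's neutral expression, as required after every change.
template <typename AvatarType, typename Image, typename Shape>
void change_avatar(AvatarType &avatar, int index,
                   const Image &neutral_image, const Shape &neutral_shape)
{
  avatar.setAvatar(index);                          // 1. choose the avatar
  avatar.Initialise(neutral_image, neutral_shape);  // 2. calibrate it
  // Animate may now be called for each new frame.
}

// Stand-in for the SDK class: records the order in which it is called.
struct StubAvatar {
  std::string calls;
  void setAvatar(int) { calls += "setAvatar;"; }
  void Initialise(const int &, const int &) { calls += "Initialise;"; }
};
```

Routing every avatar change through one function like this makes it harder to call Animate on an avatar that has not been initialised.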

5.3 Points

The utils/points.hpp header contains two functions for reading and writing point files.

typedef std::vector<cv::Point_<double> > PointVector;

PointVector load_points(const char *pathname);
void save_points(const char *pathname, const PointVector &points);

A std::runtime_error is thrown if either function is unable to perform its task.

5.4 Including and Linking

The compiler options required to use the code outlined in this section are the following

  • <source>/src added to the include path.
  • <build>/src added to the include path.
  • Linking against the libraries utilities, clmTracker and avatarAnim in the <build>/lib directory.
  • Compiler and linker requirements for the OpenCV modules core, highgui, imgproc and objdetect.

6 File Formats

6.1 Points

The programs used in this library make extensive use of point files or landmark files. These files commonly have the extension .pts. The format of this file is intended to be very simple.

This is an example of a points file:

n_points: 2
{
1 2
5.5 2.2
}

The first line of a points file contains the number of points N in the file. The region between the braces { and } contains the N points, one per line. Each point is written as two floating point numbers.

7 Utilities

This section outlines a number of utility programs which are bundled with the software.

7.1 Mapping lists

The command line programs in this SDK follow this basic argument structure

command [options] <configuration-1> .. <configuration-K> \
                  <input-pathname-1> .. <input-pathname-M> \
                  [output-pathname-1] .. [output-pathname-N]

This structure makes it convenient to operate on lists of data when coupled with the map-list program.

Let's assume that the file a.list contains a list of numbers

1
2
3
4

and the file b.list contains a list of strings

do
not
pass
go

then the command map-list 2 a.list b.list echo will produce the following output

1 do
2 not
3 pass
4 go

The usage string for map-list is

map-list [options] <N> <list-1> .. <list-N> <command> [command arguments ... ]

The argument <N> specifies how many lists are specified on the command line. The <N> lists must immediately follow. The argument <command> represents the command to be executed, and [command arguments] will appear on the command line before the items obtained from the lists.

Another example using the above data is

$ map-list 2 a.list b.list printf 'file-%02d-%s.txt\n'
file-01-do.txt
file-02-not.txt
file-03-pass.txt
file-04-go.txt

If one of the list arguments is the text -, then the list is read from the standard input rather than from a file. It is safe to use - more than once, in which case the list read from standard input is used for each occurrence.
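
The row-wise pairing that map-list performs can be sketched as follows. This illustrates the behaviour described in this section and is not the program's implementation. The names StringList and map_lists are hypothetical, and rejecting lists of different lengths is an assumption, as the behaviour of map-list in that case is not described here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::vector<std::string> StringList;

// Builds one argument vector per row: the command and its fixed
// arguments first, then the i-th item of every list, in list order.
std::vector<StringList> map_lists(const std::vector<StringList> &lists,
                                  const StringList &command)
{
  std::vector<StringList> invocations;
  if (lists.empty())
    return invocations;

  const std::size_t rows = lists[0].size();
  for (std::size_t k = 1; k < lists.size(); ++k)
    if (lists[k].size() != rows)  // assumption: lengths must match
      throw std::invalid_argument("lists differ in length");

  for (std::size_t i = 0; i < rows; ++i) {
    StringList arguments(command);
    for (std::size_t k = 0; k < lists.size(); ++k)
      arguments.push_back(lists[k][i]);
    invocations.push_back(arguments);
  }
  return invocations;
}
```

For the a.list and b.list data above and the command echo, the first row is echo 1 do and the last is echo 4 go.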

7.2 Pathnames

Complementing the map-list program is change-pathnames. Its purpose is to take a list of pathnames and create a new list with the pathnames changed to have either a different directory, extension or both.

For example, the file input.list contains the following list of pathnames

frame-01.png
frame-02.png
frame-03.png
frame-04.png

Executing the following change-pathnames command on input.list

change-pathnames input.list output.list --directory points/ --type pts

produces the file output.list

points/frame-01.pts
points/frame-02.pts
points/frame-03.pts
points/frame-04.pts

If the input list for change-pathnames is the character -, then the list of pathnames is read from the standard input. If the output list is the character -, then the transformed list is written to standard output.
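
The transformation applied to each pathname in the example above can be sketched with plain string operations. The function change_pathname is hypothetical and only covers the case where both a directory and a type are given. It is not taken from the change-pathnames source, and it assumes / as the directory separator.

```cpp
#include <string>

// Keeps the base name of the file and swaps its directory and extension.
std::string change_pathname(const std::string &pathname,
                            const std::string &directory,
                            const std::string &extension)
{
  // Strip any leading directory components.
  const std::string::size_type slash = pathname.find_last_of('/');
  std::string name =
      (slash == std::string::npos) ? pathname : pathname.substr(slash + 1);

  // Strip the existing extension, if there is one.
  const std::string::size_type dot = name.find_last_of('.');
  if (dot != std::string::npos && dot != 0)
    name.erase(dot);

  // Append a separator if the directory does not already end with one.
  std::string result = directory;
  if (!result.empty() && result[result.size() - 1] != '/')
    result += '/';
  return result + name + "." + extension;
}
```

Applied to frame-01.png with the directory points/ and the type pts, this gives points/frame-01.pts, as in output.list above.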

7.3 Video

A number of utilities are included in the SDK that perform common operations on video. These utilities are

  • remove-rotation-metadata
  • rotate-movie
  • extract-frames-from-movie
  • create-movie-from-frames

The program remove-rotation-metadata is required to overcome an issue with OpenCV where its cv::VideoCapture class does not honour the rotation parameter embedded in some video containers. This problem typically occurs when working with video obtained using a portable device. The program remove-rotation-metadata creates a new movie without the rotation parameter.

It may be necessary to rotate the video once the rotation metadata is removed. This task can be performed using the command rotate-movie.

The program extract-frames-from-movie converts a movie to a sequence of images and create-movie-from-frames uses a sequence of images to create a movie.

All of the above programs simply invoke FFMPEG with the required options and arguments.

8 OpenCV Build Options

It is strongly recommended that the following options are used when building OpenCV

cmake -DCMAKE_BUILD_TYPE=Release \
      -DENABLE_AVX=ON \
      -DENABLE_FAST_MATH=ON \
      -DENABLE_SSE=ON \
      -DENABLE_SSE2=ON \
      -DENABLE_SSE3=ON \
      -DENABLE_SSE41=ON \
      -DENABLE_SSE42=ON \
      -DENABLE_SSSE3=ON \
      /path/to/opencv/

Please ensure that your CPU supports the specified instructions before enabling them; otherwise the compiler will produce binaries that cannot be executed.