Hand Tracking in JS


In this series of posts, I'll document some of my early explorations in computer vision and how far the field has come. In this first post, I'll focus on hand tracking, using a port of this model to JS.

Problem Statement

The problem of hand tracking in computer vision is fairly easy to state: given an image, draw a bounding box around every hand it contains. For example, in the following picture the person has clearly raised a hand.



Converting the model

I found this neat pre-trained model here and gave it a try. Converting it to the TensorFlow.js format was fairly straightforward:

    tensorflowjs_converter \
        --input_format=tf_frozen_model \
        --output_node_names='detection_boxes','detection_scores','detection_classes','num_detections' \
        --saved_model_tags=serve \
        hand_inference_graph/frozen_inference_graph.pb \
        hands

You can find the tool's output model here.

Inference

Once the model is converted, TensorFlow.js makes loading it straightforward:

        // loadFrozenModel returns a Promise, so await it inside an async function.
        const MODEL_URL = `/static/hands/tensorflowjs_model.pb`;
        const WEIGHTS_URL = `/static/hands/weights_manifest.json`;
        const {loadFrozenModel} = tf;
        const model = await loadFrozenModel(MODEL_URL, WEIGHTS_URL);

With a little bit of UI code around it, running inference is simple too:
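The detector returns flat arrays of scores and normalized box coordinates. A small post-processing helper (a sketch; the function name and default threshold are my own assumptions, not part of the original model) filters out low-confidence detections and scales the boxes to pixel coordinates for drawing:

```javascript
// Post-process detector output. The boxes arrive as four values per
// detection, [ymin, xmin, ymax, xmax], in normalized [0, 1] coordinates;
// `boxes` and `scores` would come from the model's output tensors via
// dataSync(). The helper itself is framework-free plain JS.
function extractBoxes(boxes, scores, width, height, threshold = 0.75) {
  const results = [];
  for (let i = 0; i < scores.length; i++) {
    if (scores[i] < threshold) continue;
    const [ymin, xmin, ymax, xmax] = boxes.slice(i * 4, i * 4 + 4);
    results.push({
      score: scores[i],
      x: xmin * width,           // left edge in pixels
      y: ymin * height,          // top edge in pixels
      w: (xmax - xmin) * width,  // box width in pixels
      h: (ymax - ymin) * height, // box height in pixels
    });
  }
  return results;
}
```

Each returned object can then be drawn directly onto a canvas overlay with `ctx.strokeRect(x, y, w, h)`.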



Live demo

Surprisingly, this model performs reasonably well in TensorFlow.js. Here is an example of it running against your webcam feed:
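The demo's glue code can be sketched roughly as follows. This is an assumption-laden outline, not the post's actual implementation: `startDemo` and `runDetection` are hypothetical names, and detection is throttled to every Nth frame since running the model on every frame is typically too slow in the browser:

```javascript
// Throttle detection: run the model only on every Nth frame, and keep
// drawing the last known boxes in between.
function shouldDetect(frameCount, every = 3) {
  return frameCount % every === 0;
}

// Browser-only sketch: capture the webcam, run the detector periodically,
// and draw the video plus the latest boxes onto a canvas.
async function startDemo(model, video, canvas) {
  video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });
  await video.play();
  const ctx = canvas.getContext('2d');
  let frame = 0;
  let boxes = [];
  const loop = async () => {
    if (shouldDetect(frame++)) {
      // tf.fromPixels turns the current video frame into a tensor; the
      // frozen graph expects a batch dimension, hence expandDims(0).
      const input = tf.fromPixels(video).expandDims(0);
      boxes = await runDetection(model, input); // hypothetical helper
      input.dispose(); // free GPU memory for the frame tensor
    }
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    // ...draw `boxes` on top of the frame here...
    requestAnimationFrame(loop);
  };
  loop();
}
```

Skipping frames trades a little latency in box updates for a much smoother video feed, which is usually the right call for an in-browser demo.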



Pretty cool, isn't it?