A few weeks ago I discussed Vitamin D Video, the software system which acts like the human brain to find people and objects of interest in a recording. Vitamin D takes hours of tedious camera footage and reduces it to the few minutes or moments of interest that you want to see. An independent offshoot of Numenta, Vitamin D just launched its free beta release today. I was able to get my hands on a copy of the program, and wow. This is some powerful code and it could make a big splash when the final version debuts in the first half of 2010. Check out the new demonstration video here, or watch the old demonstration video after the break.
To give you some background, Vitamin D Video uses the Hierarchical Temporal Memory architecture first developed by Numenta to help recognize objects of importance even if they are moving, clipped, or otherwise complicated. The beta works with IP cameras or webcams, with user-defined filters that you design through a simple interface. People are highlighted in yellow boxes, objects in green, and recording can be triggered as doors open, things move into a space, etc. If successful, VDV could improve the use of security cameras the world over, as a computer could be used to actively monitor and sift through video in place of costly (and easily bored) humans. Vitamin D hopes that it will become the cheap, easy, but also sophisticated alternative to the more costly video recognition software already on the market.
While there are enough kinks to demonstrate that it is obviously still in beta, overall Vitamin D Video runs well. I had the software downloaded, calibrated, and running in less than 10 minutes start to finish. Using just my webcam, I was able to test VDV’s object and person recognition and its email and sound notifications. Basically, I ran the camera for 3 hours and told VDV to record any time I walked into the room, or when an object touched my ceiling. These filters were fairly easy to build on my own, though I did have to consult the Vitamin D reference guide here and there.
The crazy thing was, while VDV is built as a means to reduce the amount of video you have to sift through, my filters didn’t work in that manner. Oh, there was some reduction: I went from three hours of footage to a highlight reel that was about one hour long, but that’s still too much to watch quickly. Passing through the selected clips at 3x speed helped, but not much. The problem was that my filters were too broad and triggered too easily. That isn’t VDV’s fault, but it does point out an issue with this program. While it is definitely easy to get started, it will take a good period of trial and error before a user feels confident that they are only going to record what they want.
As for the details of my personal experience…well let me preface this by saying that I was giddy getting to play with a new toy. I experimented with different lighting levels, object transparency, and speed of travel. VDV can track a black plastic hair comb as it hits a white ceiling. It isn’t fooled if you walk in backwards or with your shirt pulled over your head. It can even notice if a clear piece of plastic is slowly brought into frame (though it may have triggered off the attached piece of string). Oh, and I can verify that a grown human running at full speed and leaping through the air will be detected. And slightly injured.
There were some limits. The software had trouble triggering when an object was in front of an overexposed white background in strong light. VDV couldn’t always identify objects as objects and people as people (my hair, judging by the green box, is only sometimes human). It often assumed that a hand holding an object was part of the object. All of these limitations may have easy software fixes (I’m still reading through the guide and testing the program to see), but they have even easier user fixes: don’t point your camera at a light source, and take extra care when designing your filters. Ultimately, unless you’re looking for a polar bear in a blizzard, I think Vitamin D Video can probably work just fine for any personal or small business applications you can dream up.
Also, while these aren’t limitations exactly, I was instantly annoyed by the audio and email notifications. Again, my triggers were too sensitive, but hearing “person detected…recording video” every five seconds drove me homicidal very quickly. You can use custom .wav files for audio notifications. Hearing Christopher Walken say hello when you walk into a room is fun, but still annoying after 20 triggers. The email system was the same…but with email. I’m sure you can imagine.
Still, I’m happy to report that everything in the beta seems to work, if a bit slowly. VDV takes a long time to load as it organizes your clips for you. I once thought my computer had frozen, but with Vista, that’s pretty much par for the course. VDV also wants to use 25 gigabytes to store data, but I think that’s to be expected considering its function. If you’d like to play with the new software, try signing up for the beta and let me know what you think. For now, I’m going back to testing. I wonder if VDV will recognize a mannequin as a human being.
Photo and video credit: Vitamin D Video