Entries tagged as 'movid'

At the end of 2010 I went to Paris (France) for an internship at a local company that produces multitouch hardware. I chose the job myself because I thought it would be interesting. What I implemented was markerless object recognition & improved tracking for the computer vision framework we’re developing (movid.org).

The video below shows the results of the prototype. The program recognizes objects based on their shape and size and does not need additional fiducial markers. It also tracks object rotation, even for square objects, and reports an angle between 0 and 360 degrees. The only exception is a perfectly circular object, which has no detectable orientation.
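To give a rough idea of what recognition by shape and size can look like, here is a minimal Python sketch. It is not the code used in Movid; the function names, the choice of features (area and compactness) and the tolerance value are assumptions made purely for illustration, and the sketch does not attempt to estimate the rotation angle.

```python
import math


def polygon_area(points):
    """Shoelace formula; points is a list of (x, y) vertices in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed outline."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1])
    )


def shape_features(points):
    """Return (area, compactness).

    Compactness is 4*pi*area / perimeter^2: 1.0 for a circle, smaller for
    less round shapes. Neither value changes when the outline is rotated.
    """
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    return area, 4.0 * math.pi * area / (perimeter ** 2)


def recognize(points, registered, tolerance=0.15):
    """Return the name of the closest registered object, or None.

    `registered` maps a name to the (area, compactness) pair stored when
    the object was registered. `tolerance` is the largest relative
    deviation that still counts as a match (an arbitrary example value).
    """
    area, compactness = shape_features(points)
    best_name, best_error = None, tolerance
    for name, (ref_area, ref_compactness) in registered.items():
        error = max(
            abs(area - ref_area) / ref_area,
            abs(compactness - ref_compactness) / ref_compactness,
        )
        if error < best_error:
            best_name, best_error = name, error
    return best_name
```

Because both features are rotation invariant, an object registered once is still matched after it has been turned on the table. Telling apart two objects with similar area and roundness would need richer features than this sketch uses.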

The demo runs on an LLP (laser light plane) setup. Only lasers are used; no diffuse illumination or similar approaches were added.

The program that you actually see is just a visualization of the recognition and tracking, written in PyMT.

The video quality suffers because we had only 10 minutes to capture it before the table was moved to an exhibition. The calibration was quick and dirty, which is why I had to click the button at the bottom (the one that registers an object) with a mouse instead of touching it.

What you see is a WIP project prototype. The code can be found in the GitHub repository on the master branch.

Markerless Object Recognition & Tracking (Movid) from Christopher Denter on Vimeo.

3 comments Jan 26, 2011 8:45:00 PM c++, movid, multi-touch, nerdstuff, planet-pymt, planet-python, planet-ubuntu, pymt, python, technology

Google’s Summer of Code 2010 comes to an end for me today. It has been a great time working on awesome projects like PyMT and Movid. My task was to enhance PyMT’s text input methods. One of the joys of this task was that it allowed me to work on a relatively wide scope of things. Here’s a brief list of what I worked on:

  • I added a new spelling provider to PyMT that abstracts away individual spellchecking libraries. That means you can use your favorite spellchecking library, which is important considering that PyMT is cross-platform.
  • I added two actual spelling providers that implement this protocol: an enchant spelling provider (usable after installing enchant) and a provider using OS X’s native AppKit spellcheckers (so you get that out of the box on OS X).
  • Mathieu once wrote a basic virtual keyboard with spelling suggestions which I adapted, cleaned and merged.
  • PyMT obviously already had some text input widgets, which I improved (e.g. MTTextArea).
  • I began working on a version of MTTextInput with added spellchecking (like OO.org with red lines drawn for incorrectly spelled words), but that needs some more love.
  • One of the more concrete objectives of my task was a Swype-like keyboard for PyMT. I created a prototype for that, see the video below.
  • Another concrete objective was a split keyboard (split into two parts, one half for the left, one for the right hand) that adjusts to your hand’s properties (e.g. size). To achieve this, a substantial number of changes were needed in our vision tracking application (Movid):
    • For the keyboard to adjust to the user’s hands, I implemented a hand-tracking algorithm for Movid. It detects the fingertips of the hand as well as the hand’s center. Internally, these are just treated as a certain type of ‘blob’.
    • These blobs need to be tracked over a sequence of frames from the camera. Additionally, we also want to find simple touches (without all the hand information). For that, I added and integrated BlobFinder and BlobTracker modules that obey a common format so they’re easily interchangeable.
    • When your camera senses a blob on the touch surface, the application needs to perform a mapping to get the blob into screen coordinates. We do that using a calibration module, which I had started before SoC. I finished it and merged it back into our master branch.
    • As an extra feature, I added a PyMT module that you can use to calibrate your tracker from within your client application, eliminating the need to switch applications. I also added a Flash GUI for the calibration so that you can easily do it on any remote computer via our web interface.
    • To actually send the hand-tracking data to the client application, Mathieu added a TUIO2 module to Movid, and I started a PyMT input provider for TUIO2. Both are work in progress, but I believe we’re the first project to adopt TUIO2 (there’s not even a reference implementation yet).
    • The result of that can be seen in the second video below. Also, make sure to read the Vimeo description.
  • Other than that we now also provide portable binary packages for PyMT 0.5 for both Windows and OSX. I created the OSX package, so it’s no longer a major pain to install. You just download and run it.
  • And, of course, many more fixes!
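The calibration step in the list above (mapping a blob from camera coordinates into screen coordinates) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Movid's calibration module: it fits a plain affine transform from exactly three point pairs, and the names fit_affine and camera_to_screen are made up for this example. A real calibration typically uses a grid of many points and also has to cope with lens distortion.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def _solve3(rows, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    det = _det3(rows)
    if abs(det) < 1e-12:
        raise ValueError("calibration points must not lie on one line")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in rows]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(_det3(replaced) / det)
    return solution


def fit_affine(camera_points, screen_points):
    """Fit screen = A * camera + t from three corresponding point pairs.

    Returns two coefficient triples, one for the screen x coordinate and
    one for the screen y coordinate.
    """
    rows = [(cx, cy, 1.0) for cx, cy in camera_points]
    coeff_x = _solve3(rows, [sx for sx, _ in screen_points])
    coeff_y = _solve3(rows, [sy for _, sy in screen_points])
    return coeff_x, coeff_y


def camera_to_screen(point, transform):
    """Apply a transform returned by fit_affine to one camera point."""
    (a, b, c), (d, e, f) = transform
    cx, cy = point
    return a * cx + b * cy + c, d * cx + e * cy + f
```

In use, the user touches three known targets on the screen, the tracker records where the camera saw each touch, and fit_affine turns those pairs into a transform that camera_to_screen then applies to every blob.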

Some of that is already in PyMT 0.5. All of the Movid stuff will be in the first release. In future releases we shall see much improved versions of these prototypes and hopefully even context aware word suggestions.

Here are the two promised videos. If you’re reading this through a planet, please go directly to my blog.

Prototype WipeToType Keyboard for PyMT from Christopher Denter on Vimeo.

Ergonomic multitouch keyboard prototype from Christopher Denter on Vimeo.

Thanks to all the people who made this possible. Thanks Google, Christian, Pawel, Mathieu and Thomas, for being (a) fantastic mentor(s). It has been a great pleasure and privilege to work with you in GSoC 2010 and I sure will continue to work on both projects.

3 comments Aug 16, 2010 7:31:00 PM gsoc, movid, multi-touch, nerdstuff, planet-pymt, planet-python, planet-ubuntu, pymt, python, technology, text input

I just had the opportunity to take a video of my multitouch table with my software in action. Both hardware and software were built for my bachelor’s thesis, which I handed in in March. The software that you see at the end is written in Python with PyMT, using the VTK library.

Medical Multitouch from Christopher Denter on Vimeo.

Reading through a planet? Click here!

For more information, see the video description. PS: Although it supports all platforms, it currently runs on Ubuntu. :-)

Let me know what you think!

9 comments Jul 13, 2010 1:28:13 AM hci, movid, multi-touch, planet-pymt, planet-python, planet-ubuntu, pymt, python, technology

The NUIGroup Google Summer of Code students (I was lucky enough to become one of them for PyMT this year) are asked to summarize their weekly activities in blog format. Given that the first week has passed I figured I should just quickly outline what I have been working on up to now.

My proposal aims at developing more advanced text input methods for PyMT.

Work on PyMT

Some of the ideas I will realize draw heavily upon spelling correction and suggestion. It is therefore necessary that PyMT can interact with a spelling backend. Given that PyMT should be kept modular, I first implemented an abstract new core provider for spelling suggestions to become independent of a specific library. I then realized two concrete implementations of this provider:

  • An enchant spelling backend. This uses the enchant spelling library which can itself be used with different kinds of dictionaries.
  • A spelling backend based on OSX’s AppKit spell checker.
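The idea of an abstract provider with interchangeable backends can be sketched roughly as follows. This is not PyMT's actual API: the class names SpellingBase and WordListSpelling and the helper create_spelling are hypothetical, and the toy backend uses a plain word list with difflib from the standard library instead of enchant or AppKit so that the example runs anywhere.

```python
import difflib
from abc import ABC, abstractmethod


class SpellingBase(ABC):
    """Hypothetical provider interface that every backend implements."""

    @abstractmethod
    def check(self, word):
        """Return True if the word is spelled correctly."""

    @abstractmethod
    def suggest(self, word, limit=3):
        """Return up to `limit` suggested corrections, best first."""


class WordListSpelling(SpellingBase):
    """Toy backend that stands in for a real spellchecking library."""

    def __init__(self, words):
        self._words = sorted({w.lower() for w in words})

    def check(self, word):
        return word.lower() in self._words

    def suggest(self, word, limit=3):
        return difflib.get_close_matches(
            word.lower(), self._words, n=limit, cutoff=0.6
        )


def create_spelling(factories):
    """Return the first backend whose factory does not raise ImportError.

    Each factory is a callable that builds a backend; one that depends on
    a library that is not installed is expected to raise ImportError.
    """
    for factory in factories:
        try:
            return factory()
        except ImportError:
            continue
    raise RuntimeError("no spelling backend available")
```

Widgets only ever talk to the interface (check and suggest), so swapping the backend for a platform-specific one does not require changes in the widget code.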

After the foundation was laid, I adapted a virtual keyboard with spelling support that Mathieu once developed to the new API and added it to the code base. None of this is finished yet, and it needs some more love before I can merge it back into the master branch. You can check the branch I’m currently working on here.

PyMT Virtual Keyboard with spell checking

Work on Movid

While spellchecking is important for some of my upcoming widgets, some other text input approaches make use of additional information provided by the tracking application. For example, one idea I had was to split the keyboard in half and dedicate one half to each hand. The halves would then automatically orient themselves following the respective hand’s position and orientation. Theoretically, further information such as properties of the user’s hands (length of fingers, etc.) could be taken into account to lay out the keyboards. For this I obviously need some kind of hand and fingertip tracking. Luckily I implemented that for Movid already:

Movid Hand Tracking

However, Movid is still not ready for end users because it lacks a calibration utility and a proper (generic!) blob tracker (which means I can’t use it yet either), so I continued my work on both of those. Again, neither is finished, but I can see the light at the end of the tunnel (or rather, the light below my fingers):

Movid Calibration Prototype
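For readers wondering what a generic blob tracker has to do: it must give each blob a stable ID from one camera frame to the next. The following Python sketch shows one simple way to do that, greedy nearest-neighbour matching. It is an assumption-laden illustration rather than Movid's tracker; the class name and the max_distance parameter are invented for this example.

```python
import math
from itertools import count


class NearestNeighbourTracker:
    """Greedy nearest-neighbour tracker for 2D blob positions."""

    def __init__(self, max_distance=30.0):
        # Largest jump (in pixels) a blob may make between two frames.
        self.max_distance = max_distance
        self.tracked = {}  # blob id -> (x, y) seen in the previous frame
        self._ids = count(1)

    def update(self, detections):
        """Match this frame's detections to known blobs.

        Returns {blob_id: (x, y)}. Detections that match no known blob
        get a fresh id; known blobs without a match are dropped.
        """
        # All (distance, old blob, new detection) pairs, closest first.
        candidates = sorted(
            (math.hypot(x - px, y - py), blob_id, index)
            for blob_id, (px, py) in self.tracked.items()
            for index, (x, y) in enumerate(detections)
        )
        assigned = {}
        used = set()
        for distance, blob_id, index in candidates:
            if distance > self.max_distance:
                break  # every remaining pair is even farther apart
            if blob_id in assigned or index in used:
                continue
            assigned[blob_id] = detections[index]
            used.add(index)
        for index, position in enumerate(detections):
            if index not in used:
                assigned[next(self._ids)] = position
        self.tracked = assigned
        return assigned
```

Calling update once per frame with the list of detected blob centers yields a dictionary whose keys stay the same for as long as a finger remains on the surface.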

I hope that we can finish all of this and push out a first version of Movid for end users soon. And obviously, I want to test my text input widgets on my multitouch table and not in the mouse simulator.

This concludes my work for week one. If you have any questions or are interested in PyMT or Movid, feel free to join our IRC channels #pymt and #movid on irc.freenode.net.

3 comments May 31, 2010 1:01:00 AM c++, coding, gsoc, hci, movid, multi-touch, nerdstuff, opensource, planet-pymt, planet-ubuntu, pymt, technology, vision

Hi everyone, I am glad to announce the birth of the Movid project: movid.org

Movid is an acronym; it stands for ‘Modular Open Vision Interaction Daemon’. It’s a cross-platform and Open Source vision tracker, designed to be as modular as possible. Although the project is pretty young, it already features more than 20 modules, including blob and fiducial trackers as well as TUIO output. Movid is coded in C++ and uses WOscLIB, cJSON, libevent, libfidtrack, jpeg-8 and XgetOpt.

Movid has several key characteristics:

  • Cross-platform: It works under Windows, Linux and MacOSX.
  • Daemon: You can run the program without a GUI and control it from another computer over the network.
  • Threading: Each module can be run inside a thread. This means that you can finally fully utilize your multi-core processor!
  • Remote API: The daemon can be controlled with a JSON API. This also means that you can write your own GUI, e.g. in Flash, and the daemon can be controlled from any application that can make HTTP requests!
  • Full HTML5 embedded administration: By default, the daemon acts as an HTTP server. You can control and modify the tracking pipeline in real time and adjust many parameters.
  • Image streaming: Most modules process images. For your application or GUI, you can get the output image via a stream, so your applications can show any image from the pipeline or use it for advanced features.
  • Flexible pipeline: Unlike other applications, Movid allows you to fine-tune your image processing pipeline if you are an expert. You can create new pipelines, add modules/filters and change their parameters in real time.
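To illustrate the general idea of a modular pipeline whose parameters can be changed while it runs, here is a toy model in Python. It has nothing to do with Movid's real C++ classes or its JSON API: every class, method and property name below is hypothetical, and frames are plain nested lists of grey values rather than camera images.

```python
class Module:
    """A pipeline stage with adjustable properties (hypothetical)."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = dict(properties)

    def process(self, frame):
        raise NotImplementedError


class Invert(Module):
    """Invert an 8-bit greyscale frame."""

    def process(self, frame):
        return [[255 - pixel for pixel in row] for row in frame]


class Threshold(Module):
    """Set pixels at or above `limit` to 255 and all others to 0."""

    def process(self, frame):
        limit = self.properties["limit"]
        return [[255 if pixel >= limit else 0 for pixel in row] for row in frame]


class Pipeline:
    """Runs a frame through its modules in the order they were added."""

    def __init__(self):
        self.modules = []

    def add(self, module):
        self.modules.append(module)
        return self  # allows chaining add() calls

    def set_property(self, module_name, key, value):
        """Change one property of a module while the pipeline is in use."""
        for module in self.modules:
            if module.name == module_name:
                module.properties[key] = value
                return
        raise KeyError(module_name)

    def run(self, frame):
        for module in self.modules:
            frame = module.process(frame)
        return frame
```

A pipeline is built with Pipeline().add(Invert("invert")).add(Threshold("threshold", limit=128)), and a later call to set_property("threshold", "limit", 200) takes effect on the next frame without rebuilding anything.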

However, Movid is not ready for users yet, since we are missing a few modules, like calibration. Right now, we are looking for developers to support us with further development.

7 comments Apr 19, 2010 10:31:00 PM hci, movid, multi-touch, nerdstuff, opensource, planet-pymt, planet-ubuntu, technology, vision