
Disco Cube

Description:

Disco Cube is a dance visualization platform that replays previous users' dances while the current user dances along, all in a kinetic way. Every motion is presented with a flexible point-cloud particle system. The current user's motion takes partial control over the recorded users' motion, which creates a leading-and-following experience.

Collaborators: Eric Li, Mars Zhao

Demo Video:

Concept Proposal:

Link to our old post (of previous ideas)

Technologies & Materials:

  • Processing and Libraries
    Minim (for audio visualization)
    KinectPV2 (for utilizing Kinect skeleton tracking)
    oscP5 (for transferring gestures between server and client)
    Spout/Syphon (for streaming to Madmapper)
  • Kinect SDK for Windows
  • Madmapper (for projection mapping)
  • Three projectors
  • One wood piece as the projection screen (made from three IKEA Lack side tables)
  • One powerful computer running Windows

Development:

Source code in Github: https://github.com/WenheLI/Lead-Follow

There are some important stages in our project in chronological order:

  • Deciding the visuals
  • Recording the point cloud (Server Side)
  • Animating the point cloud data (Client Side)
  • Tracking certain gestures (Server Side)
  • De-noising Skeleton Tracking (Server Side)
  • Communications between Server and Client
  • Building Installation
  • Adding and visualizing music (Server Side)
  • Adding instructions

  • Deciding the visuals

The visual style was the very first thing we decided on for this project: a cubic arrangement of projection screens with point-cloud people imprisoned inside. The original concept was a dark-toned "Prison Cube", where the user is a god-like figure controlling the old users, but that turned out to be very difficult to deliver.

The idea of the placement didn't change much. The top two screens belong to the client side and the bottom one to the server side, as we refer to them in this project. As we went on developing visuals for the point cloud, we found that a club-like vibe was something we could deliver, and we really enjoyed it. That was when we started turning the concept into the "Disco Cube" it ended up being.

  • Recording the point cloud (Server Side)

We strongly felt that recording the point cloud data was the fundamental function we needed for this project. We used the Java JSON library to store the vertices of every frame in a JSON file. The range of the XYZ values is also recorded so the stored values can be mapped back into a proper range.
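
As a rough illustration, one frame could be written out with Processing's built-in JSON classes along these lines (the field names mirror the structure described under "What I Have Done" below; the function names are only for this sketch, and our actual JSONPointCloud recording code is more involved):

// Sketch: writing one frame of point-cloud vertices to a JSON file.
// Vertices are stored as Ints, and the XYZ ranges are kept for later mapping.
JSONArray range(float lo, float hi) {
  JSONArray r = new JSONArray();
  r.append(lo);
  r.append(hi);
  return r;
}

void savePointCloudFrame(ArrayList<PVector> vertices, int frameIndex) {
  JSONObject frame = new JSONObject();
  JSONArray xs = new JSONArray();
  JSONArray ys = new JSONArray();
  JSONArray zs = new JSONArray();
  float xMin = Float.MAX_VALUE, xMax = -Float.MAX_VALUE;
  float yMin = Float.MAX_VALUE, yMax = -Float.MAX_VALUE;
  float zMin = Float.MAX_VALUE, zMax = -Float.MAX_VALUE;
  for (PVector v : vertices) {
    xs.append(int(v.x));  // Ints instead of floats to keep the file small
    ys.append(int(v.y));
    zs.append(int(v.z));
    xMin = min(xMin, v.x);  xMax = max(xMax, v.x);
    yMin = min(yMin, v.y);  yMax = max(yMax, v.y);
    zMin = min(zMin, v.z);  zMax = max(zMax, v.z);
  }
  frame.setJSONArray("x", xs);
  frame.setJSONArray("y", ys);
  frame.setJSONArray("z", zs);
  frame.setJSONArray("XRange", range(xMin, xMax));
  frame.setJSONArray("YRange", range(yMin, yMax));
  frame.setJSONArray("ZRange", range(zMin, zMax));
  saveJSONObject(frame, "data/frame_" + frameIndex + ".json");
}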

To read the JSON file, we created a JSONPointCloud class to do the job. The point cloud particle system was inherited from the previous assignment.

  • Animating the point cloud data (Client Side)

By animating, we don't just mean playing the frames back: we mean making them respond to certain triggers with specific animations. The Animator class we created handles triggers, stores the array of keyframes, and smoothly calculates the animated value for the next frame using lerp().
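
A stripped-down sketch of that idea (our real Animator also stores keyframe arrays and drives several properties; the names here are illustrative):

class Animator {
  float current;       // value used by the visuals this frame
  float target;        // value the latest trigger asked for
  float easing = 0.1;  // how quickly we approach the target each frame

  Animator(float start) {
    current = start;
    target = start;
  }

  // called when a gesture fires
  void trigger(float newTarget) {
    target = newTarget;
  }

  // called once per frame: ease toward the target with lerp()
  float update() {
    current = lerp(current, target, easing);
    return current;
  }
}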

  • De-noising skeleton tracking and tracking certain gestures (Server Side)

We found that when there are multiple people in the frame, it is hard to locate the one person who is actually interacting with the experience. Our de-noising logic is to detect the longest spine length in the frame and set that person as the default player. To filter out people far in the back, we also set a threshold so the algorithm ignores people with relatively short spines. Finally, we considered the scenario where the user is playing while a friend watches from roughly the same distance, which makes their spines similar in length. To handle this, we compare all the spines within a flexible range around the threshold and pick the one closest to the center as the default player. A sketch of this selection is shown below.
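
Here the Kinect reading is abstracted into a spine length and a horizontal offset from the center of the frame (the class and threshold names are illustrative):

class Candidate {
  float spineLength;   // neck-to-spine distance, our stand-in for depth
  float centerOffset;  // absolute horizontal distance from the frame center
  Candidate(float s, float c) { spineLength = s; centerOffset = c; }
}

// returns the index of the default player, or -1 if nobody qualifies
int pickDefaultPlayer(ArrayList<Candidate> people, float minSpine, float margin) {
  float longest = 0;
  for (Candidate c : people) longest = max(longest, c.spineLength);
  if (longest < minSpine) return -1;   // everyone is too far in the back

  // among spines within a flexible range of the longest,
  // pick the person closest to the center
  int best = -1;
  float bestOffset = Float.MAX_VALUE;
  for (int i = 0; i < people.size(); i++) {
    Candidate c = people.get(i);
    if (c.spineLength >= longest - margin && c.centerOffset < bestOffset) {
      bestOffset = c.centerOffset;
      best = i;
    }
  }
  return best;
}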

We divide gestures into two categories: continuous and individual. Continuous gestures, like "waving hands" or "moving steps", happen every frame, only with different amplitudes. Individual gestures are triggered only when a threshold amplitude is met, like "holding your hand on your head", which we used to start the experience.
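
As a small sketch of the difference, using plain joint positions instead of the KinectPV2 types (y is taken to increase upward here, and the threshold is illustrative):

// continuous: the wave amplitude is read every frame
float waveAmplitude(PVector hand, PVector prevHand) {
  return dist(hand.x, hand.y, prevHand.x, prevHand.y);
}

// individual: fires only once the threshold pose is reached,
// e.g. the hand held clearly above the head to start the experience
boolean handOverHead(PVector hand, PVector head, float threshold) {
  return hand.y > head.y + threshold;
}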

  • Communications between Server and Client

JSON Files: To save development effort, we decided to run the server and the client on the same computer. This way, transferring large JSON files can be handled simply by writing to and listening to the same folder on the computer.

Gestures: Gestures are sent via oscP5 to apply animations from the server side on the client side. The gestures include waving hands to control particle acceleration, moving around to control position offset, and shaking the head to change the color palette.

Receive, decode and apply the animation
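
A rough sketch of that round trip with oscP5 (the send and receive halves normally live in the two separate sketches, and the address pattern and ports here are illustrative):

import oscP5.*;
import netP5.*;

OscP5 osc;
NetAddress client;

void setup() {
  osc = new OscP5(this, 12000);                 // this sketch listens on 12000
  client = new NetAddress("127.0.0.1", 12001);  // the client sketch listens on 12001
}

void draw() { }  // keep the sketch running so OSC events can arrive

// server side: send the amplitude when the wave gesture is detected
void sendWave(float amplitude) {
  OscMessage m = new OscMessage("/gesture/wave");
  m.add(amplitude);
  osc.send(m, client);
}

// client side: decode the message and hand it to the Animator
void oscEvent(OscMessage m) {
  if (m.checkAddrPattern("/gesture/wave")) {
    float amplitude = m.get(0).floatValue();
    // accelerationAnimator.trigger(amplitude);  // apply the animation
  }
}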

  • Building Installation

First, we got two IKEA Lack side tables to modify. We bought some connecting metal parts and used them to fix the tables together the way we wanted. The mechanism was perfect for projection mapping.

Then we added one more table at the bottom.

Moreover, since the tables are pure white and our project has a disco style, we wanted to add some decoration to the installation, like painting. So we got some spray paint and started decorating the installation. This is a video documentation of how we did the painting job. We spent a lot of time figuring out how to deliver the best color pattern and painting technique on the installation. Since the faces of the installation that needed decoration are quite dark, we had to use white paint as a base coat and paint the colors we wanted on top of the white. Otherwise, the colors would not show up well against the dark surface.

After building the installation, the next step was to set up the projectors. Since we have three projectors, we needed to arrange all three of them appropriately, and we needed some stands to hold the Kinect and the projectors.

This is our setup in the studio. As you can see, two of our projectors do not support a tripod mount, so we used some simple material as bases for them; it might perform even better if we laser-cut some customized bases. One more thing: the installation is actually composed of two parts, the two vertical tables and the horizontal table, so we can easily set up and transport the installation.

  • Adding and visualizing music (Server Side)

We wanted to generate music based on the motions, but our time didn't allow us to do that. Instead, we added "Broadway Boogie Woogie", a song by Ryuichi Sakamoto inspired by the famous Mondrian painting. We imported the Minim library to play the audio piece and applied an FFT to translate the audio into band amplitudes. We then use that data to generate dynamic visuals based on our particle system. For the demo video, please go to the top and check out the project demo with audio.
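
A minimal sketch of the Minim + FFT setup that drives the music-reactive part (the file name and the band-to-visual mapping are simplified here):

import ddf.minim.*;
import ddf.minim.analysis.*;

Minim minim;
AudioPlayer song;
FFT fft;

void setup() {
  size(800, 600, P3D);
  minim = new Minim(this);
  song = minim.loadFile("broadway_boogie_woogie.mp3");
  song.loop();
  fft = new FFT(song.bufferSize(), song.sampleRate());
}

void draw() {
  background(0);
  fft.forward(song.mix);            // analyze the current audio buffer
  float bass = fft.getBand(2);      // low-frequency energy
  float particleScale = map(bass, 0, 50, 1, 3);
  // feed particleScale (and other bands) into the particle system here
}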

What I Have Done:

  • JSON file:

For the JSON file, we use the Java default library to encode and decode the JSON files. Below is the JSON structure.

{
  "x": Int,
  "y": Int,
  "z": Int,
  "XRange": [Xmin, Xmax],
  "YRange": [Ymin, Ymax],
  "ZRange": [Zmin, Zmax]
}

For this JSON design, we made some tradeoffs. First, we don't need floats to store every point; they take much more time to read and to hold in memory. Instead, we store the points as Ints, which in our experience saves up to half of the memory. Second, we added the range data, which is used to map the figures into the box and prevent them from going out of it.
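
On the reading side, the stored ranges let us map() every Int back into the box, roughly like this (boxSize is an illustrative parameter; the real JSONPointCloud class does more than this):

ArrayList<PVector> loadPointCloudFrame(String path, float boxSize) {
  JSONObject frame = loadJSONObject(path);
  JSONArray xs = frame.getJSONArray("x");
  JSONArray ys = frame.getJSONArray("y");
  JSONArray zs = frame.getJSONArray("z");
  JSONArray xr = frame.getJSONArray("XRange");
  JSONArray yr = frame.getJSONArray("YRange");
  JSONArray zr = frame.getJSONArray("ZRange");

  ArrayList<PVector> points = new ArrayList<PVector>();
  for (int i = 0; i < xs.size(); i++) {
    // map every point from its recorded range into the box
    float x = map(xs.getInt(i), xr.getFloat(0), xr.getFloat(1), 0, boxSize);
    float y = map(ys.getInt(i), yr.getFloat(0), yr.getFloat(1), 0, boxSize);
    float z = map(zs.getInt(i), zr.getFloat(0), zr.getFloat(1), 0, boxSize);
    points.add(new PVector(x, y, z));
  }
  return points;
}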

One more reason for choosing JSON files is that a large amount of data flows between the Kinect and the computer, and we run real-time animation on top of it. If we read the Kinect data directly, our frame rate can become really low, and it also requires a lot of memory to store the point cloud.

  • Animator Class:

I designed the structure of the Animator class. Basically, the logic of the class is to have a chain of setting and calling animations. Below is the source code for this class; it basically follows the logic of Android animation.

  • OSC communication:

After we defined the JSON file format and the Animator class, we started working on the communication between the server side and the client side. Because the JSON files are large, we decided to avoid sending them between different computers; instead, we only use OSC to send messages between the different programs. Since I have done some socket programming and backend development, this process was not hard for me. One more thing we had to think about was how to let the client know when the server saves a new JSON file. At first, we wanted to use a listener to monitor the folder that holds the JSON files, but due to the limitations of Processing we couldn't make it work that way. Instead, we came up with another solution: use OSC to send a signal that notifies the client a new file has been added, as sketched below.
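
The notification itself is just one more OSC message, roughly like this (reusing the osc and client objects from the gesture sketch above; the address pattern and folder name are illustrative):

// server side, right after saveJSONObject():
void notifyNewFile(String fileName) {
  OscMessage m = new OscMessage("/recording/newFile");
  m.add(fileName);
  osc.send(m, client);
}

// client side: load the new recording from the shared folder
void oscEvent(OscMessage m) {
  if (m.checkAddrPattern("/recording/newFile")) {
    String fileName = m.get(0).stringValue();
    JSONObject recording = loadJSONObject("shared/" + fileName);
    // hand the recording over to JSONPointCloud here
  }
}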

  • Denoise:

To denoise, we first wanted to track the closest person using the skeleton data. However, when we implemented it, we found that the skeleton data has no Z value, so we calculate the distance from the neck to the spine and use it as a stand-in for depth. It actually works in practice.

However, after that, we found that some people just want to take a look at the project without being the player. They can be misrecognized as the player because they are close to the Kinect, even though they are far off the center of the Kinect's view. So every time we get a collection of the closest people, we choose the most central person as the player.

One more thing is the case where the user starts playing and then forgets to stop. The system checks whether the user has been absent from the Kinect's view for 5 continuous seconds; if so, the recording stops automatically. A sketch of that timer is below.
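
The timer itself is a small millis() check, roughly:

int lastSeen = 0;          // time the player was last detected
boolean recording = true;

void updateAbsence(boolean playerDetected) {
  if (playerDetected) {
    lastSeen = millis();
  } else if (recording && millis() - lastSeen > 5000) {
    recording = false;     // stopRecording() would be called here
  }
}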

Basically, these three measures are mainly designed for the live show, so that the experience can be rather robust.

Moreover, we use face tracking to get the face's roll, yaw and pitch, and we can further use these data to judge whether the user is shaking their head.
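
One way to do that, sketched here, is to count direction changes of the yaw within a short window; the yaw value itself comes from the face-tracking data, and the thresholds are illustrative:

float prevYaw = 0;
int lastDirection = 0;     // -1, 0 or +1
int directionChanges = 0;
int windowStart = 0;

boolean isShakingHead(float yaw) {
  if (abs(yaw - prevYaw) > 2) {                // ignore tiny jitters
    int direction = (yaw > prevYaw) ? 1 : -1;
    if (lastDirection != 0 && direction != lastDirection) directionChanges++;
    lastDirection = direction;
    prevYaw = yaw;
  }
  if (millis() - windowStart > 1000) {         // evaluate once per second
    boolean shaking = directionChanges >= 4;
    directionChanges = 0;
    windowStart = millis();
    return shaking;
  }
  return false;
}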

  • Performance:

Firstly, we used the Sound library from Processing, and it took a lot of time to handle the FFT and the sound playback. One significant drawback is that when the music starts to loop, the FFT doesn't work anymore. So we rewrote it with Minim, and life got better.

Reflection:

Firstly, never forget to use Minim when working with sound.

Secondly, a good denoise solution really helps the performance. Before we added the denoise procedure, the skeleton tracking was not very stable; after it, the experience works perfectly.

Thirdly, design the state machine in advance. We changed our state machine several times, and every time we just put more code into the states. In the end, it was hard to maintain the whole big mess of a state machine.