Extract Info from Human Walking Data with Dynamic Mode Decomposition

This is my first post, I’ll do something very simple: ADVERTISE the methodology that our applied mathematicians use in data analysis. Unlike statisticians or computer scientists, we usually start from ‘dynamical system’ point of view. To give you a taste of what I mean, a method called dynamic mode decomposition (DMD) will be offered as an example.

Appetizers: why you may care?

  1. If you are a data scientist, you must be very familiar with the Singular Value Decomposition (SVD) which helps you get a better feeling of what your high-dimensional data looks like by performing Principle Component Analysis (PCA). Now I claim: DMD is a similar methodology which is also light weighted, easy to learn and straightforward to apply. Why not put it into your data science toolbox? (At least use it for a quick check!)
  2. Consider you are dealing with time series data, you don’t want to simply fit a curve (eg. polynomials) to the data and extrapolate, do you? Since you know: to  better forecast the future, you need to extract information (eg. trend, seasonality, special effects) from the historical data. Time series model, like ARIMA model, is usually used for this reason, which can explain the mechanism of your data (i.e. how the data are generated). And DMD is similar, it can give you more physical insights  compared with curve fitting.
  3. One good thing about DMD is that you don’t need to manually de-trend your time series data or explicitly specify seasonal parameter as we typically do in time series modeling, which is a great plus.

Entree: what is Dynamic Mode Decomposition?

Dynamic mode decomposition is a dimensionality reduction algorithm, which was originally introduced in the fluid mechanics community. Similar to SVD which gives you the intrinsic coordinate system and the corresponding projections, DMD offers you specific spatial modes that evolve with different temporal behaviors. The basic idea is to find the best matrix \(A\) (in the least square sense) such that \(x_{k+1}=Ax_k\). (i.e. find a linear   approximation that send you from your current state to the next.)

Robert Taylor has already written up a nice tutorial for DMD, I’ll just direct you to his blog here. But since he only tests this method on his proposed toy dataset, to serve as a supplementary, I provide an example using the real-world data, which is the motion capture data (walking) obtained from here. Also, to know more details about DMD, I recommend the paper here.

Salad: walking motion analysis with DMD

(Note: All the Matlab code below can be found in my GitHub.  🙂

Dataset

The data we are using is the first trial of CMU Mocap subject #07. We use the c3d file format, which basically means it contains the 3D coordinate data of each time frame. Sensors are placed on each part of the human body. For simplicity purposes, we only consider a subset of these sensors.

The number of sensors I pick is 18, and each sensor provides (x,y,z) information. The total number of time frames is 316 (the gap between each time frame is 1/120 seconds). So basically we have a matrix of size (54,316) and each column represents a time frame.

To give you an intuition of what the sensor data looks like, here I present nine of them to you:

Some seemingly linear trend, some periodicity, … not that bad, huh? Now, we are good to go.

DMD

If you come back from Robert Taylor’s blog on DMD, there should be no surprise that the following Matlab script is an implementation of DMD:

function [mu, Phi] = dmd(X, Y, truncate)
    if truncate == 0
        mu = [];
        Phi = [];
    else
        r = truncate;
        %
        [U,S,V] = svd(X,'econ');
        Ur = U(:,1:r);
        Sr = S(1:r,1:r);
        Vr = V(:,1:r);
        % 
        Atilde = Ur'*Y*Vr/Sr;
        [W,D] = eig(Atilde);
        mu = diag(D); % frequencies
        Phi = Y*Vr/Sr*W; % DMD modes
    end
end

Now, we apply the DMD algorithm on our data:

r = 10; % you can pick this number your self to threshold
[mu,Phi] = dmd(data(:,1:end-1),data(:,2:end),r);

the output \(\mu\) represents the temporal modes and the output \(\Phi\) represents the spatial modes.

Now, we determine the initial condition b and ‘merge’ it into temporal modes:

b = Phi\data(:,1);
for iter = 1:T
    Psi(:,iter) = b.*mu.^(iter-1);
end

then we can reconstruct our original signals by doing:

X = Phi*Psi;

Here, \(X\) is a reconstructed version of the human ‘walking’ motion. (Again, this is a quick walk-through, if any lines of explaination above don’t make sense to you, you should check Robert Taylor’s blog).

 Results

Our results show that the DMD algorithm can fit the data very well: the solid lines are ground truth and the black dotted lines are reconstructions (if you are not satisfied, you can increase the truncation parameter r to achieve better fit), meaning the motion itself can be pretty much explained by the linear system \(x_{k+1}=Ax_k\):

To get a better feeling of what we have reconstructed, here is an animation:

Now you see that the walking motion itself can be well-captured by such a linear system (I should be more careful on this, because if you generate more time frames, you may observe exponentially decay of the skeleton. This is not surprising though, because the model itself is essentially linear, it can only exhibit exponential growth, exponential decay and oscillations.).

 Interpret the Results

Then what? You can also reconstruct the walking motion with SVD! What’s so special about DMD then? Let’s look into the  spatial and temporal modes we get from the model:

Here’s a visualization of the distribution of frequencies of temporal modes (complex plane):

Note: the frequencies are not the values of \(\mu’s\), but the \(log(\mu)/dt\).

w = log(mu)/dt;
plot(w,'r*','Markersize',12); grid on; axis equal;

This is because the frequency concept comes from the continuous-time dynamical system and our process \(x_{k+1}=Ax_k\) is a discrete-time version of the dynamical system \(\frac{dx}{dt}=\tilde{A} x\). The latter one has the solution of the form \(x=\Phi(0)e^{\tilde{A}t}\), where the exponential term captures the frequencies if you consider this in the complex domain.

Now the question is, what does these temporal modes of different frequencies represent? To illustrate, I group different temporal modes using circles of different colors.

And for convenience, I also sort the modes with respect to their amplitude of frequencies.

[~, order] = sort(abs(w));
mu = mu(order);
w = w(order);
Phi = Phi(:,order);

Since the horizontal axis is the real axis and the vertical axis is the imaginary axis, the further the mode deviate from the horizontal axis, the larger the frequency. Therefore, in this plot, the coupled modes in the grey circles oscillate relatively fast, whereas the nodes inside the blue circle have almost no oscillations. And notice that the modes in the blue circle are very close to the origin, meaning they are very persistent, barely grow or decay or oscillate throughout the time, they may keep some ‘invariant’ properties of the walking motion and are very important.

Another signature from this plot is that almost all the modes are purely oscillating, except the ones in the green circles. These two modes in the green circles are on the left of the vertical axis, meaning that they exponentially decay as time proceeds. This makes us conjecture that they may not be that important since we don’t expect any sort of features vanish in a steady walking.

Our results further confirm our conjectures:

  • Conclusion 1: The modes in the blue circle corresponds to the skeleton of the body moving forward, which is the most important signature intuitively. Here’s a reconstruction with only these two modes:
X = Phi(:,1:2)*Psi(1:2,:);

To justify the importance of the modes in the blue circle, we further animate the reconstructions only with the modes in the green circles or the yellow circles:

X1 = Phi(:,3:4)*Psi(3:4,:);
X2 = Phi(:,5:6)*Psi(5:6,:);

We can see the animations don’t make sense at all: the first one starts from a mess and gradually shrinks to a point, which is in agreement with the exponential decay; the second one is doing some spider-like movement but we don’t see a human at all.

So, in order to interpret the other modes, we need to consider their combined effects with the modes in the blue circle. For convenience, let’s call them skeleton modes.

  • Conclusion 2: the modes in the green circles don’t count:

This is well-expected, because the exponential decay already tells us these two modes will vanish as time proceeds: they only contribute a bit in the beginning of the motion. (the guess is they may come from intricacies when a person ‘starts’ to walk.) All these can be justified by observing the reconstruction with the skeleton modes and the modes in the green circles:

X = Phi(:,1:2)*Psi(1:2,:)+Phi(:,3:4)*Psi(3:4,:);
  • Conclusion 3: the modes in the yellow circles correspond the the intrinsic walking frequency.

These two modes are the most interesting ones. When you add them to the skeleton modes, it provides a reasonable walking motion, which is awesome! Let’s call them intrinsic walking modes. Note that these two modes are purely oscillating, and from the above spider-like movement we can see they only capture the (nearly) periodic movements of the four limbs, which really makes sense. We can infer the walking speed from these intrinsic walking modes, and this is what SVD CANNOT DO! Let’s take a look at the reconstruction:

X = Phi(:,1:2)*Psi(1:2,:)+Phi(:,5:6)*Psi(5:6,:);
  • Conclusion 4: other modes model other high-resolution details.

Since the intrinsic walking modes are already found, other modes are of less interests. They just serve to add extra details, but no bother to take a look (in fact, they are interesting!). For example, the modes in the grey circles give us something like a ‘zombie jump‘. If we use some hindsight, this is indeed correct, huh? Here’s an animation of skeleton modes + zombie modes.

X = Phi(:,1:2)*Psi(1:2,:)+Phi(:,7:8)*Psi(7:8,:);

Dessert: what else?

Despite the success, I don’t want to lie to the readers: you cannot easily obtain such nice interpretations on more complex human motions. This is because DMD has limited power as mentioned earlier. One takeaway is: they are extremely useful in modeling the (almost) periodic signals (eg. walking, running, swimming, punching, etc.). For motions of other kinds, typically it will fail (think about jumping), let alone the mixture of these motions.

One way out is to consider multi-resolution Dynamic Mode Decomposition (mrDMD). (Again, I encourage you to read the paper or Robert Taylor’s blog on mrDMD.) However, there will be more parameter tunings when you use it, and also some tradeoffs.

2 Replies to “Extract Info from Human Walking Data with Dynamic Mode Decomposition”

  1. Hi Yuying,
    This is Daning from U of Michigan. My research involves data-driven modeling in aerospace engineering applications. I came across your website via Zhihu and found your post very interesting. I will keep an eye on your research 🙂

Leave a Reply

Your email address will not be published.