Design of a neural network for the analysis of the trajectory of people

Hello everyone,

I am taking my first steps in the interesting world of neural networks. I am currently involved in a project where I need to obtain some preliminary results, and I think a neural network would fit very well. However, I am a bit lost, so I need some advice and/or help. Here is some context on what we want to do.

The project is framed within the theme of wellbeing in buildings, specifically the occupancy level of the different rooms in a building. For this purpose, a multitude of devices have been deployed in the building (presence sensors, cameras, pressure sensors on chairs, electricity consumption meters, etc.). Their information will, after being properly processed, be combined and interpreted through symbolic knowledge.

One of the devices that has been deployed, the source of this post, is a Kinect. Through Kinect and its SDK, we store the coordinates of the detected users’ joints and their orientation. The main purpose of the Kinect is to allow us to detect when someone leaves one room and enters another, identifying the rooms of origin and destination. Below I show a figure from the perspective of the Kinect.

In the following snippet I show an example of the data, stored as JSON. Every time a user changes room, we obtain a set of frames composed of the position (in the Kinect’s reference system) of the SPIN_BASE joint (taken as the reference in this work) and its orientation. We consider that tracking that single joint is enough, which reduces the necessary computation.

{
  "skeleton_id": "id_123456789",
  "frames": [
    {
      "timestamp": VALUE,
      "point": {
        "x": VALUE,
        "y": VALUE,
        "z": VALUE
      },
      "orientation": {
        "x": VALUE,
        "y": VALUE,
        "z": VALUE,
        "w": VALUE
      }
    },

    ...

    {
      "timestamp": VALUE,
      "point": {
        "x": VALUE,
        "y": VALUE,
        "z": VALUE
      },
      "orientation": {
        "x": VALUE,
        "y": VALUE,
        "z": VALUE,
        "w": VALUE
      }
    }
  ]
}

My difficulties come when designing a neural network that can learn patterns to identify the source and destination rooms of users from these sets of (point, orientation) pairs. I have spent several weeks reading up on the subject and I have drawn some conclusions:

  • The order of the data in this case is of great importance, so the best option is an RNN.

  • I assume that the data should be pre-processed to make it more representative. Apart from the usual steps (such as normalisation), I think it should be processed in some way - perhaps by representing the sets of points as trajectories?
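As a minimal sketch of the pre-processing idea above, assuming the JSON layout shown earlier with numeric values in place of VALUE: the snippet orders the frames by timestamp, keeps only the floor-plane coordinates (x, z), and scales them to [0, 1]. It assumes that the Kinect's y axis points up, that the sensor is roughly level, and that the scene bounds are known. The function names, the bounds, and the sample record are hypothetical.

```python
import json

def load_trajectory(record_json):
    """Return the (x, z) floor-plane path of one skeleton, ordered by timestamp."""
    record = json.loads(record_json)
    frames = sorted(record["frames"], key=lambda f: f["timestamp"])
    return [(f["point"]["x"], f["point"]["z"]) for f in frames]

def normalise(path, x_range, z_range):
    """Scale coordinates to [0, 1] using fixed scene bounds (assumed known)."""
    (x_min, x_max), (z_min, z_max) = x_range, z_range
    return [((x - x_min) / (x_max - x_min), (z - z_min) / (z_max - z_min))
            for x, z in path]

# Hypothetical record with made-up values; frames are deliberately out of order.
sample = json.dumps({
    "skeleton_id": "id_123456789",
    "frames": [
        {"timestamp": 2, "point": {"x": 1.0, "y": 0.9, "z": 4.0},
         "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0}},
        {"timestamp": 1, "point": {"x": -1.0, "y": 0.9, "z": 2.0},
         "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0}},
    ],
})

path = load_trajectory(sample)
scaled = normalise(path, x_range=(-2.0, 2.0), z_range=(0.0, 4.0))
print(scaled)
```

Using fixed bounds rather than per-trajectory minima keeps the absolute position, which is what distinguishes one doorway from another.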

Due to my limited experience, I am lost in network design and data preparation. So any help or advice would be very welcome.

Hi,

This is what I’d do:

  • Create a model that, given a sequence of positions, predicts the next (x,y) position based on historical data.
  • Make this model generate maybe 10 points ahead and use the generated path to decide which room the person is going to.

The part about deciding the room I’d do in regular code for now (is (x,y) inside room 1?).
For the prediction I’d try an RNN (or an LSTM, to be more precise) and look into how to predict any time series (e.g. weather). My gut suggests that they are similar problems.
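A rough sketch of the data flow this suggests, under stated assumptions: `make_windows` builds the (history, next point) pairs a sequence model would be trained on, and `rollout` feeds each prediction back in to generate points ahead. The `constant_velocity` function is only a placeholder standing in for a trained LSTM, and all names and sample values are hypothetical.

```python
def make_windows(path, history):
    """Split a path into (input window, next point) training pairs."""
    return [(path[i:i + history], path[i + history])
            for i in range(len(path) - history)]

def constant_velocity(window):
    """Placeholder predictor: repeat the last step. A trained model would go here."""
    (x0, y0), (x1, y1) = window[-2], window[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def rollout(path, predict_next, history, steps):
    """Feed each prediction back in to generate `steps` future points."""
    window = list(path[-history:])
    future = []
    for _ in range(steps):
        nxt = predict_next(window)
        future.append(nxt)
        window = window[1:] + [nxt]  # slide the window forward by one point
    return future

# Made-up observed positions on the floor plane.
observed = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]
pairs = make_windows(observed, history=3)
future = rollout(observed, constant_velocity, history=3, steps=10)
print(pairs[0])
print(future[-1])
```

The generated points can then be handed to ordinary code that checks which room they fall in.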

For example, here’s a video of a friend doing some basics on trajectory prediction: Shooting Hoops with Keras and TensorFlow || Zack Akil - YouTube
(it’s old but it’s just to get the idea)

I hope this can give you some ideas.


I would also suggest taking a look at:

And, more generally, at some adopted solutions, available implementations, and benchmarks in trajectory prediction at:

If your cameras are fully calibrated and you have a 2D perimeter representation of the rooms, you can project the position onto the 2D plane of the floor and run a simple point-in-polygon test to check whether a person is inside a specific room perimeter.
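A minimal sketch of such a test, using the standard ray-casting method and assuming the position has already been projected onto the floor plane. The room polygons, their names, and the helper `room_of` are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many edges a ray going right from `point` crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def room_of(point, rooms):
    """Return the name of the first room containing `point`, or None."""
    for name, polygon in rooms.items():
        if point_in_polygon(point, polygon):
            return name
    return None

# Hypothetical floor plan: two adjacent rectangular rooms, in metres.
rooms = {
    "room_1": [(0, 0), (4, 0), (4, 3), (0, 3)],
    "room_2": [(4, 0), (8, 0), (8, 3), (4, 3)],
}
print(room_of((1.0, 1.0), rooms))  # room_1
print(room_of((6.0, 2.0), rooms))  # room_2
print(room_of((9.0, 1.0), rooms))  # None
```

Points lying exactly on a wall are not handled specially here, so a real floor plan may need a tolerance around the boundaries.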


First of all, thank you very much for your time and help.

This is a very good idea. One of the advantages of this approach is that it is not necessary to label the time series in the dataset. Given my limited experience, I have been looking for an example that I can take as a reference. However, I would have to take the following aspects into account:

1.) Find an elegant way to associate the estimated points to one of the rooms.

2.) This solution estimates future points of the trajectory to identify the room the person is entering (the destination). However, I would also need a way to identify the room the person came from (the origin of the trajectory). I have thought of reversing the sequence of points and estimating the trajectory in the same way, with the difference that this time the result identifies the origin room and not the destination room.

The other strategy I had thought of is somewhat different. My idea was to label each time series with a pair (source room, destination room), normalize the points and orientations, and have the network try to find a pattern: “When the first points are near ZONE X and the last points are near ZONE Y, then the classification should be (ZONE X, ZONE Y)”. What do you think?
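To make this second strategy a little more concrete, here is a small sketch of how the labelled dataset could be prepared, assuming each trajectory is a list of floor-plane points: every path is resampled to a fixed length so that all inputs have the same shape, and each (source room, destination room) pair is mapped to an integer class. The function names and sample data are hypothetical, and the classifier itself is not shown.

```python
def resample(path, n_points):
    """Linearly interpolate a path to exactly n_points (n_points >= 2)."""
    if len(path) == 1:
        return [path[0]] * n_points
    out = []
    for k in range(n_points):
        t = k * (len(path) - 1) / (n_points - 1)  # position along the original path
        i = min(int(t), len(path) - 2)
        frac = t - i
        (x0, y0), (x1, y1) = path[i], path[i + 1]
        out.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return out

def encode_labels(pairs):
    """Map each distinct (source, destination) pair to an integer class id."""
    classes = sorted(set(pairs))
    index = {pair: i for i, pair in enumerate(classes)}
    return [index[p] for p in pairs], classes

# Two made-up trajectories of different lengths, with their labels.
paths = [[(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)],
         [(4.0, 1.0), (0.0, 1.0)]]
labels = [("ZONE_X", "ZONE_Y"), ("ZONE_Y", "ZONE_X")]

features = [resample(p, 5) for p in paths]
targets, classes = encode_labels(labels)
print(features[0])
print(targets, classes)
```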

I apologize for my English, I know it should be much better.

So, as a step zero, I wouldn’t handle the rooms in the model at all; I’d calculate them with classical programming.
Only the trajectory would be handled by ML.
