Inverse kinematic approximation with neural network

Aristide_Martello · September 14, 2021, 3:44pm

Good morning everyone,
I’ll briefly explain the context and then the problem I’m facing.

Context: I am using and testing a collaborative robot. It came with a Python library that lets me acquire signals from the robot (currents, velocities, positions, I/O, etc.) and command it in both joint and end-effector (EE) coordinates. Direct and Inverse Kinematics functions (DK and IK) are also available.
Out of curiosity, I wanted to generate a trajectory (in end-effector coordinates) that moves the end-effector within a conical area [I attach a link to a video showing the movement in question].

LINK: https://www.youtube.com/watch?v=CExtMfvRabo

The robot can also save a .csv file containing the trajectory of each joint, in joint coordinates.

Initially I did not know the “shape” the trajectory (in end-effector coordinates) should have, so I moved the robot by hand in gravity compensation mode and recorded the trajectories of the individual joints. I then ran the Direct Kinematics algorithm on them to obtain the resulting end-effector motion [I attach two 3D graphs: the first plots the three coordinates x, y, z and the second plots roll, pitch, yaw].

[Image: End Effector Angular Displacement]

[Image: End Effector Position Displacement]

This is where the problem arose.

Problem: out of curiosity, I ran the Inverse Kinematics algorithm on the points obtained from the DK, and it returned the error “Singular Trajectory”. Since the robot was physically able to follow that trajectory, the problem must lie in the IK computation, which probably finds multiple or infinitely many solutions.
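To illustrate why a single end-effector pose can correspond to several joint configurations, here is a minimal sketch on a hypothetical planar 2-link arm. It is not the robot in question: the link lengths and the function names are assumptions chosen for illustration only.

```python
import numpy as np


def forward_kinematics(q1, q2, l1=1.0, l2=0.8):
    """End-effector (x, y) of a planar 2-link arm (toy model, not the real robot)."""
    x = l1 * np.cos(q1) + l2 * np.cos(q1 + q2)
    y = l1 * np.sin(q1) + l2 * np.sin(q1 + q2)
    return np.array([x, y])


def mirrored_solution(q1, q2, l1=1.0, l2=0.8):
    """Return the other elbow configuration that reaches the same point."""
    beta = np.arctan2(l2 * np.sin(q2), l1 + l2 * np.cos(q2))
    return q1 + 2.0 * beta, -q2


def jacobian_det(q2, l1=1.0, l2=0.8):
    """Determinant of the position Jacobian; zero indicates a singular pose."""
    return l1 * l2 * np.sin(q2)


elbow_a = (0.3, 0.9)                    # arbitrary joint angles in radians
elbow_b = mirrored_solution(*elbow_a)   # different angles, same end-effector point

pose_a = forward_kinematics(*elbow_a)
pose_b = forward_kinematics(*elbow_b)
print("joint sets:", elbow_a, elbow_b)
print("poses:", pose_a, pose_b)
print("Jacobian determinant with the arm fully stretched:", jacobian_det(0.0))
```

Two different joint vectors give the same pose here, so a regression from pose to joints has no unique target unless the data stays on one branch, and the determinant vanishes when the toy arm is fully stretched.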
To work around this limitation, I used a neural network written in Python with TensorFlow (Keras) to try to approximate the IK. I should say up front that I had never used Keras or TensorFlow before, so I may have made some conceptual errors. I consulted the Keras API documentation and also the guide at this link:

LINK: https://machinelearningmastery.com/deep-learning-models-for-multi-output-regression/

On my PC I use:

Visual Studio Code for programming in Python;
Python 3.9.5;
Keras 2.6.0.

I designed the neural network as follows: 6 input nodes (the 6 coordinates of the end-effector) and 6 output nodes (the 6 joint coordinates). The training set is a .csv file containing the 6 end-effector coordinates, computed by running the DK on a .csv file containing the trajectories of the 6 joints. The file containing the joint coordinates is the label file.
The code of the network implementation is below.

from numpy import loadtxt
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf
from numpy import array


# Model definition
def get_model(I_N_L_1, I_N_L_2, I_N_L_3, I_N_L_4, I_N_L_5 ,n_inputs, n_outputs):

	model = Sequential()
	model.add(Dense(I_N_L_1, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
	model.add(Dense(I_N_L_2, activation='relu'))
	model.add(Dense(I_N_L_3, activation='relu'))
	model.add(Dense(I_N_L_4, activation='relu'))
	model.add(Dense(I_N_L_5, activation='relu'))
	model.add(Dense(n_outputs))    
	model.compile(loss='mae', optimizer='adam', metrics=["mae"])
	return model


# Load training set csv (end-effector poses computed via DK)
dataset_EF = loadtxt('EF_from_WeldingProve.csv', delimiter=',')
x_train = dataset_EF[0:1700,0:6]
print('shape: ',x_train.shape)

# Load label set csv (recorded joint trajectories)
dataset_joints = loadtxt('WeldingProve.csv', delimiter=',')
y_train = dataset_joints[0:1700,0:6]
print('shape: ',y_train.shape)

# Test set definition (starts at row 1700 so that no sample is skipped)
x_test = dataset_EF[1700:,0:6]
print('shape: ',x_test.shape)

# Label of the test set definition
y_test = dataset_joints[1700:,0:6]
print('shape: ',y_test.shape)

# Number of nodes in the hidden layers
I_N_L_1 = 192
I_N_L_2 = 36
I_N_L_3 = 6
I_N_L_4 = 36
I_N_L_5 = 192

# Number of nodes in the input and output layers
n_inputs = 6
n_outputs = 6


# calling the "get_model" function
model = get_model(I_N_L_1, I_N_L_2, I_N_L_3, I_N_L_4, I_N_L_5 ,n_inputs, n_outputs)
es = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)

# fit model (the early-stopping callback must be passed to fit() to take effect)
model.fit(x_train, y_train, verbose=1, epochs=600, callbacks=[es])

# saving the model
model.save("Test_Model.h5")

# Testing procedure: predict the whole test set in a single batched call
pred = model.predict(x_test)

# Per-sample, per-joint prediction error
errors = pred - y_test

# Mean of the error for each predicted joint trajectory
average_vector = errors.mean(axis=0)
print('average_vector: ', average_vector)

# Standard deviation of the error for each predicted joint trajectory
std_vector = errors.std(axis=0)
print('std_vector: ', std_vector)

My questions are the following:

Once the neural network is trained, even with a very large training set, the predictions are not good. Can you suggest how to improve them, perhaps by changing the network’s parameters?
Is it necessary to pre-process the training data and its labels? If so, which technique should I apply?
When I change the number of nodes in the various layers, the performance changes, sometimes a lot. Do you have any advice on the “shape” to give the network?
Are there other solutions that can be used to estimate the robot’s IK?

Ekaterina_Dranitsyna · September 15, 2021, 9:10am

I’m not an expert in robotics, so my comments concern only the model architecture and training.

When you call model.fit() you can pass your test data to the “validation_data” argument, and the model will automatically calculate the loss and metrics for both the training and validation sets. TensorFlow has MeanSquaredError and MeanAbsolutePercentageError in addition to the MAE that you use.
In the EarlyStopping callback you should set monitor=‘val_loss’ and restore_best_weights=True. Training will then stop when the validation loss starts worsening, and the model will roll back to the state with the best val_loss. At present you monitor the training loss, which says nothing about overfitting.
Check the scale of the coordinates used as input features. If they are not in the range 0-1, the input data requires normalization. Keras has a Normalization layer, which can be used to ensure that all data passed to the model is normalized identically.
Usually the number of units in the dense layers gradually decreases. You defined 5 layers with units decreasing and then increasing, like a V shape.
If all this does not improve the result, you should probably add more features, such as previous positions of the object, its speed, or something else.
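The first two points can be sketched roughly as follows. This is only an illustration on synthetic stand-in data, not your robot data: the array contents, layer sizes, patience and epoch count are all assumptions.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data (assumption): 6 pose inputs, 6 joint targets.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(1000, 6)).astype("float32")
y = np.tanh(x @ rng.normal(size=(6, 6))).astype("float32")
x_train, y_train = x[:850], y[:850]
x_val, y_val = x[850:], y[850:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(6,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(6),  # no activation: linear output for regression
])
model.compile(
    loss="mae",
    optimizer="adam",
    metrics=[
        tf.keras.metrics.MeanSquaredError(),
        tf.keras.metrics.MeanAbsolutePercentageError(),
    ],
)

# Stop when the validation loss stops improving and restore the best weights.
es = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),  # loss and metrics are reported on this set too
    epochs=30, verbose=0, callbacks=[es],
)
print(sorted(history.history.keys()))
```

The history object then contains both the training and the validation curves, which is what the early-stopping callback relies on.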

Aristide_Martello · September 15, 2021, 9:31am

Thank you very much for these tips. I intend to implement them as soon as possible and see whether I get any improvement.

Thanks again for your kindness and availability.

Bhack · September 15, 2021, 11:47am

I think you could also explore an RL approach.

This is a large-scale library, so it probably doesn’t fit your use case:

But, as suggested in the Readme, you could try looking at the Dopamine or tf-agents repos.

I also suggest taking a look at:

Aristide_Martello · September 16, 2021, 3:52pm

I think I have implemented the changes you suggested; the updated code is attached below.
The first results show that the predictions have improved dramatically, but I need some clarification about some of the parameters and functions used in the neural network.
Note: I have a dataset of 16400 samples (a .csv file containing a 16400x6 matrix).

from numpy import loadtxt
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf
from keras.layers import LayerNormalization
from numpy import array


# Model definition
def get_model(I_N_L_1, I_N_L_2, I_N_L_3, I_N_L_4, I_N_L_5, I_N_L_6, n_inputs, n_outputs):

	model = Sequential()
	model.add(Dense(I_N_L_1, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
	model.add(LayerNormalization(axis=-1 , center=True , scale=True))
	model.add(Dense(I_N_L_2, activation='relu'))
	model.add(Dense(I_N_L_3, activation='relu'))
	model.add(Dense(I_N_L_4, activation='relu'))
	model.add(Dense(I_N_L_5, activation='relu'))
	model.add(Dense(I_N_L_6, activation='relu'))
	model.add(Dense(n_outputs))    
	model.compile(loss='mae', optimizer='adam', metrics=["mae"])
	return model


print('start reading CSV')
# Load Training set csv
dataset_EF = loadtxt('EF_from_Welding2.csv', delimiter=',')
x_train = dataset_EF[0:12000,0:6]
print('shape: ',x_train.shape)

# Load Label set csv
dataset_joints = loadtxt('Welding2.csv', delimiter=',')
y_train = dataset_joints[0:12000,0:6]
print('shape: ',y_train.shape)

# Validation set definition (starts at row 12000 so that no sample is skipped)
x_val = dataset_EF[12000:14000,0:6]
print('shape: ',x_val.shape)

# Label of the validation set definition
y_val = dataset_joints[12000:14000,0:6]
print('shape: ',y_val.shape)


# Test set definition (starts at row 14000 so that no sample is skipped)
x_test = dataset_EF[14000:,0:6]
print('shape: ',x_test.shape)

# Label of the test set definition
y_test = dataset_joints[14000:,0:6]
print('shape: ',y_test.shape)


print('end reading CSV')

# Number of nodes in the hidden layers
I_N_L_1 = 700
I_N_L_2 = 450
I_N_L_3 = 300
I_N_L_4 = 150
I_N_L_5 = 75
I_N_L_6 = 15

# Number of nodes in the input and output layers
n_inputs = 6
n_outputs = 6

print('start model and training')
# calling the "get_model" function
model = get_model(I_N_L_1, I_N_L_2, I_N_L_3, I_N_L_4, I_N_L_5, I_N_L_6, n_inputs, n_outputs)

# fit model
es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, mode='min', restore_best_weights=True, verbose=1)
model.fit(x_train, y_train, validation_data=(x_val, y_val), verbose=1, epochs=100, callbacks=[es]) 
print('end model and training')

print('start saving model')
# saving the model
model.save("ModelloDiProva1.h5")
print('end saving model')


print('start predictions')
# Testing procedure: predict the whole test set in a single batched call
pred = model.predict(x_test)
print('end predictions')

print('start validation')
# Per-sample, per-joint prediction error
errors = pred - y_test

# Mean of the error for each predicted joint trajectory
average_vector = errors.mean(axis=0)
print('average_vector: ', average_vector)

# Standard deviation of the error for each predicted joint trajectory
std_vector = errors.std(axis=0)
print('std_vector: ', std_vector)
print('end validation')

As per your advice, I have changed the “shape” of the neural network, starting with an initial layer of 700 nodes and ending with a penultimate layer of 15 nodes. Do you have any advice or rule of thumb for the number of layers and the number of nodes per layer, depending on the position of the layer within the network?
I have divided my dataset as follows: 12000 samples for the training set, 2000 samples for the validation set and the remaining samples for the test set. Is this a sensible choice, or is it better for all 3 sets to have the same size?
So far, every layer uses the “relu” activation function. I tried other activation functions and got less precise results. In several examples online I see that the output layer uses a different activation function from the previous layers. Is there a way to choose the best activation function for a given problem? Why is a different activation function used in the output layer than in the other layers?
Also in online examples, I have seen the “Dropout()” layer being used. I understand that it is meant to prevent overfitting by acting randomly on the weights of a particular layer of the network. Could it be useful in my network too? If so, does it have to go between two specific layers, or is it a matter of trial and error?

Regarding the normalization of the input, I used the “LayerNormalization” layer:

Is it necessary to insert it only once, or in multiple layers of the network?
There is also a normalization layer called “BatchNormalization”, but I could not understand the difference between the two.

Thanks in advance for your attention

Ekaterina_Dranitsyna · September 17, 2021, 7:42am

Hi! I’m glad you got some positive results.
When I wrote about normalizing the input data, I meant the preprocessing.Normalization layer. You can find an example in this tutorial: Classify structured data using Keras preprocessing layers | TensorFlow Core
This layer should be initialized and adapted to the training subset of your data. It should then be used inside the model as the first layer, or as the second layer following the Input layer.
Training and validation sets do not need to be of equal size. Using 10% to 15% of the data for validation is normal, especially if you have a very small data set.
Before using any technique to prevent overfitting, you need to find out whether the model actually overfits. To do that, plot the training and validation loss and inspect the chart. Here is a tutorial on this subject: Overfit and underfit | TensorFlow Core
If you use Dropout layers, they are added after all or some of the inner dense layers (not after the final dense layer).
As for activations, they depend on the position of the dense layer and on the task. In the inner dense layers you can use “relu”, “elu” or “selu”. In the final layer of a regression task you do not specify an activation, which is equivalent to the “linear” activation. If you had a classification task, the final layer would have a “softmax” or “sigmoid” activation, depending on the number of classes.
The optimal architecture of the network is found by trial and error. You can experiment manually or use KerasTuner to explore parameter combinations automatically.
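Put together, these points might look roughly like the sketch below. It runs on synthetic stand-in data; the layer sizes, dropout rate, 85/15 split and epoch count are assumptions for illustration, not recommended values. Note that the Normalization layer standardizes each feature to zero mean and unit variance rather than scaling it into the 0-1 range.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data (assumption): inputs on very different scales.
rng = np.random.default_rng(1)
z = rng.normal(size=(1200, 6))
x = (z * np.array([1.0, 10.0, 100.0, 1.0, 10.0, 100.0]) + 50.0).astype("float32")
y = np.tanh(z @ rng.normal(size=(6, 6))).astype("float32")

n_train = int(0.85 * len(x))  # about 15% of the samples kept for validation
x_train, y_train = x[:n_train], y[:n_train]
x_val, y_val = x[n_train:], y[n_train:]

# Adapt the normalization statistics on the training subset only.
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(x_train)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(6,)),
    norm,                                   # first layer after the Input
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.1),           # after an inner dense layer
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(6),               # linear output, no Dropout after it
])
model.compile(loss="mae", optimizer="adam")

history = model.fit(
    x_train, y_train, validation_data=(x_val, y_val), epochs=10, verbose=0
)

# Compare the two curves: a validation loss that keeps rising while the
# training loss keeps falling is the usual sign of overfitting.
train_loss = history.history["loss"]
val_loss = history.history["val_loss"]
print("final train loss:", train_loss[-1], "final val loss:", val_loss[-1])
```

The training loss is computed with dropout active, so it can sit slightly above the validation loss; this by itself does not indicate a problem.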

Hi_n_Nguy_n_Van · September 18, 2021, 1:27am

@Aristide_Martello
Hi, could you share WeldingProve.csv and EF_from_WeldingProve.csv?
Thanks

Aristide_Martello · September 21, 2021, 9:24am

Good morning,
sorry for the late reply. I am having problems sharing the files. Can I send them to you via email?

Hi_n_Nguy_n_Van · September 21, 2021, 12:59pm

My email: Removed by moderator
Thank you very much!

Rizka_Ardiantika · February 13, 2023, 8:53am

Hi, could you share WeldingProve.csv and EF_from_WeldingProve.csv?
Thanks, that would help me a lot.

Rizka_Ardiantika · February 22, 2023, 12:18am

Hello, where can I find the EF_from_WeldingProve.csv file?

Rizka_Ardiantika · May 1, 2023, 1:16am

Of course. This is my email: Removed by moderator