Unequal Distribution of Training Diastolic Blood Pressure Labels

My task is to estimate diastolic blood pressure (DBP) from PPG and ECG data. However, the DBP labels are unevenly distributed.

I am using diastolic blood pressure as the training label. The range I am considering is 16 mmHg to 276 mmHg; however, the majority of samples lie between 45 mmHg and 110 mmHg. Because of this, the model learns this range well but does not predict values outside it. What could be a solution to this problem?


You can try to upsample the minority ranges (the under-represented DBP values) and include more samples from them if possible.
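As a rough illustration of upsampling for a continuous label, one option is to split the DBP values into bins and resample the sparse bins with replacement until every non-empty bin matches the largest one. This is only a sketch under assumptions: the function name upsample_by_bin, the bin edges 45 and 110 mmHg, and the toy arrays are hypothetical, and X stands for whatever features are extracted from the PPG and ECG signals.

```python
import numpy as np

def upsample_by_bin(X, y, bin_edges, seed=0):
    """Resample with replacement so every non-empty label bin
    ends up with as many samples as the largest bin."""
    rng = np.random.default_rng(seed)
    # np.digitize gives each label the index of the bin it falls into
    bin_ids = np.digitize(y, bin_edges)
    ids, counts = np.unique(bin_ids, return_counts=True)
    target = counts.max()
    keep = []
    for b, n in zip(ids, counts):
        members = np.flatnonzero(bin_ids == b)
        # Keep every original sample, then draw the shortfall with replacement
        extra = rng.choice(members, size=target - n, replace=True)
        keep.append(np.concatenate([members, extra]))
    idx = rng.permutation(np.concatenate(keep))
    return X[idx], y[idx]

# Toy data (hypothetical): 2 low, 6 mid-range and 1 high DBP label, in mmHg
y = np.array([30.0, 40.0, 60.0, 70.0, 80.0, 90.0, 100.0, 105.0, 150.0])
X = np.arange(len(y) * 2, dtype=float).reshape(len(y), 2)
X_up, y_up = upsample_by_bin(X, y, bin_edges=[45.0, 110.0])
```

Only the training split should be resampled this way; the validation and test data should keep their original distribution so the reported error stays meaningful.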

Another way is to use a weighted loss function that gives more weight to the minority ranges during training.

You can find more information about weighted loss in the Keras example "Imbalanced classification: credit card fraud detection".
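That example deals with two classes, so for a continuous DBP label the weights have to come from somewhere else. One possible sketch, assuming equal-width bins over the label range, is to weight each sample by the inverse of how many samples share its bin. The function name inverse_frequency_weights, the bin count and the toy labels are hypothetical.

```python
import numpy as np

def inverse_frequency_weights(y, n_bins=10):
    """Per-sample weights proportional to 1 / (number of samples in the same label bin)."""
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Use interior edges only, so bin indices run from 0 to n_bins - 1
    bin_ids = np.digitize(y, edges[1:-1])
    counts = np.bincount(bin_ids, minlength=n_bins)
    weights = 1.0 / counts[bin_ids]
    # Normalise to mean 1 so the overall loss scale stays comparable
    return weights / weights.mean()

# Toy labels (hypothetical), in mmHg: most values sit in the middle bin
y_train = np.array([30.0, 60.0, 70.0, 80.0, 90.0, 100.0, 150.0])
sample_weights = inverse_frequency_weights(y_train, n_bins=3)
```

In Keras, an array like this can be passed through the sample_weight argument of model.fit, which avoids computing the weights inside the loss function.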

Thank you!

Thanks for your prompt response. I tried the weighted loss function as follows:

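import tensorflow as tf

def get_weight(labels):
    # Vectorized per-sample weights: 2.0 for 16-45 mmHg, 1.5 for 110-276 mmHg
    low = tf.logical_and(labels >= 16.0, labels <= 45.0)
    high = tf.logical_and(labels >= 110.0, labels <= 276.0)
    # Assign the default weight of 1.0 to all other samples
    return tf.where(low, 2.0, tf.where(high, 1.5, 1.0))

def weighted_loss(y_true, y_pred):
    y_true = tf.cast(y_true, y_pred.dtype)
    weights = tf.cast(get_weight(y_true), y_pred.dtype)
    # Element-wise absolute error, so each sample keeps its own weight
    abs_error = tf.abs(y_true - y_pred)
    return tf.reduce_mean(abs_error * weights)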

But that did not work for me.
Can you please explain upsampling the minority DBP range a little for me?


Please refer to the SMOTE upsampling technique, which may help you.
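SMOTE was designed for classification, where each synthetic sample simply inherits the class of its neighbours, so it does not carry over directly to a continuous DBP target. The sketch below is a hand-written SMOTE-style variant for regression, not a library API: it interpolates both the features and the label between a minority sample and one of its nearest minority neighbours. The function name smote_like_regression, the 110 mmHg threshold and the toy data are hypothetical, and it assumes X holds fixed-length feature vectors for which linear interpolation is meaningful.

```python
import numpy as np

def smote_like_regression(X, y, minority_mask, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between minority
    samples and their nearest minority neighbours (features and labels)."""
    rng = np.random.default_rng(seed)
    X_min, y_min = X[minority_mask], y[minority_mask]
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, len(X_min) - 1)
    # Pairwise Euclidean distances between minority samples
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    nb = neighbours[base, rng.integers(0, k, size=n_new)]
    lam = rng.random(n_new)  # interpolation factor in [0, 1)
    X_new = X_min[base] + lam[:, None] * (X_min[nb] - X_min[base])
    y_new = y_min[base] + lam * (y_min[nb] - y_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, y_new])

# Toy data (hypothetical): three mid-range and three high DBP labels, in mmHg
y = np.array([60.0, 70.0, 80.0, 120.0, 130.0, 150.0])
X = np.array([[0.1, 1.0], [0.2, 1.1], [0.3, 1.2],
              [0.8, 2.0], [0.9, 2.2], [1.0, 2.5]])
X_aug, y_aug = smote_like_regression(X, y, minority_mask=(y >= 110), n_new=4)
```

It is probably safer to run this separately for the low and the high range, since interpolating between a very low and a very high sample would only produce more mid-range labels.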

Thank you!