How to improve the accuracy of classification models?

asiddiq · June 23, 2021, 11:11am

Hi,
I have applied several classification methods. Unfortunately, the resulting models never exceed 62% accuracy.

Here I have attached a comparison table of the developed models.
I’m wondering how I can improve the models’ accuracy.

Bhack · June 23, 2021, 4:56pm

The first question I would ask is:

What is your performance on the training set?

asiddiq · June 23, 2021, 11:16pm

I have split the data into training and testing sets, and the confusion matrix gives these results. I’m not sure if this is the right approach.

from sklearn.metrics import classification_report, confusion_matrix
from sklearn.tree import DecisionTreeClassifier

# Fit a decision tree with default hyperparameters on the training split.
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)

# Predictions on the training and test splits.
pred_dt_tr = dt.predict(X_train)
pred_dt = dt.predict(X_test)

# Evaluate on the test split only.
print(confusion_matrix(y_test, pred_dt))
print(classification_report(y_test, pred_dt))

Bhack · June 23, 2021, 11:28pm

What I meant is this: I assume the table shows results on the test set.

So what is the models’ performance on the training set?

This could be a useful starting point to understand whether:

You still have room to learn from the current data
You have generalization issues or your model is overfitting
Your model capacity is limited
Hyperparameter tuning on a validation set is missing
Etc.
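
The overfitting and hyperparameter points can be sketched with a cross-validated grid search in scikit-learn. This is only an illustration under stated assumptions: the synthetic data from make_classification, the parameter grid, and the random_state values are placeholders, not settings taken from this thread.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): replace with your own X and y.
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical grid: shallower trees and larger leaves limit model capacity,
# which usually reduces overfitting.
param_grid = {"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 20]}

# 5-fold cross-validation on the training split plays the role of the
# validation set; the test split is not touched during the search.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
cv_acc = search.best_score_
test_acc = search.score(X_test, y_test)  # accuracy of the refitted best model

print("best parameters:", best_params)
print(f"cross-validated accuracy: {cv_acc:.3f}")
print(f"test accuracy: {test_acc:.3f}")
```

If the cross-validated accuracy stays low for every setting in the grid, the bottleneck is more likely the data or the features than the hyperparameters.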

asiddiq · June 23, 2021, 11:30pm

Thank you for replying.
How can I check the training performance?

asiddiq · June 23, 2021, 11:43pm

I just checked the training and testing accuracy for KNN. The training accuracy is 0.996 and the testing accuracy is 0.722.

Bhack · June 23, 2021, 11:44pm

Using pred_dt_tr and y_train
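
A minimal sketch of that comparison, reusing the variable names from the decision tree code earlier in the thread. The synthetic dataset and the random_state values are assumptions standing in for the real data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): replace with your own X and y.
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

dt = DecisionTreeClassifier(random_state=0)
dt.fit(X_train, y_train)

pred_dt_tr = dt.predict(X_train)  # predictions on data the model has seen
pred_dt = dt.predict(X_test)      # predictions on held-out data

train_acc = accuracy_score(y_train, pred_dt_tr)
test_acc = accuracy_score(y_test, pred_dt)

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

A training score far above the test score points to overfitting. Two similarly low scores point to limited model capacity or uninformative features.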

This forum is generally about TensorFlow, but you are using sklearn, so I suggest you use the sklearn support channels for sklearn code/projects.

I don’t know your specific learning goal and dataset but in TF you can try to explore:

asiddiq · June 23, 2021, 11:47pm

Sorry, I apologize if I posted something unrelated to TensorFlow.

Thank you for replying to me

Bhack · June 24, 2021, 12:05am

No problem. Let us know if you have other questions while experimenting with TF.

Anel_Music · June 24, 2021, 12:29am

It is almost impossible to suggest anything without additional information.

How many classes are you trying to predict?
How many training images do you have?
Does your dataset suffer from data imbalance?
Have you already checked your data? (e.g. if it is labeled correctly)

asiddiq · June 24, 2021, 1:30am

I’m trying to predict two classes: confirmed and suspected cases.
It is numerical data, not images.
How can I check for data balance?
The data is labeled.
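
For the class balance question, one simple check is to count how many samples carry each label. A minimal sketch, assuming the labels are available as an array; the values below are made-up placeholders, with 0 standing for suspected and 1 for confirmed.

```python
import numpy as np

# Placeholder labels (assumption): 0 = suspected, 1 = confirmed.
# Replace with your own label column, e.g. y_train.
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])

# Count the samples per class and convert the counts to proportions.
classes, counts = np.unique(y, return_counts=True)
proportions = counts / counts.sum()

for cls, count, frac in zip(classes, counts, proportions):
    print(f"class {cls}: {count} samples ({frac:.0%})")
```

If one class dominates, accuracy alone can be misleading, and the per-class precision and recall from classification_report are more informative. Passing stratify=y to train_test_split keeps the class ratio the same in both splits.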

Sergii_Kavun · June 24, 2021, 10:09am

It would be great if you could show two plots, accuracy and loss, as a picture; they can be drawn in one figure. That would answer some of the questions that have come up.
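
Scikit-learn models such as decision trees or KNN are not trained in epochs, so they have no per-epoch accuracy and loss history. A rough analogue, swapped in here as an assumption rather than the curves requested above, is a learning curve from sklearn.model_selection.learning_curve: training and validation accuracy as a function of training set size. The synthetic data, max_depth=4, and the random_state values are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (assumption): replace with your own X and y.
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.2, random_state=0)

# Fit the model on growing fractions of the data, scoring each fit with
# 5-fold cross-validation.
train_sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(max_depth=4, random_state=0),
    X,
    y,
    train_sizes=np.linspace(0.2, 1.0, 5),
    cv=5,
    scoring="accuracy",
)

# Average over the folds to get one point per training set size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"{int(n):4d} samples  train={tr:.3f}  validation={va:.3f}")
```

The two mean arrays can be passed to any plotting library to draw both curves in one figure. A persistent gap between them suggests overfitting, while two low curves that have flattened out suggest that more data of the same kind is unlikely to help.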

asiddiq · June 25, 2021, 10:47am

Unfortunately, I only computed the confusion matrix, so I didn’t apply anything else.