Model for sentiment analysis in Nigerian (and other) languages

Callum_Matthews · September 13, 2021, 5:02pm

Hi. I’m very new to this forum and even newer to TensorFlow, and I have a quick question that I hope someone can help me with.

Basically, I’m working with a set of languages (Hausa, Yoruba and Igbo) that, unless I’ve missed something, do not have a reliable sentiment analysis model for processing text. What I want is to create a custom model for each of these languages that scores a sentence and returns its sentiment as accurately as possible.

I’m not sure how to approach this. As a first attempt, I took a training dataset of texts with human-scored sentiment labels, vectorized the text, and built a model with an Embedding layer. The accuracy wasn’t great, and I don’t know whether this is the right approach to keep pursuing. Selecting the right hyperparameters seems like a separate job in itself.
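For reference, a minimal sketch of that kind of first attempt might look like the following. It assumes TensorFlow 2.6 or later (where `TextVectorization` is available under `tf.keras.layers`). The toy sentences, the labels, the vocabulary size, and the layer sizes are all placeholders for illustration, not the actual data or settings.

```python
import numpy as np
import tensorflow as tf

# Placeholder data: stand-ins for human-scored sentences
# (1 = positive, 0 = negative). Real data would be Hausa, Yoruba or Igbo text.
texts = ["good good", "bad bad", "good nice", "bad awful"]
labels = np.array([1, 0, 1, 0], dtype="float32")

# Vectorize: map each raw string to a fixed-length sequence of integer token ids.
# max_tokens and output_sequence_length are assumed values.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=1000, output_sequence_length=8
)
vectorizer.adapt(texts)
x = vectorizer(tf.constant(texts)).numpy()  # shape (4, 8)

# Small classifier: Embedding -> average over tokens -> sigmoid score.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, labels, epochs=5, verbose=0)

# One sentiment score in [0, 1] per input sentence.
probs = model.predict(x, verbose=0)
```

With a real dataset, the labelled examples would be split into training and validation sets so that the reported accuracy reflects unseen text.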

Can anyone recommend how to approach this? Is there any documentation on how to build a sentiment analysis model for these languages (or any non-English language)?

Any help would be appreciated. Thanks.

Bhack · September 13, 2021, 5:53pm

Hi, welcome to the forum. Is this for

https://lacunafund.org/language-2020-awards/

Callum_Matthews · September 13, 2021, 5:57pm

Hello.

No, this is an internal company project (for now). We are trying to perform sentiment analysis on Nigerian languages without translating to English, but we haven’t found anything of note.

Bhack · September 13, 2021, 6:11pm

I suggest you contact this group:

It would be really nice if you contributed a dataset for these low-resource languages to our datasets collection at:

More generally, you could start by exploring something like: