Conversational Sentiment Analysis on Audio Data | by Avi Chawla | Jul, 2022
Analyzing sentiment in Speech
Sentiment Analysis, also known as opinion mining, is a popular task in Natural Language Processing (NLP) due to its diverse industrial applications. When applied to textual data, the primary objective is to train a model that can classify a given piece of text into different sentiment classes. A high-level overview of a sentiment classifier is shown in the image below.
For instance, the classes for a three-class classification problem can be Positive, Negative, and Neutral. An example of the three-class sentiment analysis problem is the popular Twitter Sentiment Analysis dataset, an entity-level sentiment analysis task on multilingual tweets posted by various users on Twitter.
While most prior research and development in NLP has focused on applying sentiment analysis to text, in recent times we have seen massive adoption of speech-based interaction tools among users, steering researchers and organizations toward building sentiment classifiers in the speech space.
Therefore, this post will demonstrate how to build a sentiment analysis system on conversational data using the AssemblyAI API and Python. The end-to-end system holds extensive applicability in areas involving rigorous customer support and feedback evaluation — making it an important and valuable problem to solve, especially in the speech domain. Towards the end, I’ll also demonstrate an extensive analysis to enhance the interpretability of the obtained results and draw appropriate insights from the data.
You can find the code for this article here. Moreover, the highlight of the article is as follows:
Sentiment Analysis on Conversational Audio Data
Sentiment Analysis Results
Sentiment Analysis Insights
In this section, I am going to demonstrate the use of the AssemblyAI API to classify individual sentences in a pre-recorded voice conversation into three sentiment classes: Positive, Negative, and Neutral.
Step 1: Installing Requirements
There are very few requirements to build the sentiment classifier. In terms of Python libraries, we only need the requests package, which can be installed as follows:
pip install requests
Step 2: Generating your API Token
The next step is to create an account on the AssemblyAI website, which you can do for free. Once done, you will get your private API access key, which we will use to access the speech-to-text models.
Step 3: Uploading Audio File
For the purpose of this tutorial, I’ll use a pre-recorded audio conversation between two people to perform sentiment analysis on. Once you have obtained the API Key, you can proceed with the sentiment classification task on the pre-recorded audio file.
However, before doing that, you will need to upload the audio file so that it can be accessed via a URL. Options include an AWS S3 bucket, audio hosting services like SoundCloud, AssemblyAI’s own hosting service, etc. I have uploaded the audio file to SoundCloud, which can be accessed below.
If you wish to upload the audio file directly to AssemblyAI’s hosting services, you can do that too. I have demonstrated this step-by-step procedure in the code blocks below.
Step 3.1: Import requirements
We start with importing the requirements for our project.
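As a minimal sketch, the imports boil down to the requests package, plus the standard-library time module that we will use later for polling:

```python
import time      # used later to pause between polling requests
import requests  # the only third-party dependency
```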
Step 3.2: Specify file location and API_Key
Next, we need to specify the location of the audio file on our local machine and the API key obtained after signing up.
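For illustration, this might look like the snippet below; both the file path and the key are placeholder values, not real ones:

```python
# Placeholder values; substitute your own local path and key.
filename = "audio/conversation.mp3"
API_Key = "<your-assemblyai-api-key>"
```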
Step 3.3: Specify Upload Endpoint
endpoint: This specifies the service to be invoked, which in this case is the “upload” service.
headers: This holds the API key and the content-type.
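Putting those two arguments into code might look like this; the endpoint URL follows AssemblyAI’s v2 API, and the key is a placeholder:

```python
API_Key = "<your-assemblyai-api-key>"  # placeholder key
upload_endpoint = "https://api.assemblyai.com/v2/upload"
headers = {"authorization": API_Key, "content-type": "application/json"}
```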
Step 3.4: Define the upload function
Audio files can only be uploaded up to a limit of 5 MB (5,242,880 bytes) at once. Therefore, we need to upload the data in chunks, which are then merged back together at the service endpoint. Hence, you don’t need to worry about handling numerous URLs.
Step 3.5: Upload
The last step is to invoke the POST request. The response is a JSON object that holds the upload_url of the audio file. I will use this URL in the next steps to execute the sentiment classification on the audio.
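A sketch of the upload call, repeating the chunked reader from the previous step for completeness (the key is a placeholder); requests streams the generator as the request body:

```python
import requests

upload_endpoint = "https://api.assemblyai.com/v2/upload"
headers = {"authorization": "<your-assemblyai-api-key>"}  # placeholder key

def read_file(filename, chunk_size=5_242_880):
    # Yield the audio file in chunks of at most 5 MB.
    with open(filename, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield data

def upload_audio(filename):
    # `requests` streams the generator as the request body; the JSON
    # response carries the `upload_url` of the hosted file.
    response = requests.post(upload_endpoint, headers=headers,
                             data=read_file(filename))
    return response.json()["upload_url"]
```

Here `upload_audio` is an illustrative helper name, not something prescribed by the API.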
Step 4: Sentiment Analysis
At this point, we have fulfilled all the necessary prerequisites to perform sentiment analysis on the audio file. Now, we can proceed with calling the API to fetch the desired results. This is a two-step process, demonstrated in the subsections below.
Step 4.1: Submitting Files for Transcription
The first step is to invoke an HTTP POST request. This essentially sends your audio file to the AI models running in the background for transcription and instructs them to perform sentiment analysis on the transcribed text.
The arguments passed to the POST request are:
endpoint: This specifies the transcription service to be invoked.
json: This contains the URL to your audio file under the audio_url key. As we wish to perform sentiment analysis on conversational data, the sentiment_analysis and speaker_labels flags are set to True.
headers: This holds the authorization key and the content-type.
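These arguments can be sketched as follows; the endpoint follows AssemblyAI’s v2 API, the key is a placeholder, and submit_transcription is an illustrative helper name:

```python
import requests

transcript_endpoint = "https://api.assemblyai.com/v2/transcript"
headers = {"authorization": "<your-assemblyai-api-key>",  # placeholder key
           "content-type": "application/json"}

def submit_transcription(upload_url):
    # Request transcription along with sentence-level sentiment
    # analysis and speaker labels.
    payload = {
        "audio_url": upload_url,
        "sentiment_analysis": True,
        "speaker_labels": True,
    }
    return requests.post(transcript_endpoint, json=payload,
                         headers=headers).json()
```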
The current status of the POST request, as received in the JSON response, is queued. This indicates that the audio has not finished transcribing yet. Moreover, the sentiment_analysis flag is also True in the JSON response. However, the value of the sentiment_analysis_results key is None while the status is queued.
Step 4.2: Fetching the Transcription Result
To check the status of our POST request, we need to make a GET request using the id key from the JSON response received above. Next, we can proceed with the GET request, as shown in the code block below.
The arguments passed to the GET request are:
endpoint: This specifies the service invoked and the API call identifier, determined using the id key.
headers: This holds your unique API key.
Here, you should know that the transcription result won’t be ready until the status key changes to completed. The time transcription takes depends on the length of your input audio file. Therefore, you must make repeated GET requests at regular intervals to check the transcription status. A simple way of doing this is implemented below:
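One possible sketch of the polling loop, with get_transcription and wait_for_completion as illustrative helper names (the key is a placeholder):

```python
import time
import requests

headers = {"authorization": "<your-assemblyai-api-key>"}  # placeholder key

def get_transcription(transcript_id):
    # One status check against the transcript endpoint for this job id.
    endpoint = f"https://api.assemblyai.com/v2/transcript/{transcript_id}"
    return requests.get(endpoint, headers=headers).json()

def wait_for_completion(transcript_id, poll_interval=5):
    # Repeat the GET at regular intervals until the job finishes
    # (status "completed") or fails (status "error").
    while True:
        result = get_transcription(transcript_id)
        if result["status"] in ("completed", "error"):
            return result
        time.sleep(poll_interval)
```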
Once the status changes to completed, you will receive a response similar to the one shown below.
- The status in the JSON response is marked as completed. This indicates that there were no errors in transcribing the audio.
- The text key contains the entire transcription of the input audio conversation, which includes 22 sentences.
- As the audio file is composed of multiple speakers, all speaker keys within the words key are not null. The speaker key is either “A” or “B.”
- We can see a confidence score for each individual word and for the entire transcription text. The score ranges from 0 to 1, with 0 being the lowest and 1 being the highest.
- The results for sentiment analysis on each of the 22 individual sentences in the audio are accessible using the sentiment_analysis_results key of the JSON response.
- Corresponding to each sentence, we get a confidence score similar to that in point 4 above.
- The sentiment of each sentence can be retrieved using the sentiment key of a sentence’s dictionary. The sentiment analysis result for the second sentence is shown below:
JSONs are usually hard to read and interpret. Therefore, to make the data visually appealing and to conduct further analysis, let’s convert the sentiment analysis results above to a DataFrame. We will store the text, the duration of the sentence, its speaker, and the sentiment of the sentence. This is implemented below:
The DataFrame generated with the code snippet above is shown in the image below. Here, we have the 22 sentences that were spoken during the conversation along with the corresponding speaker labels (“A” and “B”), their duration in seconds, and the sentiment of the sentence as predicted by the model.
#1 Speaker distribution
The number of sentences spoken by each speaker can be calculated using the value_counts() method, as shown below:
To view the percentage distribution of the speakers, we can pass normalize = True to the value_counts() method as follows:
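Both calls can be sketched on a toy DataFrame:

```python
import pandas as pd

# Toy speaker column: two sentences each from "A" and "B".
df = pd.DataFrame({"speaker": ["A", "B", "A", "B"]})

counts = df["speaker"].value_counts()                # absolute counts
shares = df["speaker"].value_counts(normalize=True)  # fractional shares
```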
Both the speakers “A” and “B” contributed equally to the conversation in terms of the number of sentences.
#2 Speaker Duration distribution
Next, let’s compute the individual contribution of each of the speakers in the conversation. This is shown below:
We use the groupby() method and compute the total duration of each speaker’s speech. Speaker A is the dominant speaker in terms of duration.
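A minimal sketch on toy data:

```python
import pandas as pd

# Toy data: per-sentence durations (in seconds) with speaker labels.
df = pd.DataFrame({"speaker": ["A", "B", "A"],
                   "duration": [4.0, 2.5, 3.5]})

# Total speaking time per speaker.
total = df.groupby("speaker")["duration"].sum()
```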
#3 Sentiment Distribution
Out of the 22 sentences spoken during the conversation, only three were tagged with negative sentiment. Moreover, none of the sentences were predicted to carry positive sentiment.
The normalized distribution can be calculated as follows:
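On a toy column mirroring the article’s counts (19 neutral sentences, 3 negative, 22 total), the counts and the normalized distribution look like this:

```python
import pandas as pd

# Toy sentiment column mirroring the article's counts:
# 19 neutral sentences, 3 negative, 0 positive (22 total).
df = pd.DataFrame({"sentiment": ["NEUTRAL"] * 19 + ["NEGATIVE"] * 3})

# Raw counts per class, then the normalized (fractional) distribution.
counts = df["sentiment"].value_counts()
dist = df["sentiment"].value_counts(normalize=True)
```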
#4 Sentiment Distribution on Speaker-level
Finally, let’s compute the distribution of sentiment across individual speakers. Here, instead of using the groupby() method, we will use crosstab() for better visualization. This is demonstrated below:
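A sketch of the cross-tabulation on toy data; normalize="index" turns each speaker’s row into fractional shares:

```python
import pandas as pd

# Toy data: two sentences per speaker with sentiment labels.
df = pd.DataFrame({
    "speaker": ["A", "A", "B", "B"],
    "sentiment": ["NEGATIVE", "NEUTRAL", "NEUTRAL", "NEUTRAL"],
})

# Row-normalized cross-tabulation: sentiment shares within each speaker.
table = pd.crosstab(df["speaker"], df["sentiment"], normalize="index")
```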
The fraction of negative sentences spoken by Speaker “A” was higher than that of Speaker “B”.
#5 Average Sentence Duration on Sentiment-level
Lastly, we shall compute the average duration of the sentences belonging to each sentiment class. This is implemented below using the groupby() method:
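A minimal sketch on toy data:

```python
import pandas as pd

# Toy data: sentence durations (seconds) with their predicted sentiment.
df = pd.DataFrame({"sentiment": ["NEGATIVE", "NEUTRAL", "NEUTRAL"],
                   "duration": [1.0, 3.0, 5.0]})

# Mean sentence duration per sentiment class.
avg = df.groupby("sentiment")["duration"].mean()
```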
The average duration of negative sentences is smaller than that of neutral sentences.
To conclude, in this post, we discussed a particular NLP use case of the AssemblyAI API. Specifically, we saw how to build a sentiment classification module on a pre-recorded audio file comprising multiple speakers. Finally, we did an extensive analysis on the sentiment analysis results. The obtained results from the API highlighted the sentiments of the 22 individual sentences in the input audio file.
You can find the code for this article here.
In the upcoming posts, I will discuss more use-cases of the AssemblyAI API, such as Entity Detection, Content Moderation, and more, from both the technical and practical perspectives.
See you next time. Thanks for reading.