In today’s data-driven world, user feedback plays a pivotal role in shaping businesses and services. Feedback, whether positive, negative, or neutral, provides businesses with critical insights into customer satisfaction, product quality, and areas for improvement. It serves as the voice of the customer, helping organizations tailor their offerings, improve service delivery, and stay competitive in an ever-changing market landscape.
Sentiment analysis takes feedback a step further by determining the emotional tone behind words. Whether it’s a delighted customer praising a product or a frustrated user pointing out an issue, sentiment analysis provides a clear, quantifiable measure of customer emotions. This insight helps businesses gauge their overall customer experience, identify trends, and prioritize actions to address concerns.
Amazon Comprehend, a powerful natural language processing (NLP) service by AWS, simplifies the implementation of sentiment analysis. It uses machine learning to uncover valuable information in text, such as sentiment, key phrases, and entities, without requiring deep technical expertise in AI. With Amazon Comprehend, businesses can quickly process and analyze large volumes of feedback data with high accuracy.
Automation ensures that the process of analyzing user feedback is seamless, efficient, and scalable. By automating the sentiment analysis pipeline, businesses can save time and focus on acting upon the results rather than manually analyzing data.
In this blog, we’ll demonstrate how to combine these concepts into a practical use case: automated sentiment analysis of Amazon food customer review data. By leveraging AWS Lambda, Amazon Comprehend, and S3, we’ll build a pipeline that processes customer reviews, analyzes sentiment, and stores the results for further analysis. This solution offers businesses an efficient way to turn raw feedback into actionable insights. Let’s get started!
Architecture Overview
Here’s the architecture of our use case:
- Input Storage: Customer feedback is uploaded as CSV files to the input/ folder of an Amazon S3 bucket.
- Lambda Trigger: The S3 bucket is configured to trigger an AWS Lambda function whenever a new CSV file is uploaded.
- Sentiment Analysis: The Lambda function reads the feedback data, analyzes the sentiment using Amazon Comprehend, and organizes the results.
- Output Storage: The analyzed results are saved as a JSON file in the output/ folder of the same S3 bucket for further use.
Prerequisites
Before you begin, you need:
- An AWS account with administrative privileges
- A Python runtime environment
- The AWS Command Line Interface (CLI) installed and configured
Step 1: Set Up S3 to Store User Reviews
- Create an S3 Bucket:
- Go to the AWS Management Console and search for S3
- Click Create Bucket.
- Give the bucket a name ‘users-feedback-analysis-bucket’.
- Uncheck “Block all public access” only if you need external access (optional).
- If you unchecked it, also check the “I acknowledge that the current settings might result in this bucket and the objects within becoming public” option.
- Leave other settings as default and click Create bucket.
- Create an Input Folder:
- Inside the S3 bucket, create a folder named input/ to store raw CSV files containing feedback.
- This is where you will upload the raw feedback CSV files.
- Create an Output Folder:
- Create another folder named output/.
- The analyzed sentiment results will be saved here.
- Upload Sample Data:
- Use the Upload button to upload your .csv feedback files into the input/ folder.
Step 2: Create IAM Role for Lambda
Before writing the code, create an IAM role for the Lambda function with the following permissions:
- Go to IAM from the AWS Console
- From the navigation panel on the left select Roles
- Click ‘Create Role’ button.
- Select Lambda as the trusted entity.
- Search and attach the following policies:
- AmazonS3FullAccess (Allows the Lambda function to read/write to the S3 bucket)
- ComprehendFullAccess (Allows the Lambda function to access Amazon Comprehend)
- AWSLambdaBasicExecutionRole (Allows basic Lambda Execution)
- I will name the role ‘lambda-comprehend-sentiment-analysis’ and save.
Screenshot of IAM Role Creation
Step 3: Create Lambda Function
Now, create the Lambda function.
- From the AWS Console select AWS Lambda
- Click ‘Create Function’.
- Select ‘Author from scratch’.
- I will name our Lambda function ‘SentimentAnalysisLambda’.
- Select Runtime: Python 3.9 or newer.
- In the ‘Change default execution role’ section, select the ‘Use an existing role’ option and attach the ‘lambda-comprehend-sentiment-analysis’ role we created in Step 2.
Screenshot of Lambda function creation
- Next, write a Python function that fetches feedback from the S3 bucket, calls Amazon Comprehend, and saves the output back to S3:
Here is the Python code:
import boto3
import csv
import json

s3 = boto3.client("s3")
comprehend = boto3.client("comprehend")

def analyze_sentiment(text):
    try:
        response = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        sentiment = response.get("Sentiment", "UNKNOWN")
        return sentiment
    except Exception as e:
        raise RuntimeError(f"Error in sentiment analysis: {str(e)}")

def lambda_handler(event, context):
    try:
        records = event.get("Records", [])
        if not records:
            raise ValueError("No records found in the event.")

        s3_event = records[0]
        bucket_name = s3_event["s3"]["bucket"]["name"]
        object_key = s3_event["s3"]["object"]["key"]

        response = s3.get_object(Bucket=bucket_name, Key=object_key)
        csv_content = response["Body"].read().decode("utf-8")
        csv_reader = csv.DictReader(csv_content.splitlines())

        analyzed_feedback = []
        for row in csv_reader:
            feedback_text = row.get("feedback", "")
            sentiment = analyze_sentiment(feedback_text)
            analyzed_feedback.append({"feedback": feedback_text, "sentiment": sentiment})

        output_key = f"output/{object_key.split('/')[-1].replace('.csv', '_analyzed.json')}"
        s3.put_object(
            Bucket=bucket_name,
            Key=output_key,
            Body=json.dumps(analyzed_feedback, indent=4),
            ContentType="application/json"
        )

        return {
            "statusCode": 200,
            "body": {
                "message": f"Analyzed feedback saved to s3://{bucket_name}/{output_key}",
                "output_key": output_key
            },
        }
    except Exception as e:
        return {
            "statusCode": 500,
            "body": f"Error: {str(e)}"
        }
Let’s break down the key code blocks:
Imports:
import boto3
import csv
import json
Here, the code imports the boto3 library, which is used to interact with AWS services, along with the csv and json modules, standard Python libraries used to handle CSV files and JSON data formats.
Initialize a client for Amazon S3:
s3 = boto3.client("s3")
This line initializes a client for Amazon S3 using the boto3 library. Amazon S3 (Simple Storage Service) is a cloud storage service used to store and retrieve files. The client enables the Lambda function to interact with S3, such as downloading files, uploading processed data, or listing objects in a bucket.
Initialize a client for Amazon Comprehend:
comprehend = boto3.client("comprehend")
This initializes a client for Amazon Comprehend, which is AWS’s natural language processing (NLP) service. The ‘comprehend’ client is used to call various Comprehend APIs, such as detecting sentiment, key phrases, or entities in a given text. In this function, it allows the program to analyze the sentiment of user feedback.
Sentiment Analysis Function:
def analyze_sentiment(text):
    try:
        response = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        sentiment = response.get("Sentiment", "UNKNOWN")
        return sentiment
    except Exception as e:
        raise RuntimeError(f"Error in sentiment analysis: {str(e)}")
The ‘analyze_sentiment’ function is a vital part of this application, designed to determine the sentiment of a given piece of text using Amazon Comprehend’s powerful NLP capabilities.
response = comprehend.detect_sentiment(Text=text, LanguageCode="en"): This line sends the provided text to Amazon Comprehend for sentiment analysis, specifying the language as English (en).
sentiment = response.get("Sentiment", "UNKNOWN"): Retrieves the detected sentiment from the response. If the sentiment key is not found, it defaults to "UNKNOWN".
return sentiment: Returns the detected sentiment: "POSITIVE", "NEGATIVE", "NEUTRAL", or "MIXED".
Exception Handling: If an error occurs during the API call or response parsing, the function raises a RuntimeError with a clear error message.
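To see the .get() pattern in isolation, the snippet below uses a stand-in dictionary that mimics the fields a detect_sentiment response carries; the score values are invented for illustration, and a real response also includes metadata omitted here.

```python
# A stand-in for a real detect_sentiment response (scores are invented)
fake_response = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.97, "Negative": 0.01, "Neutral": 0.01, "Mixed": 0.01},
}

# The same lookup pattern the function uses, with a safe default
sentiment = fake_response.get("Sentiment", "UNKNOWN")
print(sentiment)  # POSITIVE
```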
Explanation of the lambda_handler Function:
def lambda_handler(event, context):
This defines the entry point of the AWS Lambda function. The name follows the required AWS Lambda convention, and AWS automatically invokes this function when the Lambda function is triggered by an event.
The handler takes two parameters. The ‘event’ parameter contains all the information about the triggering event; in this project, it holds details about the uploaded file in S3, such as the bucket name and object key. The event object is typically in JSON format and varies depending on the service that triggers the Lambda function (e.g., S3). The ‘context’ parameter provides runtime information about the invocation and is not used here.
records = event.get("Records", [])
if not records:
    raise ValueError("No records found in the event.")
This part extracts the ‘Records’ key from the event object generated by the S3 file upload. If no records are found, an exception is raised because there is no file to process. This ensures the function handles invalid or incomplete event data gracefully.
s3_event = records[0]
bucket_name = s3_event["s3"]["bucket"]["name"]
object_key = s3_event["s3"]["object"]["key"]
These lines extract the S3 bucket name (bucket_name) and file key (object_key) from the first record in the Records list. These values identify the location of the uploaded file in S3. It assumes that only one file is uploaded at a time, simplifying the logic.
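For reference, here is a trimmed-down sketch of the notification S3 delivers to Lambda and the same extraction logic run against it. Real events carry many more fields; the bucket and key names below match this walkthrough but are otherwise hypothetical.

```python
# Minimal shape of an S3 "object created" notification event
event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "users-feedback-analysis-bucket"},
                "object": {"key": "input/reviews.csv"},
            }
        }
    ]
}

# Same extraction the handler performs on the first record
s3_event = event["Records"][0]
bucket_name = s3_event["s3"]["bucket"]["name"]
object_key = s3_event["s3"]["object"]["key"]
print(bucket_name, object_key)
```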
response = s3.get_object(Bucket=bucket_name, Key=object_key)
csv_content = response["Body"].read().decode("utf-8")
csv_reader = csv.DictReader(csv_content.splitlines())
Here, the function retrieves the CSV file from S3 using the bucket name and object key. The file content is read and decoded as a UTF-8 string. Then, it uses Python’s ‘csv.DictReader’ to parse the CSV file into a dictionary-like structure, making it easy to access individual rows by column names.
analyzed_feedback = []
for row in csv_reader:
    feedback_text = row.get("feedback", "")
    sentiment = analyze_sentiment(feedback_text)
    analyzed_feedback.append({"feedback": feedback_text, "sentiment": sentiment})
This loop processes each row in the CSV file:
- It retrieves the feedback text from the ‘feedback’ column. If the column is missing or empty, it defaults to an empty string.
- The feedback text is passed to the ‘analyze_sentiment’ function, which uses Amazon Comprehend to determine the sentiment (e.g., POSITIVE, NEGATIVE, NEUTRAL).
- The function appends the feedback text and its detected sentiment to the ‘analyzed_feedback’ list, preparing the data for storage.
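This loop can be exercised locally without AWS credentials by stubbing out the Comprehend call. The analyze_sentiment_stub below is a hypothetical keyword-based placeholder, not real sentiment detection; the CSV content is sample data.

```python
import csv

def analyze_sentiment_stub(text):
    # Hypothetical stand-in for the Comprehend-backed analyze_sentiment
    return "POSITIVE" if "great" in text.lower() else "NEUTRAL"

csv_content = "feedback\nGreat service!\nIt was okay."
csv_reader = csv.DictReader(csv_content.splitlines())

# Same accumulation pattern as the handler
analyzed_feedback = []
for row in csv_reader:
    feedback_text = row.get("feedback", "")
    sentiment = analyze_sentiment_stub(feedback_text)
    analyzed_feedback.append({"feedback": feedback_text, "sentiment": sentiment})
print(analyzed_feedback)
```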
output_key = f"output/{object_key.split('/')[-1].replace('.csv', '_analyzed.json')}"
s3.put_object(
    Bucket=bucket_name,
    Key=output_key,
    Body=json.dumps(analyzed_feedback, indent=4),
    ContentType="application/json"
)
This code snippet saves the analyzed sentiment results to a new JSON file in the ‘output’ folder of the same S3 bucket. The ‘output_key’ ensures the new file has a name based on the original file name, with the .csv extension replaced by ‘_analyzed.json’. The data is formatted as JSON and uploaded to S3 using the ‘put_object’ method.
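The key transformation is pure string manipulation, so it can be checked in isolation. The object key below is a hypothetical example:

```python
object_key = "input/reviews.csv"  # hypothetical uploaded file

# Same transformation the handler applies: keep only the base file name,
# swap the .csv extension for _analyzed.json, and place it under output/
output_key = f"output/{object_key.split('/')[-1].replace('.csv', '_analyzed.json')}"
print(output_key)  # output/reviews_analyzed.json
```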
return {
    "statusCode": 200,
    "body": {
        "message": f"Analyzed feedback saved to s3://{bucket_name}/{output_key}",
        "output_key": output_key
    },
}
If all steps succeed, the function returns a success response containing the HTTP status code 200 and a message indicating where the analyzed feedback was saved in S3.
except Exception as e:
    return {
        "statusCode": 500,
        "body": f"Error: {str(e)}"
    }
This is the error-handling block. If any exception occurs during execution, the function catches it and returns an HTTP 500 status code with a detailed error message, allowing for easier debugging and monitoring.
Step 4: Attach S3 Trigger to Lambda Function
- Under the ‘Function overview’ section of the ‘SentimentAnalysisLambda’ function, click ‘Add trigger’.
- Choose S3 as the trigger source.
- Select your S3 bucket name (in this case, users-feedback-analysis-bucket).
- Under ‘Event type’, select ‘All object create events’.
- Under Prefix, enter input/. (This ensures the trigger only fires for objects in the input/ folder.)
- Leave Suffix blank, or enter .csv to trigger only for CSV files.
- Click Add to save the trigger.
Screenshot of Amazon Lambda trigger creation page
Output:
Screenshot of the created json file in the output folder of users-feedback-analysis-bucket
Screenshot of final JSON output with sentiment analysis results saved in the S3 bucket
Conclusion
By combining the power of AWS Lambda and Amazon Comprehend, this serverless sentiment analysis pipeline provides an efficient and automated way to understand customer feedback. It empowers businesses to act on insights faster and build stronger customer relationships.
This project demonstrates how serverless technologies can simplify AI integration, making advanced capabilities like sentiment analysis accessible to all. Ready to turn customer feedback into actionable insights? Start building today!