How to analyze open-ended questions in 5 steps [template included]

The open-ended question analysis template by Hotjar

To help you learn this technique, we created a data sample that you can download and use to follow along.

Now let’s begin…

Step 1: get your data into the template

1) Export the data from your survey or poll into a .CSV or .XLS file.

Example of survey export from Hotjar as .CSV

2) Copy the data from your .CSV or .XLS file and paste it into the sheet ‘CSV Export’ of the template.

Copy the data you want to analyze from your .CSV or .XLS file
This is what your data should look like after being pasted into the ‘CSV Export’ sheet

3) Copy the column from the ‘CSV Export’ sheet containing the open-ended question you want to analyze first and paste it into the ‘Question 1’ sheet, in the cell marked with < Paste answers to first open-ended question here >.

Your open-ended answers should look like the above after being pasted into the ‘Question 1’ sheet

4) Choose wrap text for the entire column, so the data fits the column width and is easier for you to read later on.

The 'wrap' function is available from the main menu in Google Sheets
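
If you would rather pull the answers out of the export with a script than copy and paste them, the same step can be sketched in Python. This is a minimal sketch under stated assumptions: the column names and the sample rows are made up for illustration, and skipping blank answers is a choice made here, not something the template requires.

```python
import csv
import io

# Hypothetical export: the column names and answers are made up for illustration.
SAMPLE_EXPORT = """Date,Country,How does your employer measure your performance?
2020-01-06,US,Revenue and sales targets
2020-01-07,DE,
2020-01-08,FR,"Traffic, mostly"
"""

def extract_answers(csv_text, column):
    """Return the non-empty answers from one column of a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Blank answers are skipped here (an assumption, see the lead-in).
    return [row[column].strip() for row in reader if row[column] and row[column].strip()]

answers = extract_answers(SAMPLE_EXPORT, "How does your employer measure your performance?")
print(answers)
```

With a real export, you would read the file with `open(path, newline="")` instead of the in-memory sample string.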

Step 2: identify response categories

A response category is a set of replies that can be grouped because they are part of the same theme, even if they’re worded differently.

In the sample dataset we use for this tutorial, we asked Hotjar customers to explain how their employer measures their performance (e.g., revenue, conversions, traffic). In theory, you could go through every answer one by one to identify your response categories, but that wouldn’t be very efficient. Instead, we’re going to use two techniques that help you identify the broad categories.

A) Use a text analyzer: a text analyzer counts the most commonly used words in your data, which helps you identify broad categories of responses.

Copy and paste your data into Textalyser and click ‘Analyze the text’

If you do this with the sample data we’ve provided above, you’ll find that ‘sales,’ ‘conversion,’ and ‘traffic’ are some of the most commonly used words in the data set:

‘Sales,’ ‘conversion,’ and ‘traffic’ are some of the most commonly used words in the data set and could be used as response categories

As such, they represent some of the most popular replies to the question we asked. They don’t represent all the answers, of course, but they’re a good place to start when building the list of response categories.

Add each category to the top of a separate column (replacing the text that reads, 'Response Category 01,' 'Response Category 02,' etc.):

Add each response category to the top of the sheet in Row 2
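
The kind of word counting a text analyzer performs can be approximated in a few lines of Python. This is a rough sketch, not a description of how Textalyser works internally: the answers are made up, and the stop-word list is an assumption added so that filler words do not dominate the counts.

```python
import re
from collections import Counter

# Made-up answers for illustration; this is not the actual sample data.
responses = [
    "Sales and revenue",
    "Conversion rate",
    "Traffic and sales",
    "Mostly sales",
    "Traffic to the site",
]

# A small, assumed stop-word list; extend it to suit your own data.
STOP_WORDS = {"and", "the", "to", "of", "a", "by", "mostly"}

def word_frequencies(texts):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

counts = word_frequencies(responses)
top_words = counts.most_common(3)
print(top_words)
```

The most frequent words are candidates for response categories, not final categories. You still need to read the answers to confirm what respondents meant.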

B) Sort your responses alphabetically: when you sort alphabetically, you’ll notice that specific patterns emerge, and you can create more categories based on the trends you spot.

In Google Sheets, select the range, right-click, and sort the range alphabetically

In our sample data, every sentence beginning with the word 'Revenue' gets grouped when you sort alphabetically. Of course, we already have a category for 'Sales/Revenue,' so there’s no need to add that category in this case—but grouping the data alphabetically will allow groups to stand out.

Sorting responses alphabetically helps you to uncover themes easily

Alphabetical sorting will also draw your attention to certain stand-alone responses. For example, someone replied 'Huh?' and another person told us they didn’t understand the question. This information allows us to add a new category called 'Didn’t understand the question.'

Alphabetical sorting will also draw your attention to certain stand-alone responses

Scan the alphabetically sorted responses for other categories, such as 'It’s not measured,' 'Traffic,' 'Conversions,' etc. Be on the lookout for synonyms, but don’t worry if you create a few redundant categories for now. You will combine the categories that mean the same thing at the end.
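
To see why alphabetical sorting makes groups stand out, here is a small Python sketch that sorts answers case-insensitively and then groups neighboring rows by their first word. The answers are made up, and grouping by first word is only one simple way to approximate the patterns you would spot by eye in the sheet.

```python
from itertools import groupby

# Made-up answers for illustration.
responses = [
    "revenue per month",
    "Huh?",
    "Traffic",
    "Revenue growth",
    "It's not measured",
    "Revenue",
]

def first_word(text):
    """Lower-cased first word with trailing punctuation stripped (assumes non-empty text)."""
    return text.split()[0].strip(".,!?").lower()

# Case-insensitive alphabetical sort, similar in spirit to sorting a range in a spreadsheet.
sorted_responses = sorted(responses, key=str.casefold)

# groupby only merges adjacent items, which is why the list is sorted first.
groups = {word: list(rows) for word, rows in groupby(sorted_responses, key=first_word)}

for word, rows in groups.items():
    print(f"{word}: {len(rows)} response(s)")
```

Groups with several rows (here, the answers starting with "revenue") point to likely categories, while single rows such as "Huh?" are the stand-alone responses worth a second look.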

Step 3: record the individual responses

1) Place a '1' in each cell where a response (the row) matches a category (the column) to identify a positive response in each category. Add categories as you go.

For example, if you sort our sample data alphabetically, you’ll find that the response in Row 36 reads 'Huh?' If you added 'Didn’t understand the question' to Column E (as we did in the screenshot), place a '1' in E36.

A '1' is placed in Column E, Row 36
Some answers might fall into multiple response categories

When you input your first '1,' the cell in Row 3 (below the category) will change to indicate the number of positive responses in that category. Row 4 will change from a '#DIV/0!' error to the percentage of responses that fall into that category.
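
The arithmetic behind those two rows can be sketched as follows. The template's actual formulas are not shown in this article, so this is an assumption: the percentage here is the category count divided by the number of responses that have at least one '1'. That denominator was chosen because it reproduces the behavior described above, where the percentage is undefined until the first '1' is entered. The category names and marks are made up.

```python
# Each row is one response, each column is a category; 1 marks a match.
# Category names and marks are made up for illustration.
categories = ["Sales", "Traffic", "Didn't understand the question"]
marks = [
    [1, 0, 0],
    [1, 1, 0],  # one answer can fall into several categories
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 0],  # not categorized yet
]

def summarize(categories, marks):
    """Return {category: (count, percentage)} for a table of 0/1 marks."""
    # Assumed denominator: responses with at least one mark (see the lead-in).
    categorized = sum(1 for row in marks if any(row))
    summary = {}
    for col, name in enumerate(categories):
        count = sum(row[col] for row in marks)
        # With nothing categorized there is nothing to divide by, so report None.
        pct = 100 * count / categorized if categorized else None
        summary[name] = (count, pct)
    return summary

summary = summarize(categories, marks)
for name, (count, pct) in summary.items():
    print(name, count, pct)
```

Because one answer can match several categories, the percentages can add up to more than 100.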

2) Use the 'Find' feature to search for words related to each category: begin with the first category (in our example, that’s 'sales') and search the data column for any response that mentions 'sales.' Read the entire response to ensure it fits the category you searched for, then place a '1' in the appropriate column for that response.

Searching for the term 'sales' leads to finding 11 responses

3) Fill in the gaps: read each row that hasn’t been categorized and place a '1' under the appropriate category, creating new categories as necessary. As you create new categories, search your data for those terms to quickly find similar responses.

As you create new categories and fill in the gaps, some interesting trends will start to appear
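
The search-then-fill-the-gaps routine in this step can also be sketched in Python. The answers and search terms below are made up, and the match is a plain case-insensitive substring search, so it only produces candidates: as with the 'Find' feature, you still have to read each hit before marking it.

```python
# Made-up answers and search terms for illustration.
responses = [
    "Sales targets",
    "Traffic and sales",
    "No sales team here, we track signups",
    "Huh?",
]
keywords = {"Sales": ["sales", "revenue"], "Traffic": ["traffic"]}

def find_candidates(responses, terms):
    """Return indexes of responses containing any of the terms (case-insensitive)."""
    return [
        i for i, text in enumerate(responses)
        if any(term in text.lower() for term in terms)
    ]

# Candidate rows per category; each one still needs a human read.
candidates = {name: find_candidates(responses, terms) for name, terms in keywords.items()}

# Rows that no search term reached are the gaps to fill in by hand.
matched = {i for hits in candidates.values() for i in hits}
uncategorized = [i for i in range(len(responses)) if i not in matched]

print(candidates)
print(uncategorized)
```

The third answer mentions "sales" but describes measuring signups, which is the kind of false hit that reading the entire response catches.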

Step 4: organize your categories

1) Group your data: you will almost certainly find responses that belong together but ended up in different categories because respondents used different words to describe the same concept. In our sample data, we found the terms 'Lead Gen' and 'Form Submissions,' and these belong in the same category.

Drag these columns next to each other, and apply a color (any color) to the group of columns you plan to merge—this marks them as a group so you can return to them in a bit when it’s time to combine them. Repeat this step for each set of categories you plan to join.

Column K is dragged next to Column H because both response categories are related

Add a new column to the left-hand side of each group. For example, with 'Lead Gen' and 'Form Submissions,' you’ll create a new category called 'Lead Gen / Form Submissions,' add up the Row 3 totals for the two old categories, and enter the new total under the new group. Copy and paste the percentage formula from any Row 4 cell, then delete the old categories.

A new category called 'Lead Gen / Form Submissions' is added

Repeat this step for every group you plan to merge.
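
One thing to check when you add up the totals of two categories: a response that was marked in both columns gets counted twice. The sketch below, with made-up marks, compares the summed total with a row-by-row merge that counts such a response once. The row-by-row merge is an alternative suggested here, not part of the template.

```python
# Made-up marks for two categories that describe the same concept.
lead_gen         = [1, 0, 1, 0, 0]
form_submissions = [0, 1, 1, 0, 0]

# Adding the two column totals, as described above.
summed_total = sum(lead_gen) + sum(form_submissions)

# Alternative: merge row by row so a response marked in both columns counts once.
merged = [max(a, b) for a, b in zip(lead_gen, form_submissions)]
merged_total = sum(merged)

print(summed_total, merged_total)
```

If no response is marked in both columns, the two totals are identical and adding the Row 3 values is enough.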

2) Arrange your categories from large to small: arrange your categories in descending order from left to right. For those that account for only a small percentage of the total (2% or less), use the grouping method above to merge them into one category called 'Others,' which you’ll leave on the far right.

A new category called 'Others' is added to merge miscellaneous categories together
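
The ordering rule can be expressed compactly in Python. This is a sketch with made-up percentages; the `arrange` function and its `threshold` parameter are names introduced here for illustration, and the small categories are merged by simply summing their percentages.

```python
def arrange(percentages, threshold=2.0):
    """Sort categories largest first and fold those at or below the threshold into 'Others'."""
    large = {name: pct for name, pct in percentages.items() if pct > threshold}
    others = sum(pct for pct in percentages.values() if pct <= threshold)
    ordered = sorted(large.items(), key=lambda item: item[1], reverse=True)
    if others:
        # 'Others' always goes last, regardless of its size.
        ordered.append(("Others", others))
    return ordered

# Made-up percentages for illustration.
percentages = {"Traffic": 20.0, "Sales": 35.0, "Brand": 1.5, "Conversions": 25.0, "NPS": 2.0}
arranged = arrange(percentages)
print(arranged)
```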

Step 5: represent your data visually

1) Prep your data to create a bar chart. First, select and copy the top three rows of your spreadsheet (those that make up the 'Response Categories,' 'Total respondents who answered X,' and '% respondents who answered X').

Select and copy the top three rows of your spreadsheet

Paste them into the ‘Graph Question 1’ sheet using the 'Paste special' feature to paste only the values (so the formulas don’t copy over).

Paste your selection as values into ‘Graph Question 1,’ Cell A3

Select and copy the table you just pasted, and choose 'Paste special' again—this time using 'Paste transposed' to invert the rows and columns (this makes your data more chart-friendly).

Copy the table you just pasted, then use 'Paste transposed' in Cell A9

This is what you should see:

Your table containing categories, the volume of responses, and percentages should look like the above
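
For readers who think in code, 'Paste transposed' does the same thing as swapping rows and columns in a nested list. A minimal Python sketch with made-up labels and numbers:

```python
# Three rows as copied from the sheet: categories, counts, percentages.
# Labels and numbers are made up for illustration.
table = [
    ["Response category", "Sales", "Traffic"],
    ["Total respondents", 12, 8],
    ["% respondents", 0.48, 0.32],
]

# zip(*table) pairs up the first items of every row, then the second items, and so on.
transposed = [list(row) for row in zip(*table)]

for row in transposed:
    print(row)
```

After transposing, each category sits on its own row with its count and percentage beside it, which is the layout most charting tools expect.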

2) Create your chart: insert your chart, selecting the percentage column as your 'Series' and the categories as your 'X-axis.' Resize the chart however you see fit.

Your open-ended answers are now visualized in a graph

And there you have it—a visual representation of your data! Feel free to experiment with different formats if you’re putting the chart into a formal presentation.
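
If you only need a quick look at the distribution before building a proper chart, a text-only bar chart is enough. The sketch below is a stand-in for the spreadsheet chart, not a replacement for it; `text_bar_chart` is a name introduced here, the percentages are made up, and the function assumes at least one percentage is greater than zero.

```python
def text_bar_chart(rows, width=40):
    """Return one line per category, with a bar scaled to the largest percentage."""
    largest = max(pct for _, pct in rows)  # assumes at least one positive value
    label_width = max(len(name) for name, _ in rows)
    lines = []
    for name, pct in rows:
        bar = "#" * round(width * pct / largest)
        lines.append(f"{name.ljust(label_width)} | {bar} {pct:.0f}%")
    return lines

# Made-up percentages, already arranged from large to small.
rows = [("Sales", 40.0), ("Traffic", 20.0), ("Others", 10.0)]
chart = text_bar_chart(rows, width=20)
print("\n".join(chart))
```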

Analyzing open-ended questions efficiently and empathizing with your audience take some practice, but the more you do it, the easier it becomes. Your mind will begin to recognize patterns, so don’t be afraid to dive in.
