Likert Scales: For Fun and Nonprofits
Updated: Nov 9
Most of us have been to a conference and were asked to give feedback at the end. We filled out that little green sheet of paper that asked us to rate things like:
The location of this conference was convenient.
The food met my dietary needs.
The presentations were relevant.
The speakers were experts on their topics.
Below each question the response options look something like this:
Strongly Disagree, Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree
If you’ve answered a question like this, then you’ve provided data through a Likert scale. Likert scales are often used in questionnaires and surveys to measure a person's opinion or attitude towards a certain topic. When they are well written, they are easy to answer and provide great data for quick analysis and simple summaries.
We mentioned Likert scales in our posts about simple statistics (mean, median, and mode) and levels of measurement because it’s a commonly used ordinal measure that we sometimes treat differently from other ordinal measures (i.e. we calculate means with them).
Collecting Data with Likert Scales at Your Organization
If your nonprofit wants to incorporate Likert scales into your data collection, or if you want to improve what you are already doing, then the tips below will help you develop great Likert scales.
1. Word the Prompt or Question to Work with Likert Scales
This one feels silly, right? It's basically "ask a good question", and that's pretty close to correct.
It should have ordinal response options (i.e. they can be logically ordered/ranked from low to high, bad to good, etc)
It should NOT be a math question
It should NOT be an open ended question
It should NOT be a yes/no, true/false (i.e. Boolean) question
Which of these choices is your favorite color? Red, Blue, Green (nominal data - can't be ordered)
Which of the following does 5 + 5 equal? 10, 12, 33 (math question)
Describe how your favorite band makes you feel. (open ended)
Are you afraid of the dark? (yes/no)
2. Create a Question or Prompt that is Easy to Answer with a Likert Scale
There are two common approaches to writing a Likert question: Prompt Style and Question Style.
Prompt Style: We make a simple statement. The person responds by choosing from the set of response options we provide (e.g. strongly disagree to strongly agree).
Prompt Style is short and sweet, but some might think we are introducing bias or trying to sway people to respond in a certain way.
Examples of Prompt Style:
The computer training program taught me job relevant skills.
The food pantry has food that my family enjoys.
TIP: State your prompts positively, as shown in the examples above. Negatively framed prompts are difficult to comprehend and answer. Ex: The computer training program did NOT teach me job relevant skills. If someone selects "strongly agree", that might feel like a positive response. However, the person is really saying the program didn't teach them job relevant skills.
Question Style: We ask a question which might include the “good” and “bad” sides of the response range. The person responds by choosing from the set of response options we provide.
Question style can require longer statements, but it can also feel more balanced.
Examples of Question Style:
How would you rate your health?
How satisfied or dissatisfied are you with the food options in the food pantry?
TIP: Use simple language (i.e. short sentences, no fancy words or jargon) and resist the urge to explain your prompt or question.
3. Creating Response Options
We need to create response options for our Likert scale question. Here are some things to keep in mind to make sure the responses work well.
A: The response options can be logically ordered
Strongly Agree, Agree, Neither Agree Nor Disagree, Disagree, Strongly Disagree
Not At All Helpful, A Little Helpful, Helpful, Very Helpful
Very Dissatisfied, Dissatisfied, Neither Satisfied Nor Dissatisfied, Satisfied, Very Satisfied
B: Assign a number value to each category
Assigning a number value to each response allows us to calculate a median and mean.
The numbers should follow a logical order.
Increase (or decrease) by 1 unit per response. (Ex: 1,2,3,4,5 is great, but NOT 1,2,4,5,9 and NOT 1,3,5,7,9).
Strongly Disagree (1), Disagree (2), Neither Agree Nor Disagree (3), Agree (4), Strongly Agree (5)
TIP: Make the more positive/supportive response have the highest value. That makes it easier to interpret the value – especially when you are calculating means, making comparisons of different questions, or comparing the same question over time.
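If you record responses electronically, this coding step can be as simple as a lookup table. Here is a minimal Python sketch (the labels and values are just the example 1-5 agreement scale from above):

```python
# Map each response label to its numeric code.
# More positive responses get higher values, per the tip above.
LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree Nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

def code_response(label: str) -> int:
    """Convert a text response to its numeric value."""
    return LIKERT_CODES[label]

print(code_response("Agree"))  # 4
```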
C: The “gap” or “distance” between the categories should feel roughly equal.
When we walk stairs, it's best for each stair to be the same height so that each step is predictable and comfortable. Similarly, the response options in our Likert scale need to be the same “distance” so that each one is predictable and comfortable.
Example of Uneven Gaps Between Responses
Dislike, Somewhat Dislike, Somewhat Like, Love it more than anything else in the world
In the example above, the “gap” between "Somewhat Like" and “Love it more than anything in the world” feels larger than the gaps between the other response options. This would make a respondent second-guess the best answer.
Example of Even Gaps Between Responses
Dislike, Somewhat Dislike, Neither Like Nor Dislike, Somewhat Like, Like
The gap or distance between each response feels roughly equivalent. The respondent can find the response that best fits their opinion.
TIP: We're using the word "feel" to determine whether the gaps between are even. It's a judgment call. We can't easily measure the distance between "dislike" and "somewhat dislike", so we have to rely on our judgment and feedback from others to help us make the best choices.
D. The response options should be balanced (unless we can’t)
What if the response options looked like this?
Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree, Very Strongly Agree
This set of responses is out of balance. There is only one negative response (Disagree) and three positive responses (Agree, Strongly Agree, Very Strongly Agree). This imbalance makes it seem like we are pushing people to respond in a certain way. Worse, we might actually push people to respond that way (so our data would be garbage), or people might skip the question because they don't trust it.
So, keep the response options balanced...unless we can’t.
Sometimes there’s no obvious way to balance out the answer options. We see that often on the “negative” side of the response options.
For example: Please rate the keynote speaker: Uninformative, Somewhat Informative, Informative, Very Informative
The response “very uninformative” would express a more extreme position than “uninformative”, but how do we make sense of it? If the respondent didn’t learn anything from a speaker (i.e. the speaker was uninformative), would “very uninformative” mean the speaker made the respondent worse at their job, or does it mean the speaker presented misinformation or disinformation? And if we had a "very uninformative" option, how would we interpret responses that were simply "uninformative"? Did the person learn a tiny bit? This apparently unbalanced Likert scale is still appropriate because the more extreme negative response wouldn’t make much sense. If the unbalanced responses feel uncomfortable, we can simply change the prompt and responses.
Prompt: The keynote speaker was informative. Responses: Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree
E. Fewer Choices is Often Better
We can make Likert scales that include 4, 5, 7, 9, or more answer options. However, I recommend keeping the list of response options short unless we have a compelling reason to make it longer. Why? I have rarely found that it was helpful to have more than 5 response options, and longer scales can make it harder for us to understand our data quickly.
STOP! Below, I ramble on and on about why a 1-5 Likert scale is usually best. Basically: 1) more choices are harder to interpret quickly, 2) categorical data is imprecise, so why bother, and 3) we're probably just going to simplify the data anyway. Skip to the next section if you want!

1) It’s more difficult to get a quick understanding of the pattern in the data when we have lots of options. I used a spreadsheet to randomly generate 50 responses in Tables 1 and 2 below. Let's assume that the lowest value is “bad” and the highest value is “good”. By “bad” we mean that people didn’t like the food, the presentations weren’t relevant, the program did not help the person achieve their goals, etc.

Table 1: Sample Likert Scale Data with 1-5 Scale. 1=bad and 5=good
Value:     1     2     3     4     5
Percent:  20%   26%   12%   18%   24%
Table 2: Sample Likert Scale Data with 1-9 Scale. 1=bad and 9=good
Value:     1     2     3     4     5     6     7     8     9
Percent:  12%    2%   16%   12%   18%   10%    6%    6%   18%
Looking at Table 1 with the 1 to 5 scale, we can quickly get a sense of the data. With a small amount of effort, we can see that almost half (20% + 26% = 46%) of the responses are “bad” and a similar share of responses (18% + 24% = 42%) are “good”. The rest are in the middle. We can glance at that table and have a strong sense of the big picture.

It’s a lot more work to interpret the data in Table 2. We can’t just glance at that data and have a sense of what’s going on. We have to pull out a calculator and add up the numbers (...well, I need a calculator anyway). Bad: 12% + 2% + 16% + 12% = 42%. Good: 10% + 6% + 6% + 18% = 40%. Those extra categories in Table 2 make it harder for us to quickly find the story in our data.

2) Likert scales are categorical data. There’s no perfect underlying value, so it's probably not worth adding lots of categories when they are "fuzzy" anyway. Think about the most common Likert scale you see – the 5 option, strongly disagree to strongly agree. Each category has a rough meaning.
Strongly Disagree, Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree
The respondent is basically saying “I’m on the agree (or disagree) side" and then picking the extent to which they feel that way– a little or a lot. We could insert 2 more choices - Very Strongly Disagree and Very Strongly Agree. Yes, we are adding a little bit of nuance and maybe better precision to the answers, but how much can we really learn from it?
Very Strongly Disagree, Strongly Disagree, Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree, Very Strongly Agree
We get subtler shades of "I agree a little, I agree some, I agree a lot". That's great data IF we can analyze it effectively, but often we just want to understand the big picture.

3) We're probably going to do a few basic analyses of our data (described below), but more often than not, we just want to know how it breaks down into “good”, “bad”, and “neutral” categories. Ex: the percentage of people who liked the program, disliked the program, and didn’t have a strong opinion. Having more answer options means we have more response categories to place into the good, bad, and neutral groups. It’s not necessarily more work to do that, but if we are simplifying the data anyway, all of those extra response categories are not necessarily helpful.

In sum, keep things simple for you and your audience: use a 1-5 Likert scale unless you have a good reason to use something else.
Analyzing Likert Scale Data
There are a few simple ways to analyze your Likert scale data before you dig into more advanced analysis.
1. Measures of Central Tendency
We can use mean, median, and mode with Likert scales.
Even though a Likert scale is at the ordinal level of measurement – which typically is not used with means – you will often see people calculate means with Likert scale data. It’s particularly useful to calculate a mean when we want to compare different questions or compare data over time. You can do this to identify trends in your data (Ex: do people seem to like the format of our conference more or less over time?). Looking at Table 3 below, we can calculate the mean as a weighted average using the underlying numeric value assigned to each response. The mean is 3.6 (calculated below).
( (1 X 30) + (2 X 35) + (3 X 23) + (4 X 72) + (5 X 90) ) / 250 = 3.6
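As a sketch, the same weighted-average calculation takes a few lines of Python, using the Table 3 counts (30, 35, 23, 72, 90 out of 250 responses):

```python
# Response value -> number of respondents (Table 3 counts)
counts = {1: 30, 2: 35, 3: 23, 4: 72, 5: 90}

total = sum(counts.values())  # 250 respondents
mean = sum(value * n for value, n in counts.items()) / total

print(round(mean, 1))  # 3.6
```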
Likert scales provide ordinal data. The data can be ordered based on the numeric value we assign to each category, and we can identify the middle value. Therefore, Likert scales are compatible with the median.
In Table 3, below, the median value is Agree (4).
It’s helpful to see the mode (or most common response) to your Likert scale data. If you use count and percentages (covered next) you’ll uncover the mode without additional effort.
In Table 3, below, we can see that the mode is Strongly Agree (5).
Interestingly, in this case, the mean, median, and mode all fall in different categories. The mean falls between Neutral and Agree (at 3.6), the median falls firmly in the Agree category (4), and the mode is Strongly Agree (5).
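A quick way to check the median and mode without a spreadsheet is to expand the summary counts back into one response per person and use Python's statistics module (again using the Table 3 counts):

```python
import statistics

# Response value -> number of respondents (Table 3 counts)
counts = {1: 30, 2: 35, 3: 23, 4: 72, 5: 90}

# Expand the counts into one coded response per person.
responses = [value for value, n in counts.items() for _ in range(n)]

print(statistics.median(responses))  # 4.0 -> Agree (4)
print(statistics.mode(responses))    # 5 -> Strongly Agree
```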
Table 3: Sample Data from a 1-5 Likert Scale with Responses ranging from Strongly Disagree (1) to Strongly Agree (5).

Response                        Count   Percent
1. Strongly Disagree              30      12%
2. Disagree                       35      14%
3. Neutral                        23       9%
4. Agree                          72      29%
5. Strongly Agree                 90      36%
Total                            250     100%
2. Counts and Percentages
Perhaps the most informative way to view your Likert scale data is to summarize the counts and percentages for each response category. Table 3 (above) demonstrates how simple counts and percentages can be used to understand the pattern of the data. We can see that 12% of respondents Strongly Disagree, 14% Disagree, 9% are Neutral, 29% Agree, and 36% Strongly Agree.
With this breakdown, we can see that most of our responses are positive (Agree and Strongly Agree), and a large percentage are negative (Disagree and Strongly Disagree). This approach also helps us quickly determine the median (4. Agree) and mode (5. Strongly Agree).
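If your responses live in a simple list, Python's collections.Counter produces this kind of breakdown directly. A minimal sketch, assuming responses are already coded 1-5 (the sample values below are made up for illustration):

```python
from collections import Counter

# Example coded responses (1=Strongly Disagree ... 5=Strongly Agree)
responses = [5, 4, 4, 2, 5, 3, 4, 1, 5, 4]

counts = Counter(responses)
total = len(responses)

# Print the count and percentage for each response value.
for value in sorted(counts):
    pct = 100 * counts[value] / total
    print(f"{value}: {counts[value]} responses ({pct:.0f}%)")
```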
3. Simplify Your Data with Red/Yellow/Green Groups
Quite often, we just want to see if people are overall happy or not, satisfied or not. Simplifying our Likert scale data into Red/Yellow/Green groups makes the data easy to examine. It’s especially useful for comparisons over time, when the big idea (are people happier or not, satisfied or not?) is what we care about the most. These labels feel better, or less judgmental, than labeling people as the “Good Group”, “Neutral Group”, and “Bad Group”, but that's essentially what we are doing. We converted Table 3 (above) to Red/Yellow/Green to simplify the data; the results are shown in Table 4. To get each group, we combined responses into Red (Disagree, Strongly Disagree), Yellow (Neutral), and Green (Agree, Strongly Agree) groups.
Table 4: Likert Scale Data Summarized into Red, Yellow, Green Categories

Group                                   Count   Percent
Red (Disagree, Strongly Disagree)         65      26%
Yellow (Neutral)                          23       9%
Green (Agree, Strongly Agree)            162      65%
Total                                    250     100%
With Table 4, it's easy to see the overall pattern in our data. Nearly two-thirds of our respondents (65%) fall in the Green group, while only about one-fourth (26%) fall in the Red group. The rest, 9%, are in the Yellow group.
If we want to know whether our programs are impactful or if our conference was well received, the Red/Yellow/Green approach gives us the fastest snapshot.
We have stripped away the nuance in the data, so this isn't the best approach for detailed analysis. But, it does help us understand our work in a simple, easy to comprehend manner.
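The regrouping itself is just a mapping from the 1-5 codes to three buckets. A minimal Python sketch, using example counts from a 250-response survey (the same distribution as Table 3):

```python
# Response value -> number of respondents
counts = {1: 30, 2: 35, 3: 23, 4: 72, 5: 90}

# Collapse 1-2 into Red, 3 into Yellow, 4-5 into Green.
groups = {"Red": (1, 2), "Yellow": (3,), "Green": (4, 5)}

total = sum(counts.values())
summary = {
    name: sum(counts[v] for v in values)
    for name, values in groups.items()
}

for name, n in summary.items():
    print(f"{name}: {n} ({100 * n / total:.0f}%)")
```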
4. Demographic Analysis (Cross Tabulation, Pivot Tables)
We wrote about using demographic analysis with your data in other posts, so we won't repeat that here.
In short, demographic analysis helps us understand whether our programs are effective, well received, and/or valued by different groups. Do single moms, people with full-time jobs, or people without high school diplomas find our programs effective, relevant, and appropriate compared to other groups? With that knowledge, we can make adjustments (or pat ourselves on the back).
Review the Likert scale questions and prompts that you already use to ensure that you're following the guidance provided here. Do your questions make sense for Likert scales? Are your responses balanced and do they have even "gaps" between them? Are you using too many response categories when they aren't helpful?
Finally, are you performing the simple analysis suggested to make sense of your data? Sure, we collect a lot of data because we have to, but you may as well make the most of it. If you're not ready for demographic analysis, start with Red/Yellow/Green groupings.
Learn More About Nonprofit Data Management
This post is part of our nonprofit data bootcamp series. Check out the complete list of nonprofit data bootcamp topics with links to other published posts.
Reporting your impact is hard when you’re juggling spreadsheets. countbubble makes it easy so you can focus on your mission.
countbubble is simple, flexible data collection and case management software. It is ideal for small nonprofits. Email us firstname.lastname@example.org or sign up for email updates on blog posts, product news, or scheduling a demo.
Founder, CountBubble, LLC