How do missing values work in Crunch?
Survey researchers often have several related ways to mark that a response is missing. For instance, a respondent might not be asked a particular question, or they might not answer it when asked. As a result, the survey responses will include "Not Asked" and "Not Answered" alongside the valid choices a respondent can make. When you load this data into Crunch you might see a variable card like this:
Since we haven't marked the categories as missing, Crunch is treating "Not Asked" and "Not Answered" as though they were valid categories, and so the number of missing values is zero. We want to keep the distinction between the two options but mark them both as missing. This can be done through the Variable Properties menu.
Missingness is a property of the underlying data, and so it propagates correctly through all aspects of the Crunch app. After categories are marked as missing they are hidden from the variable card, and shown in the missing count at the bottom right of the card.
Additionally, any analyses and multi-tables will omit missing categories and will properly account for missing values when calculating percentages.
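To make the effect on percentages concrete, here is a minimal sketch in plain Python. It does not use the Crunch API; the response data, the `MISSING` set, and the `percentages` helper are all hypothetical and only illustrate how excluding missing categories changes the base of the calculation.

```python
from collections import Counter

# Hypothetical responses to one survey question (illustrative data only).
responses = ["Yes", "No", "Yes", "Not Asked", "Not Answered", "Yes", "No", "Not Asked"]

# Categories marked as missing. They stay distinct from each other,
# but both are excluded from the percentage base.
MISSING = {"Not Asked", "Not Answered"}

def percentages(values, missing=frozenset()):
    """Return each non-missing category's share of non-missing responses, in percent."""
    counts = Counter(v for v in values if v not in missing)
    base = sum(counts.values())
    return {category: 100 * n / base for category, n in counts.items()}

# Before marking: "Not Asked" and "Not Answered" count as valid categories.
treated_as_valid = percentages(responses)

# After marking: only "Yes" and "No" contribute to the base.
treated_as_missing = percentages(responses, MISSING)

# The number that would appear as the missing count.
missing_count = sum(1 for v in responses if v in MISSING)
```

In this made-up data, "Yes" accounts for 3 of 8 responses when nothing is marked as missing, but 3 of 5 once the two categories are excluded from the base.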
What about empty values?
Missingness is a property of the data, so you should only mark a category as missing if it truly represents a missing value. Sometimes, however, a survey contains categories which are not missing but are still not worth showing in an analysis. For instance, a category might not have been chosen by any respondent, or for whatever reason it might not be something that you want to highlight to a client. There are two basic ways to handle this type of data.
Toggle the Show Empty setting
Some variables have lots of categories which have no responses. This can lead to needlessly cluttered tables when cross-tabs are produced with those variables. For instance this cross-tab includes a large number of 0s which are likely not that interesting to the client.
Crunch allows you to hide these values by toggling the "Show Empty" display option. Since this is a display setting rather than a feature of the data, you need to set this option on each graph and analysis that you construct with the variables.
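The idea behind this display setting can be sketched in plain Python. This is not how Crunch implements it; the data, the category lists, and the `crosstab` function with its `show_empty` parameter are hypothetical names used only to show that the underlying counts are unchanged and only the displayed rows differ.

```python
from collections import Counter

# Hypothetical (country, answer) pairs, one per respondent (illustrative data only).
rows = [("France", "Yes"), ("France", "No"), ("Spain", "Yes")]

# Every category defined on each variable, including ones nobody chose.
countries = ["France", "Germany", "Italy", "Spain"]
answers = ["Yes", "No"]

counts = Counter(rows)

def crosstab(show_empty):
    """Build {country: [count per answer]}, optionally dropping all-zero rows."""
    table = {c: [counts[(c, a)] for a in answers] for c in countries}
    if not show_empty:
        # Display-only filter: the data in `counts` is not modified.
        table = {c: cells for c, cells in table.items() if any(cells)}
    return table

full_table = crosstab(show_empty=True)      # includes Germany and Italy as rows of 0s
compact_table = crosstab(show_empty=False)  # keeps only France and Spain
```

Because the filter is applied when the table is built rather than to the data, it has to be chosen again for each table, which mirrors why the setting is per analysis.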
The second way of handling variables that have many uninteresting categories is to derive a new variable which combines those sparse categories into larger, more meaningful buckets. Taking the example above, we could create a new variable which collapses the countries with no responses into a single "Other" category.
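As a rough illustration of this recoding, again in plain Python rather than the Crunch API, the sketch below maps every country that falls under a threshold to "Other" and leaves the original responses untouched. The data, the `min_count` threshold, and the variable names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical original variable: one country per respondent (illustrative data only).
responses = ["France", "France", "Spain", "France", "Spain"]

# Every category defined on the original variable.
countries = ["France", "Germany", "Italy", "Spain"]

# Hypothetical threshold: categories with fewer responses than this are collapsed.
# A value of 1 collapses only the categories with no responses at all.
min_count = 1

counts = Counter(responses)

# Recode map from original category to derived category.
recode = {c: (c if counts[c] >= min_count else "Other") for c in countries}

# The derived variable is a new list; `responses` is left as it was.
derived = [recode[r] for r in responses]

# Counts for each derived category, including "Other" even when it is empty.
derived_counts = {cat: 0 for cat in dict.fromkeys(recode.values())}
for value in derived:
    derived_counts[value] += 1
```

Raising `min_count` would also fold rarely chosen countries into "Other", which is one way to get the larger buckets described above.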