Why Data needs to be collected
The key output of the Measure phase is a good understanding of the current process performance level, expressed in figures such as the Sigma level and other process-specific metrics.
Types of Data
Data can be grouped into discrete data or continuous data:
Discrete Data
Discrete data is categorical in nature. It falls into 3 categories: ordinal, nominal or binary.
Ordinal Data – The numbers/symbols are qualitative in nature, but they are also ranked. Central tendency for ordinal data is measured by either the mode or the median. For example, student test scores can be expressed in ordinal fashion via the grades A, B, C, D and F.
Nominal Data – The numbers/symbols are assigned to each category, but they carry no information about whether one category is better or worse than another. For example, a company assigns 1 to “product that is produced in Thailand” and 2 to “product that is produced in Malaysia”; the number of 1s and 2s produced by the company does not tell whether one of them is better than the other.
Binary Data – The numbers/symbols assigned to the data have only 2 states. In the student test results example, a Pass/Fail can be assigned to each student’s result.
Note: Discrete data is best displayed via a Pareto chart, pie chart or bar chart.
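The three kinds of discrete data described above can be sketched in a few lines of Python. This is only an illustration: the grade list, the origin codes and the Pass/Fail rule (only an F fails) are made-up assumptions, not data from any real process.

```python
from collections import Counter
from statistics import median_low, mode

# Hypothetical grades for illustration only, ranked from worst (F) to best (A).
GRADE_ORDER = ["F", "D", "C", "B", "A"]
grades = ["B", "A", "C", "B", "D", "B", "F", "A", "C"]

# Ordinal: the ranking makes the median meaningful. Map each grade to its rank,
# take the lower median so the result is always an actual grade, then map back.
ranks = [GRADE_ORDER.index(g) for g in grades]
median_grade = GRADE_ORDER[median_low(ranks)]
mode_grade = mode(grades)

# Binary: collapse each grade to one of 2 states (assumed rule: only F fails).
pass_fail = ["Fail" if g == "F" else "Pass" for g in grades]

# Nominal: the codes only label categories, so counting is the useful summary.
# 1 = produced in Thailand, 2 = produced in Malaysia (hypothetical records).
origin_codes = [1, 2, 1, 1, 2, 1]
origin_counts = Counter(origin_codes)

print("Median grade:", median_grade)
print("Most common grade:", mode_grade)
print("Pass/Fail counts:", Counter(pass_fail))
print("Origin counts:", dict(origin_counts))
```

Note that an average of the origin codes would be meaningless, which is why only counts are computed for the nominal data.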
Continuous Data
Continuous data is quantitative data and is measured in units. For example, the time of day is measured in hours.
Note:
- Almost all continuous data can be converted into a percentage
- Continuous data is best visualized using histograms and box plots.
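As a rough sketch of what a histogram and a box plot summarize, the numbers behind both can be computed with the Python standard library. The cycle times and the 1-minute bin width below are hypothetical values chosen for illustration; a charting tool would normally draw the plots themselves.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical cycle times in minutes, for illustration only.
cycle_times = [4.2, 5.1, 3.8, 6.4, 5.0, 4.7, 5.5, 7.9, 4.9, 5.3]

# Box plot ingredients: minimum, the three quartiles, and maximum.
q1, q2, q3 = quantiles(cycle_times, n=4, method="inclusive")
five_number = (min(cycle_times), q1, q2, q3, max(cycle_times))
print("Five-number summary:", five_number)

# Histogram ingredients: how many values fall into each 1-minute-wide bin.
bin_width = 1.0
bins = Counter(int(t // bin_width) for t in cycle_times)
for start in sorted(bins):
    print(f"{start:>2}-{start + 1:<2} min | {'#' * bins[start]}")
```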
Choosing between Discrete and Continuous data
Discrete or continuous data may be chosen depending on the purpose of the measurement. In a nutshell, discrete data is easier to collect, but it provides less information than continuous data.
Continuous data is typically more time-consuming to collect than discrete data unless teams have access to automated or computerized data collection.
Where possible, continuous data is preferred over discrete data because:
- It provides more information than discrete data
- It is more precise than discrete data
- It reduces the variation and errors inherent in estimation and rounding (which occur in some discrete data)
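The point that continuous data carries more information can be illustrated with a small sketch. The delivery times and the 24-hour limit below are hypothetical; the example shows two processes that look identical when recorded as Pass/Fail but differ clearly when the measured values are kept.

```python
from statistics import mean, stdev

# Hypothetical delivery times in hours; assumed target is 24 hours or less.
LIMIT_HOURS = 24.0
process_a = [20.5, 21.0, 22.5, 23.0, 23.5, 25.0]
process_b = [10.0, 12.5, 15.0, 18.0, 23.9, 40.0]

def pass_rate(times, limit=LIMIT_HOURS):
    """Discrete (binary) view: share of measurements within the limit."""
    return sum(t <= limit for t in times) / len(times)

# Discrete view: both processes pass 5 of 6 deliveries, so they look the same.
rate_a, rate_b = pass_rate(process_a), pass_rate(process_b)

# Continuous view: keeping the measured values reveals very different spread.
spread_a, spread_b = stdev(process_a), stdev(process_b)

print(f"A: pass rate {rate_a:.0%}, mean {mean(process_a):.1f} h, std dev {spread_a:.1f} h")
print(f"B: pass rate {rate_b:.0%}, mean {mean(process_b):.1f} h, std dev {spread_b:.1f} h")
```

Once the values are reduced to Pass/Fail, the difference in variation between the two processes can no longer be recovered.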
