Quantitative Data Analysis and Strategies
![Testing Assumptions for Statistical Tests](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1gDaZ3qf5gaNCw63NG-takyoASRZmE3CPvgDzncJuWIaevgD3LnHvJ0uZhaMcrIGqfaJ22v9Xdcdv39_VF7KGd3n5mYk2I-3SL_j3aRBQ8JPcTrrQ-0Du4j3UMRaYh41gtMkcCB4Uoma3_lJFUy_oawJUg07LY-8TFSuC9rJerxvx0-to9RaV0Zfn/w640-h320/Quantitative%20Data%20Analysis%20and%20Strategies%20(1).png)
Overview of Statistical Tests
Most statistical tests are based on a number of assumptions, conditions that are presumed to be true and that, when violated, can lead to misleading or invalid results. For example, parametric tests assume that variables are distributed normally. Frequency distributions, scatter plots, and other assessment procedures provide researchers with information about whether underlying assumptions for statistical tests have been violated. Frequency distributions can reveal whether the normality assumption is tenable; graphic displays of data values indicate whether the distribution is severely skewed, multimodal, too peaked (leptokurtic), or too flat (platykurtic). There are also statistical indexes of skewness or peakedness that statistical programs can compute to determine whether the shape of the distribution deviates significantly from normality.
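As an illustration of such an index, a sample skewness coefficient can be computed directly; the sketch below uses only Python's standard library, and the data values are invented for illustration:

```python
import statistics

# Invented sample of scores with a long right tail.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 12, 15]

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Sample skewness (Fisher's g1 with the small-sample correction).
# Values near 0 suggest symmetry; values well above +1 (or below -1)
# suggest a markedly skewed, non-normal distribution.
skew = (n / ((n - 1) * (n - 2))) * sum(((x - mean) / sd) ** 3 for x in scores)
print(round(skew, 2))  # -> 1.44
```

A value this far above zero would flag the distribution as right-skewed, prompting consideration of a transformation before parametric testing.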
Performing Data Transformations
Raw data entered directly onto a computer file often need to be
modified or transformed before hypotheses can be tested. Various data
transformations can easily be handled through commands to the computer. All
statistical software packages can create new variables through arithmetic
manipulations of variables in the original data set. We present a few examples
of such transformations, covering a range of realistic situations.
• Performing item reversals. Sometimes response codes to certain variables need to be reversed (ie, high values becoming low, and vice versa) so that items can be combined in a composite scale. For example, the widely used CES-D scale consists of 20 items, 16 of which are statements indicating negative feelings in the prior week (eg, item 9 states, “I thought my life had been a failure”), and four of which indicate positive feelings (eg, item 8 states, “I felt hopeful about the future”).
The positively worded items must be reversed before items are added together. CES-D items have four response options, from 1 (rarely felt that way) to 4 (felt that way most days). To reverse an item (ie, to convert a 4 to a 1, and so on), the raw value of the item is subtracted from the maximum possible value, plus 1.
In SPSS, this can be accomplished through the “Compute” command, which could be used to set the value of a new variable to 5 minus the value of the original variable; for example, a new variable CESD8R could be computed as 5 − CESD8, where CESD8 is the original value of item 8 on the CES-D scale.
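The same reversal rule (maximum possible value plus 1, minus the raw value) can be sketched outside SPSS; the function below is a hypothetical illustration:

```python
# Reverse-score an item: subtract the raw value from the maximum
# possible value plus 1 (for 4-point CES-D items, subtract from 5).
def reverse_item(raw, max_value=4):
    return (max_value + 1) - raw

# A raw 4 becomes 1, a 3 becomes 2, and so on.
print([reverse_item(v) for v in [1, 2, 3, 4]])  # -> [4, 3, 2, 1]
```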
• Constructing scales. Transformations are also used to construct composite scale variables, using responses to individual items. Commands for creating such scales in statistical software packages are straightforward, using algebraic conventions. In SPSS, the “Compute” command could again be used to create a new variable; for example, a new variable STRESS could be set equal to Q1 + Q2 + Q3 + Q4 + Q5.
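In Python, the same scale construction is a one-line sum; the response values here are invented for illustration:

```python
# Composite scale score as the sum of five item responses,
# mirroring the SPSS computation STRESS = Q1 + Q2 + Q3 + Q4 + Q5.
responses = {"Q1": 3, "Q2": 2, "Q3": 4, "Q4": 1, "Q5": 3}
stress = sum(responses.values())
print(stress)  # -> 13
```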
• Performing counts. Sometimes composite indexes are created when researchers want a cumulative tally of the occurrence of some attribute. For example, suppose we asked people to indicate which types of illegal drug they had used in the past month, from a list of 10 options.
Use of each drug would be answered independently
in a yes (coded 1) or no (coded 2) fashion. We could then create a variable
indicating the number of different drugs used. In SPSS, the “Count” command
would be used, creating a new variable (eg, DRUGS) equal to the sum of all the
“1” codes for the 10 drug items. Note that counting is the approach used to
create missing values flags, described earlier in this chapter.
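A sketch of the same counting logic, with invented responses (yes = 1, no = 2):

```python
# Tally how many of the 10 drug items were answered yes (coded 1),
# as the SPSS Count command would for a new DRUGS variable.
drug_items = [1, 2, 2, 1, 2, 1, 2, 2, 2, 1]
drugs = drug_items.count(1)
print(drugs)  # -> 4
```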
• Recoding variables. Other transformations involve recoding values to change the nature of the original data. For example, in some analyses, an infant's original birth weight (entered on the computer file in grams) might be used as a dependent variable. In other analyses, however, the researcher might be interested in comparing the subsequent morbidity of low-birth-weight versus normal-birth-weight infants.
For example, in SPSS, the “Recode Into Different
Variable” command could be used to recode the original variable (BWEIGHT) into
a new dichotomous variable with a code of 1 for a low birth-weight infant and a
code of 2 for a normal birth-weight infant, based on whether BWEIGHT was less
than 2500 grams.
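The recode rule can be sketched as a simple threshold function (the 2500-gram cutoff comes from the example above; the function name is illustrative):

```python
# Dichotomize birth weight in grams: 1 = low (< 2500 g), 2 = normal.
def recode_bweight(grams):
    return 1 if grams < 2500 else 2

print([recode_bweight(g) for g in [1850, 2499, 2500, 3400]])  # -> [1, 1, 2, 2]
```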
• Meeting statistical assumptions. Transformations also can be undertaken to render data appropriate for statistical tests. For example, if a distribution is markedly non-normal, a transformation can sometimes be done to make parametric procedures appropriate. A logarithmic transformation, for example, tends to normalize distributions.
In SPSS, the “Compute” command could be used to
normalize the distribution of values on family income (INCOME), for instance,
by computing a new variable (eg, INCLOG) set equal to the natural log of the
values on INCOME. Discussions of the use of transformations for changing the
characteristics of a distribution can be found in Dixon and Massey (1983) and
Ferketich and Verran (1994).
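A rough sketch of why a log transformation helps, using invented income values: the natural log compresses the long right tail, pulling extreme values toward the center of the distribution.

```python
import math

# Right-skewed incomes (invented): one extreme value dominates.
incomes = [12_000, 18_000, 25_000, 32_000, 45_000, 250_000]
inclog = [math.log(x) for x in incomes]  # INCLOG = natural log of INCOME

# On the raw scale the largest value is about 21 times the smallest;
# on the log scale the whole range spans only about 3 units.
print(round(max(incomes) / min(incomes), 1))  # -> 20.8
print(round(max(inclog) - min(inclog), 2))    # -> 3.04
```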
• Creating dummy
variables. Data transformations may be needed to convert codes for multivariate
statistics. For example, for dichotomous variables, researchers most often use
a 0-1 code (rather than say, a 1-2 code) to facilitate interpretation of
regression coefficients. Thus, if the original codes for gender were 1 for
women and 2 for men, men could be recoded to 0 for a regression analysis.
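As a sketch, using the gender codes from the example (1 = women, 2 = men):

```python
# Convert a 1/2 code to a 0/1 dummy variable: men (2) become 0,
# women (1) stay 1, so regression coefficients are easy to interpret.
def to_dummy(code):
    return 1 if code == 1 else 0

print([to_dummy(c) for c in [1, 2, 2, 1]])  # -> [1, 0, 0, 1]
```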
Performing Additional Peripheral Analyses
Depending on the
study, additional peripheral analyses may be needed before proceeding to
substantive analyses. It is impossible to catalog all such analyses, but a few
examples are provided to alert readers to the kinds of issues that need to be
given some thought.
• Data pooling. Researchers sometimes obtain data from more than one source or from more than one type of subject. For example, to enhance the generalizability of their findings, researchers sometimes draw subjects from multiple sites, or may recruit subjects with different medical conditions.
The risk in doing this is
that subjects may not really be drawn from the same population, and so it is
wise in such situations to determine whether pooling of data (combining data
for all subjects) is warranted. This involves comparing the different subsets
of subjects (ie, subjects from the different sites, and so on) in terms of key
research variables.
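One simple version of such a comparison is an independent-groups t statistic on a key variable, computed for two sites; the scores below are invented for illustration:

```python
import math
import statistics

# Invented outcome scores from two recruitment sites.
site_a = [6.1, 5.8, 6.4, 5.9, 6.2, 6.0, 5.7, 6.3]
site_b = [6.0, 6.2, 5.9, 6.1, 5.8, 6.3, 6.0, 5.9]

def t_statistic(x, y):
    """Pooled-variance t statistic for two independent groups."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))

# A small |t|, well under the critical value, supports pooling the sites.
print(abs(t_statistic(site_a, site_b)) < 2.0)  # -> True
```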
• Cohort effects. Nurse researchers sometimes need to gather data over an extended period of time to achieve adequate sample sizes. This can result in cohort effects, that is, differences in outcomes or subject characteristics over time. This might occur because sample characteristics evolve over time or because of changes in the community, in families, in health care, and so on.
If the research involves an experimental treatment, it may also be that the treatment itself is modified, for example, if early program experience is used to improve the treatment or if those administering the treatment simply get better at doing it. Thus,
researchers with a long period of sample intake should consider testing for
cohort effects because such effects can confound the results or even mask
existing relationships. This activity usually involves examining correlations
between entry dates and key research variables.
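A minimal sketch of that check, correlating invented study entry days with an outcome using a hand-rolled Pearson r:

```python
# Invented entry days (since study start) and outcome scores.
entry_day = [10, 45, 90, 150, 210, 270, 330, 365]
outcome = [5.2, 5.1, 5.3, 5.0, 5.4, 5.1, 5.3, 5.2]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# A correlation near zero is reassuring; a sizable r would suggest
# a cohort effect worth investigating before pooling across time.
r = pearson_r(entry_day, outcome)
print(abs(r) < 0.3)  # -> True
```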
• Ordering effects.
When a crossover design is used (ie, subjects are randomly assigned to
different orders of treatments), researchers should assess whether outcomes are
different for people in the different treatment-order groups.
• Manipulation checks. In testing an intervention, the primary research question is whether the treatment was effective in achieving the intended outcome. Researchers sometimes also want to know whether the intended treatment was, in fact, received. Subjects may perceive a treatment, or respond to it, in unanticipated ways, and this can influence treatment effectiveness. Therefore, researchers sometimes build in mechanisms to test whether the treatment was actually in place.
For example, suppose we were testing the effect of noise levels on
stress, exposing two groups of subjects to two different noise levels in a
laboratory setting. As a manipulation check, we could ask subjects to rate how
noisy they perceived the settings to be. If subjects did not rate the noise
levels in the two settings differently, it would probably affect our
interpretation of the results—particularly if stress in the two groups turned
out not to be significantly different.
Principal Analysis
At this point in
the analysis process, researchers have a cleaned data set, with missing data
problems resolved and needed transformations completed; they also have some
understanding of data quality and the extent of biases. They can now proceed
with more substantive data analyses.
The Substantive Data Analysis Plan
In many studies, researchers collect data on dozens, and often hundreds, of variables. They cannot realistically analyze every variable in relation to all others, and so a plan to guide data analysis must be developed. Research hypotheses and questions provide only broad and general direction. One approach is to prepare a list of the analyses to be undertaken, specifying both the variables and the statistical test to be used. Another approach is to develop table shells.
Table shells are layouts of how researchers envision presenting the research findings in a report, without any numbers in the table. Once a table shell has been prepared, researchers can undertake analyses to fill in the table entries. Table 22-1 presents an example of an actual table shell created for an evaluation of an intervention for low-income women.
This table guided a series of ANCOVAs that compared experimental and control groups in terms of several indicators of emotional well-being, after controlling for various characteristics measured at random assignment. The completed table that eventually appeared in the research report was somewhat different than this table shell (eg, another outcome variable was added). Researchers do not need to adhere rigidly to table shells, but they provide an excellent mechanism for organizing the analysis of large amounts of data.
Substantive Analysis
The next step is to perform the actual substantive analyses, typically beginning with descriptive analyses. For example, researchers usually develop a descriptive profile of the sample, and often look descriptively at correlations among variables. These initial analyses may suggest further analyses or further data transformations that were not originally envisioned. They also give researchers an opportunity to become familiar with their data. Researchers then perform statistical analyses to test their hypotheses.
Researchers whose data analysis plan calls for multivariate analyses (eg, multivariate analysis of variance [MANOVA]) may proceed directly to their final analyses, but they may begin with various bivariate analyses (eg, a series of analyses of variance [ANOVAs]). The primary statistical analyses are complete when all the research questions are addressed and, if relevant, when all table shells have the applicable numbers in them.
Interpretation of Results
The analysis of
research data provides the results of the study. These results need to be
evaluated and interpreted, giving consideration to the aims of the project, its
theoretical basis, the existing body of related research knowledge, and
limitations of the research methods adopted. The interpretive task involves a
consideration of five aspects of the results: (1) their credibility, (2) their
meaning, (3) their importance, (4) the extent to which they can be generalized,
and (5) their implications.
Credibility of the results
One of the first interpretive tasks is assessing whether the results are accurate. This assessment, in turn, requires a careful analysis of the study's methodologic and conceptual limitations. Regardless of whether one's hypotheses are supported, the validity and meaning of the results depend on a full understanding of the study's strengths and shortcomings. Such an assessment relies heavily on researchers' critical thinking skills and on their ability to be reasonably objective.
Researchers should carefully evaluate the major methodological decisions they made in planning and executing the study and consider whether different decisions might have yielded different results. In assessing the credibility of results, researchers seek to assemble different types of evidence. One type of evidence comes from prior research on the topic.
Investigators should examine whether their results are consistent with those of other studies; if there are discrepancies, a careful analysis of the reasons for any differences should be undertaken. Evidence can often be developed through peripheral data analyses, some of which were discussed earlier in this chapter.
For example, researchers can have greater confidence in the accuracy of their findings if they have established that their measures are reliable and have ruled out biases. Another recommended strategy is to conduct a power analysis: researchers can determine the actual power of their analyses, and thus the probability of having committed a Type II error.
It is especially useful to perform a power analysis when the results of statistical tests are not statistically significant. For example, suppose we were testing the effectiveness of an intervention to reduce patients' pain. A sample of 200 subjects (100 each in an experimental and a control group) is compared in terms of pain scores, using a t-test.
Suppose further that the mean pain score for the experimental group was 7.90 (standard deviation [SD] = 1.3), whereas the mean for the control group was 8.29 (SD = 1.3), indicating lower pain among experimental subjects. Although the results are in the hypothesized direction, the t-test was nonsignificant. We can provide a context for interpreting the accuracy of the nonsignificant results by performing a power analysis.
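The power analysis for this example can be approximated by hand; the sketch below uses a normal approximation to the two-sample t test, so the result is approximate rather than what a statistical package would print:

```python
import math
from statistics import NormalDist

# Pain example: 100 subjects per group, means 7.90 vs. 8.29, SD = 1.3.
d = (8.29 - 7.90) / 1.3      # Cohen's d, a "small" effect (0.30)
n = 100                      # subjects per group
alpha = 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
ncp = d * math.sqrt(n / 2)                     # expected test statistic
power = 1 - NormalDist().cdf(z_crit - ncp)     # approximate power

# Power of roughly .56 means about a 44% chance of a Type II error:
# the nonsignificant result may well reflect an underpowered test.
print(round(power, 2))  # -> 0.56
```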
Meaning of the Results
In qualitative studies, interpretation and analysis occur virtually simultaneously. In quantitative studies, however, results are in the form of test statistics and probability levels, to which researchers need to attach meaning. This sometimes involves supplementary analyses that were not originally planned.
For example,
if research findings are contrary to the hypotheses, other information in the
data set sometimes can be examined to help researchers understand what the
findings mean. In this section, we discuss the interpretation of various
research outcomes within a hypothesis testing context.
Interpreting Hypothesized Results
Interpreting results is easiest when hypotheses are supported. Such an interpretation has been partly accomplished beforehand because researchers have already brought together prior findings, a theoretical framework, and logical reasoning in developing the hypotheses. This groundwork forms the context within which more specific interpretations are made. Naturally, researchers are gratified when the results of many hours of effort support their predictions.
There is a decided preference on the part of individual researchers, advisers, and journal reviewers for studies whose hypotheses have been supported. This preference is understandable, but it is important not to let personal preferences interfere with the critical appraisal appropriate to all interpretive situations. A few caveats should be kept in mind.
First, it is best to be conservative in drawing conclusions from the data. It may be tempting to go beyond the data in developing explanations for what results mean, but this should be avoided. An example might help to explain what we mean by “going beyond” the data. Suppose we hypothesized that pregnant women's anxiety level about labor and delivery is correlated with the number of children they have already borne.
The data reveal a significant negative relationship between anxiety levels and parity (r = −.40). We conclude that increased experience with childbirth results in decreased anxiety. Is this conclusion supported by the data? The conclusion appears to be logical, but in fact, there is nothing in the data that leads directly to this interpretation. An important, indeed critical, research precept is: correlation does not prove causation.
The finding that two variables are related offers no evidence suggesting which of the two variables—if either—caused the other. In our example, perhaps causality runs in the opposite direction, that is, that a woman's anxiety level influences how many children she bears. Or perhaps a third variable not examined in the study, such as the woman's relationship with her husband, causes or influences both anxiety and number of children.
Alternative explanations for the findings should always be considered and, if possible, tested directly. If competing interpretations can be ruled out, so much the better, but every angle should be examined to see if one's own explanation has been given adequate competition. Empirical evidence supporting research hypotheses never constitutes proof of their veracity. Hypothesis testing is probabilistic.
There is always a
possibility that observed relationships resulted from chance. Researchers must
be tentative about their results and about interpretations of them. In summary,
even when the results are in line with expectations, researchers should draw
conclusions with restraint and should give due consideration to limitations
identified in assessing the accuracy of the results.
Interpreting Nonsignificant Results
Failure to reject a null hypothesis is problematic from an interpretative point of view. Statistical procedures are geared toward disconfirmation of the null hypothesis. Failure to reject a null hypothesis can occur for many reasons, and researchers do not know which one applies. The null hypothesis could actually be true, for example.
The nonsignificant result, in this case, accurately reflects the absence of a relationship among research variables. On the other hand, the null hypothesis could be false, in which case a Type II error has been committed. Retaining a false null hypothesis can result from such problems as poor internal validity, an anomalous sample, a weak statistical procedure, unreliable measures, or too small a sample.
Unless the researcher has special justification for attributing the nonsignificant findings to one of these factors, interpreting such results is tricky. We suspect that failure to reject null hypotheses is often a consequence of insufficient power, usually reflecting too small a sample size.
For this reason, conducting a power analysis can help researchers in interpreting nonsignificant results, as indicated earlier. In any event, researchers are never justified in interpreting a retained null hypothesis as proof of the absence of relationships among variables.
Nonsignificant results provide no evidence of the truth or the falsity of the hypothesis. Thus, if the research hypothesis is that there are no group differences or no relationships, traditional hypothesis testing procedures will not permit the required inferences. When significant results are not obtained, there may be a tendency to be overcritical of the research methods and undercritical of the theory or reasoning on which hypotheses were based.
This is understandable: It is easier to say, “My ideas were sound, I just didn't use the right approach,” than to admit to faulty reasoning. It is important to look for and identify flaws in the research methods, but it is equally important to search for theoretical shortcomings. The result of such endeavors should be recommendations for how the methods, the theory, or an experimental intervention could be improved.
Interpreting Unhypothesized Significant Results
Unhypothesized significant results can occur in two situations. The first involves finding relationships that were not considered while designing the study. For example, in examining correlations among variables in the data set, we might notice that two variables that were not central to our research questions were significantly correlated and interesting.
To interpret this finding, we would need to evaluate whether the relationship is real or spurious. There may be information in the data set that sheds light on this issue, but we might also need to consult the literature to determine if other investigators have observed similar relationships. The second situation is more perplexing: obtaining results opposite to those hypothesized.
For instance, we might hypothesize that individualized teaching about AIDS risks is more effective than group instruction, but the results might indicate that the group method was better. Or a positive relationship might be predicted between a nurse's age and level of job satisfaction, but a negative relationship might be found. It is, of course, unethical to alter a hypothesis after the results are "in."
Some researchers see such situations as awkward or embarrassing, but there is little basis for such feelings. The purpose of research is not to corroborate researchers' notions, but to arrive at truth and enhance understanding. There is no such thing as a study whose results “came out the wrong way,” if the “wrong way” is the truth. When significant findings are opposite to what was hypothesized, it is less likely that the methods are flawed than that the reasoning or theory is incorrect.
As always, the
interpretation of the findings should involve comparisons with other research,
a consideration of alternate theories, and a critical scrutiny of data
collection and analysis procedures. The result of such an examination should be
a tentative explanation for the unexpected findings, together with suggestions
for how such explanations could be tested in other research projects.
Interpreting mixed results
Interpretation is often complicated by mixed results: Some hypotheses are supported by the data, whereas others are not. Or a hypothesis may be accepted when one measure of the dependent variable is used but rejected with a different measure. When only some results run counter to a theoretical position or conceptual scheme, the research methods are the first aspect of the study deserving critical scrutiny.
Differences in the validity and reliability of the various measures may account
for such discrepancies, for example. On the other hand, mixed results may
suggest that a theory needs to be qualified, or that certain constructs within
the theory need to be re-conceptualized. Mixed results sometimes present
opportunities for making conceptual advances because efforts to make sense of disparate
pieces of evidence may lead to key breakthroughs.
Importance of the results
In quantitative studies, results that support the researcher's hypotheses are described as significant. A careful analysis of study results involves an evaluation of whether, in addition to being statistically significant, they are important. Attaining statistical significance does not necessarily mean that the results are meaningful to nurses and their clients.
Statistical significance indicates that the results were unlikely to be a function of chance. This means that observed group differences or relationships were probably real, but not necessarily important. With large samples, even modest relationships are statistically significant. For instance, with a sample of 500, a correlation coefficient of .10 is significant at the .05 level, but a relationship this weak may have little practical value.
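The claim about r = .10 with n = 500 can be checked directly; the sketch below uses a normal approximation for the p-value (with 498 degrees of freedom, the t distribution is essentially normal):

```python
import math
from statistics import NormalDist

# Test H0: rho = 0 for r = .10 with n = 500.
r, n = 0.10, 500
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
p = 2 * (1 - NormalDist().cdf(t))  # two-tailed p, normal approximation

# Statistically significant, yet r = .10 means the relationship
# explains only r**2 = 1% of the variance in the outcome.
print(round(t, 2), p < 0.05)  # -> 2.24 True
```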
Researchers must pay attention to the numerical values obtained in an analysis, in addition to significance levels, when assessing the importance of the findings. Conversely, the absence of statistically significant results does not mean that the results are unimportant, although, because of the difficulty in interpreting nonsignificant results, the case is more complex. Suppose we compared two alternative procedures for making a clinical assessment (eg, body temperature).
Suppose
further that we retained the null hypothesis, that is, found no statistically
significant differences between the two methods. If a power analysis revealed
an extremely low probability of a Type II error (eg, power = .99, a 1% risk of a
Type II error), we might be justified in concluding that the two procedures
yield equally accurate assessments. If one of these procedures is more
efficient or less painful than the other, nonsignificant findings could indeed
be clinically important.
Generalizability of the results
Researchers should also assess the generalizability of their results. Researchers are rarely interested in discovering relationships among variables for a specific group of people at a specific point in time. The aim of research is typically to reveal relationships for broad groups of people. If a new nursing intervention is found to be successful, others will want to adopt it.
Therefore, an important
interpretive question is whether the intervention will “work” or whether the
relationships will “hold” in other settings, with other people. Part of the
interpretive process involves asking the question, “To what groups,
environments, and conditions can the results of the study reasonably be
applied?”
Implications of the results
Once researchers have drawn conclusions about the credibility, meaning, importance, and generalizability of the results, they are in a good position to make recommendations for using and building on the study findings. They should consider the implications with respect to future research, theory development, and nursing practice. Study results are often used as a springboard for additional research, and researchers themselves often can readily recommend “next steps.”
Armed with an understanding of the study's limitations and strengths, researchers can pave the way for new studies that would avoid known pitfalls or capitalize on known strengths. Moreover, researchers are in a good position to assess how a new study might move a topic area forward. Is a replication needed, and, if so, with what groups? If observed relationships are significant, what do we need to know next for the information to be maximally useful?
For studies based on a theoretical or conceptual model, researchers should also consider the study's theoretical implications. Research results should be used to document support for the theory, suggest ways in which the theory ought to be modified, or discredit the theory as a useful approach for studying the topic under investigation.
Finally, researchers should carefully
consider the implications of the findings for nursing practice and nursing
education. How do the results contribute to a base of evidence to improve
nursing? Specific suggestions for implementing the results of the study in a
real nursing context are extremely valuable in the utilization process.
Conclusion
• Researchers who
collect quantitative data typically progress through a series of steps in the
analysis and interpretation of their data. The careful researcher lays out a
data analysis plan in advance to guide that progress.
• Quantitative data
must be converted to a form amenable to computer analysis through coding, which
typically transforms all research data into numbers. Special codes need to be
developed to code missing values.
• Researchers
typically document decisions about coding, variable naming, and variable
location in a codebook.
• Data entry is an
error-prone process that requires verification and cleaning. Cleaning involves
(1) a check for outliers (values that lie outside the normal range of values)
and wild codes (codes that are not legitimate), and (2) consistency checks.
• An important
early task in analyzing data involves taking steps to evaluate and address
missing data problems. These steps include deleting cases with missing values
(ie, listwise deletion), deleting variables with missing values, substitution
of mean values, estimation of missing values, and selective pairwise deletion
of cases. Researchers strive to achieve a rectangular matrix of data (valid
information on all variables for all cases), and these strategies help
researchers to attain this goal.
• Raw data entered
directly onto a computer file often need to be transformed for analysis.
Examples of data transformations include reversing of the coding of items,
combining individual variables to form composite scales, recoding the values of
a variable, altering data for the purpose of meeting statistical assumptions,
and creating dichotomous dummy variables for multivariate analyses.
• Before the main
analyses can proceed, researchers usually undertake additional steps to assess
data quality and to maximize the value of the data. These steps include
evaluating the reliability of measures, examining the distribution of values on
key variables for any anomalies, and analyzing the magnitude and direction of
any biases.
• Sometimes
peripheral analyses involve tests to determine whether pooling of subjects is
warranted, tests for cohort effects or ordering effects, and manipulation
checks.
• Once the data are
fully prepared for substantive analysis, researchers should develop a formal
analysis plan to reduce the temptation to go on a “fishing expedition.” One
approach is to develop table shells, that is, fully laid-out tables without any
numbers in them.
• The interpretation
of research findings typically involves five subtasks: (1) analyzing the
credibility of the results, (2) searching for underlying meaning, (3)
considering the importance of the results, (4) analyzing the generalizability
of the findings, and (5) assessing the implications of the study regarding
future research, theory development, and nursing practice.