SWYS - Survey Instrument
2001 Instrument Development, Reliability and Validity
A question sometimes asked about the SWYS survey is
"how valid and reliable is it?" In other words, how accurate
is the information that was obtained? There is no simple answer
to this question. In this chapter, we will try to clarify some of
the relevant issues, and speculate about the data's accuracy and
limitations.
Validity is usually defined by the question "Are we measuring
what we intended to measure?" In other words, how accurate
is the measure at assessing a given behavior or belief? Reliability
refers to the consistency or reproducibility of a measure. If a
measure is not reliable, it will not even agree with itself. For
example, if students are administered a measure that has a low reliability
on two consecutive days, it is likely that their responses would
not be the same. Reliability is a necessary, but not sufficient
precondition for validity.
One way to increase the reliability and validity of a measure
is to use a well-established measure that has demonstrated reliability
and validity. Whenever possible this was done in the SWYS survey.
Many of the measures in the survey are established measures that
have demonstrated fairly high reliability and validity. For instance,
the depression measure used is the short form of the Beck Depression
Inventory (1), one of the most widely used measures of depression.
Most of the drug and alcohol questions come from widely used national
survey instruments, as do the questions dealing with suicide.
It should also be noted that most of the measures developed specifically
for this survey have been examined for their reliability and validity.
Those survey items that did not measure up to this scrutiny were
either dropped or redesigned for this present survey.
Reliability and validity studies of self-report surveys have found
that teenagers are more likely than adults to lie. Of particular
concern is the self-reported age of first drug use. On average,
teenagers' reports of when they first tried a particular drug have
been found to vary within 2 years, and males are more likely to
vary their responses. The low reliability of self-reported drug use
has many implications for future research, the most important being
that many age "cut-offs" used for analysis purposes may not be as
accurate as once thought (2). For this reason, we have limited our
questions on initiation of substance use to alcohol and tobacco.
The reasons why teenagers lie are becoming clearer. Many
researchers now report that teenagers lie to appear more socially
acceptable: they report on surveys the level of involvement in a
particular activity that they perceive to be the norm (3,4).
Typically, teenagers inflate their answers by 10% when asked how
many times they have engaged in an activity. For this reason, we
have eliminated questions that dealt with how many times a week a
student has sexual intercourse.
How to get students to report honestly is still being debated. Some
researchers suggest impressing on students how important it is to
tell the truth, a point that was emphasized in the training for
survey administrators. Other suggestions include having reliability
checks within the surveys, controlling for social desirability as
much as possible, and stressing that results will be anonymous.
In order to detect a more sophisticated source of error, we compared
response patterns across related questions. Three such scales were
developed to check the reliability of responses to questions regarding
alcohol, tobacco and sexual intercourse. In all three cases, unambiguous
questions (e.g. Question 77: "Have you had sexual intercourse?")
were compared to more involved questions (e.g. Question 81: "If
you have had sexual intercourse, which best describes the use of
alcohol by yourself and/or your partner the first time you had sexual
intercourse"). Overall, our analysis indicated a high level
of consistency across the questions for these behaviors.
On the topic of sexual intercourse, 4692 participants stated they
had never had sexual intercourse (Question 77). A total of 4529
participants were consistent in their responses for all related
questions (78 through 81). This produced an inter-question reliability
above .96. For tobacco-related questions, 4131 participants stated
they had never smoked tobacco (Question 26) and 3815 participants
answered "no" to the related questions 41 and 42, giving
us a rating of above .94. Finally, 2791 participants answered they
had never used alcohol (Question 28) while 2918 participants stated
they do not drink (.95). Taken collectively, these measures suggest
the teens were motivated to remain consistent in their responses
throughout the survey.
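The report does not state how the inter-question reliability figures
were calculated. As a rough illustration only, the sketch below assumes
each figure is the ratio of the smaller to the larger of the two counts
given for a topic; the function name consistency_ratio is hypothetical.
Under that assumption the sexual intercourse and alcohol counts give
ratios in line with the reported .96 and .95, while the tobacco counts
give a ratio closer to .92, so the reported tobacco figure may rest on a
different calculation or denominator.

```python
def consistency_ratio(count_a: int, count_b: int) -> float:
    """Return the smaller count divided by the larger one (0.0 to 1.0).

    Assumption: this simple ratio stands in for the report's
    "inter-question reliability"; the report gives no formula.
    """
    if count_a <= 0 or count_b <= 0:
        raise ValueError("counts must be positive")
    return min(count_a, count_b) / max(count_a, count_b)


# Counts quoted in the paragraph above.
checks = {
    "sexual intercourse": (4692, 4529),  # Question 77 vs. Questions 78-81
    "tobacco": (4131, 3815),             # Question 26 vs. Questions 41-42
    "alcohol": (2791, 2918),             # Question 28 vs. "do not drink"
}

ratios = {topic: consistency_ratio(a, b) for topic, (a, b) in checks.items()}

for topic, ratio in ratios.items():
    print(f"{topic}: {ratio:.3f}")
```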
Despite all prudent efforts, with any self-report survey aimed
at teenagers, there is always the possibility that a small percentage
of those surveyed will not take the survey seriously. Fortunately,
most teenagers who do not take the survey seriously are not subtle
with their responses. They typically exaggerate their responses
to such an extent that their surveys are easy to spot and remove.
Another question often asked about surveys of this type is how
representative the findings are of students in general. One factor
to keep in mind is that the survey represents only the responses
of students who were in attendance on the day the survey was administered.
Studies have shown that students who are more frequently absent
or truant are also more likely to use illicit drugs, drink alcohol,
smoke, and engage in potentially problematic and dangerous activities
(5). As a result, the current findings are likely to be a slight
underestimate of the actual incidence of such problem behavior in
all youth who are currently enrolled in school. For drug use, Johnson
and O'Malley found that these behaviors were underestimated by 1.4%
to 2.7% (5).
It should also be noted that the numbers presented in this report
reflect only adolescents enrolled in school, not those who have
dropped out. There is some evidence to indicate that school dropouts
are somewhat more likely than those enrolled in school to be users
of illicit drugs and alcohol and to engage in other problematic
behaviors (6). Consequently, the numbers presented in this report
probably underestimate the actual incidences of alcohol and other
drug use for all teens in Southwest Wisconsin.
For a practical survey such as the present one, the issues of
reliability and validity are only a means to an end. The real question
is "How are the measure and the data it produces going to be
used?" If the objective is the diagnosis of a particular individual,
then the precision of the instrument is extremely important and
imprecision can be a problem. In contrast, if the objective is to
determine the prevalence of a particular behavior or behaviors for
a given population (our current interest) then greater imprecision
is usually tolerable. For instance, it will probably not matter
to school officials whether 25% or 30% of students are currently
engaged in using cocaine. We can assume that a 5% under- or overestimate
will make little difference and that such a high incidence of cocaine
use would be viewed as a major problem.
_________________
(1) Beck, A.T., & Beck, R.W. (1972). Screening depressed patients
in family practice: A rapid technique. Postgraduate Medicine, 52,
81-85.
(2) Johnson, T.P., & Mott, J.A. (2001). The reliability of self-reported
age of onset of tobacco, alcohol and illicit drug use. Addiction,
96, 1187-1198.
(3) Champion, D.J. (2001). Measuring delinquency. In The juvenile
justice system: Delinquency, processing and the law (3rd ed.) (pp.
40-88). Upper Saddle River, New Jersey: Prentice Hall.
(4) Rosenblatt, J.A., & Furlong, M.J. (1997). Assessing the reliability
and validity of student self-reports of campus violence. Journal
of Youth and Adolescence, 26, 187-202.
(5) Johnson, L., & O'Malley, P. (1985). In B. Rouse, N. Kozel,
& L. Richards (Eds.), Self-Report Methods of Estimating Drug
Use: Meeting Current Challenges to Validity. Rockville, MD: National
Institute on Drug Abuse.
(6) Clayton, R., & Voss, H. (1982). Technical Review on Drug
Abuse and Dropouts. National Institute on Drug Abuse. Washington,
D.C.: Superintendent of Documents, U.S. Government Printing Office.
Use of Scales in Reporting Data
A series of scales were developed in order to ascertain general
patterns for questions that are believed to measure a similar trait.
Aggregating scores in this manner helps "smooth" out some
of the noise in the system that can occur when individuals respond
to rather specific questions. For this study, we developed ten such
scales designed to measure dimensions such as parental rules, communication
with their teen, and monitoring; the teen's satisfaction with their
school and their involvement in the community; and a measure of
the teen's overall self-esteem. The range of each scale was then divided
into mutually exclusive and exhaustive quartiles for comparison.
In each case, the higher categories correspond to an increase in
the scale's dimension.
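The report does not say how the quartile boundaries were chosen. The
sketch below is one plausible reading, assuming the cut points come from
the observed distribution of scale scores; the function name
quartile_categories and the example scores are hypothetical.

```python
from statistics import quantiles


def quartile_categories(scores):
    """Assign each score a category from 1 (lowest) to 4 (highest).

    Assumption: cut points are the three quartile boundaries of the
    observed scores, so every score falls in exactly one category.
    """
    q1, q2, q3 = quantiles(scores, n=4)
    categories = []
    for score in scores:
        if score <= q1:
            categories.append(1)
        elif score <= q2:
            categories.append(2)
        elif score <= q3:
            categories.append(3)
        else:
            categories.append(4)
    return categories


# Made-up scale scores, for illustration only.
example_scores = [8, 12, 15, 17, 20, 22, 25, 30]
example_categories = quartile_categories(example_scores)
print(example_categories)
```

Higher category numbers correspond to higher scale scores, matching the
convention described above.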
Six separate scales were developed to measure teens' descriptions
of their relationships with their parents. For all scales, teenagers
who responded that a parent or parents were absent from their home
were excluded from that data set. The first scale was Parental Monitoring.
This scale summed the responses to Questions 123 through 130.
Higher scores corresponded to teens reporting that their parents had
greater awareness of the teen's ongoing behaviors.
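As a minimal sketch of the summation just described, the code below
assumes each response is stored as a number keyed by question number.
The numeric coding, the parent_absent flag, and the handling of
unanswered items are assumptions made for illustration; the report
specifies only the question range and the exclusion rule.

```python
MONITORING_ITEMS = range(123, 131)  # Questions 123 through 130


def parental_monitoring_score(responses, parent_absent):
    """Sum Questions 123-130, or return None if the teen is not scored."""
    if parent_absent:
        # Teens reporting an absent parent are excluded from the data set.
        return None
    values = [responses.get(question) for question in MONITORING_ITEMS]
    if any(value is None for value in values):
        # Assumption: a teen who skipped any item is left unscored.
        return None
    return sum(values)


# Hypothetical teen who answered 3 on every monitoring item.
example_responses = {question: 3 for question in MONITORING_ITEMS}
example_score = parental_monitoring_score(example_responses, parent_absent=False)
print(example_score)
```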
Four scales were developed to assess parental communication with
the teen. An overall measure of the teen's perception of their communication
with their mothers was determined by summing Questions 140 through
145. A sub-scale was also created to measure communication of sexual
themes by summing Questions 141 through 143. Scales were created
in a similar fashion to measure the teen's perception of their communication
level with the father both for overall themes (Questions 146 through
150) as well as sexually-related topics (Questions 147 through 149).
For all four scales, higher scores corresponded to higher levels
of communication.
The final parent-based scale measured teens' perception of their
parents' enforcement of house rules. This scale summarized the responses
to Questions 131 and 132, and higher scores corresponded to greater
consistency in house rules enforcement.
Three scales were developed to measure the teen's perception of
community involvement and satisfaction with the community and school.
The first scale measured the teen's total involvement in activities
outside school and was derived by summing Questions 110 through
120. Higher scores correspond to greater involvement in a variety
of activities. The second scale measured the teen's overall satisfaction
with their school and was computed by summing Questions 98 and 99
and subtracting from that total the sum of Questions 94 through
97. Higher scores correspond to greater satisfaction with the school.
The third scale measured the teen's satisfaction with their larger
community. Due to the nature of the questions, and in order to keep
the scoring consistent with the previous scales, this scale actually
measures the level of dissatisfaction with the community. It
was derived from the sum of Questions 102 through 107, and higher
scores reflect more negative attitudes.
The final scale was a general measure of the teen's self-esteem.
It was computed by summing Questions 59 and 62 and subtracting the
sum of Questions 58, 60, 61 and 63. Higher scores correspond to
more positive reports of self-worth and positive attitudes about
themselves.
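The school satisfaction and self-esteem scales share the same form: the
sum of one set of items minus the sum of another. The sketch below
illustrates that arithmetic. The function name difference_score and the
1-to-4 response coding in the example are assumptions; only the question
numbers come from the text.

```python
def difference_score(responses, added_items, subtracted_items):
    """Return sum(added items) minus sum(subtracted items)."""
    added = sum(responses[question] for question in added_items)
    subtracted = sum(responses[question] for question in subtracted_items)
    return added - subtracted


# Item sets as described in the text.
SELF_ESTEEM_ADDED = (59, 62)
SELF_ESTEEM_SUBTRACTED = (58, 60, 61, 63)
SCHOOL_ADDED = (98, 99)
SCHOOL_SUBTRACTED = (94, 95, 96, 97)

# Hypothetical teen, assuming each item is coded 1 to 4.
example = {58: 1, 59: 4, 60: 2, 61: 1, 62: 3, 63: 1}
self_esteem = difference_score(example, SELF_ESTEEM_ADDED, SELF_ESTEEM_SUBTRACTED)
print(self_esteem)  # (4 + 3) - (1 + 2 + 1 + 1) = 2
```

Because more items are subtracted than added, scores on these two scales
can be negative; that does not affect the ordering used to form quartiles.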
Generally, the scales were used to determine whether the issue they measured
made a difference in a particular behavior or attitude. For example,
it was determined that students whose parents scored high on the
parental monitoring scale were less likely to engage in smoking.
This type of information can be of value to parents open to learning
more parenting skills and tips.
Not every scale is used in this report. Occasionally quartiles
have been collapsed for easier presentation of the data or because
the number of students falling in a quartile was deemed too small
to give meaningful data.