There are four general classes of reliability estimates, each of which estimates reliability in a different way. These are:
I. Inter-Rater or Inter-Observer Reliability
• Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
• Inter-rater reliability involves two or more judges or raters marking the same paper, and each rater's score is an independent estimate. A score is a more reliable and accurate measure if two or more raters agree on it; the extent to which the raters agree determines the level of reliability of the score. In inter-rater reliability, the correlation between the scores of the two judges or raters is calculated.
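As a minimal sketch, the correlation between two raters' scores can be computed as follows. Pearson's r is assumed as the correlation, and the rater labels and scores are hypothetical examples:

```python
# Inter-rater reliability: correlate two raters' scores on the same papers.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

rater_1 = [78, 85, 62, 90, 71]  # Rater 1's marks on five papers (hypothetical)
rater_2 = [75, 88, 60, 92, 70]  # Rater 2's marks on the same papers
print(round(pearson_r(rater_1, rater_2), 3))
```

A correlation close to 1 indicates that the two raters rank and score the papers very similarly, so a score agreed on by both can be treated as more reliable.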
Intra-rater Reliability
• Used to assess the consistency of a single rater's scores on the same work at different times.
• While inter-rater reliability involves two or more raters, intra-rater reliability is the consistency of grading by a single rater: scores on a test are rated by a single rater/judge at different times. When we grade tests at different times, we may become inconsistent in our grading for various reasons. Some papers graded during the day may get our full and careful attention, while others graded towards the end of the day may be quickly glossed over or marked with little attention. As such, intra-rater reliability determines the consistency of a single teacher's or rater's grading of the same papers at different times.
II. Stability (Test-Retest) Reliability
• Used to assess the consistency of a measure from one time to another.
• In test-retest reliability, the same test is re-administered to the same people. The scores obtained on the first administration of the test are correlated with the scores obtained on the second administration, and the correlation between the two sets of scores is expected to be high. However, test-retest reliability is somewhat difficult to assess in practice, as it is unlikely that students will take the same test twice; memorization and practice effects will also tamper with the correlation value.
III. Parallel-Forms Reliability
• Used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
• In this type of reliability, two similar tests are administered to the same sample of people with the same level of proficiency. Therefore, as in test-retest reliability, two scores are obtained and correlated. However, unlike test-retest, the parallel- or equivalent-forms reliability measure is protected from the influence of memorization, as the same questions are not asked in the second of the two tests.
IV. Split-Half Reliability
• In this type of reliability, a test is administered once to a group, is divided into two equal halves after the students have returned the test, and the halves are then correlated. As the means for determining reliability is internal, within one administration of the test, this method of computing reliability is considered an internal-consistency measure. The halves are often determined by the number assigned to each item, with one half consisting of the odd-numbered items and the other half of the even-numbered items.