Types of Reliability


There are four general classes of reliability estimates, each of which estimates reliability in a different way. They are:

I.            Inter-Rater or Inter-Observer Reliability

•  Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
•  Inter-rater reliability involves two or more judges or raters marking the same paper, so that the scores on the test are independent estimates from these judges. A score is a more reliable and accurate measure if two or more raters agree on it; the extent to which the raters agree determines the level of reliability of the score. In inter-rater reliability, the correlation between the scores of the two judges or raters is calculated.
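The correlation between two raters' scores can be computed as a Pearson coefficient. A minimal sketch using only the standard library, with hypothetical scores (not from the text) given by two raters to the same ten papers:

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical marks from two raters on the same ten papers
rater_a = [78, 85, 62, 90, 71, 66, 88, 74, 80, 69]
rater_b = [75, 88, 60, 92, 70, 68, 85, 76, 78, 72]

print(round(pearson_r(rater_a, rater_b), 3))
```

A value close to 1 would indicate strong agreement between the two raters; values near 0 would indicate that their scores are essentially unrelated.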

Intra-rater Reliability

•  Used to assess the consistency of a single rater's scoring of the same responses at different times.

•  While inter-rater reliability involves two or more raters, intra-rater reliability is the consistency of grading by a single rater: the same papers are rated by a single rater/judge at different times. When we grade tests at different times, we may become inconsistent in our grading for various reasons. Some papers graded during the day may get our full and careful attention, while others graded towards the end of the day may be quickly glossed over or marked with little attention. As such, intra-rater reliability determines the consistency of a single teacher's or rater's grading of the same papers at different times.

II.            Stability (Test-Retest) Reliability

•  Used to assess the consistency of a measure from one time to another.
•  In test-retest reliability, the same test is re-administered to the same people. The scores obtained on the first administration of the test are correlated with the scores obtained on the second administration, and this correlation is expected to be high. However, test-retest reliability can be difficult to assess, as it is unlikely that students will take the same test twice; memorization and practice effects will also distort the correlation value.

III.            Parallel-Forms Reliability

•  Used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
•  In this type of reliability, two similar tests are administered to the same sample of people with the same level of proficiency. As in test-retest reliability, two scores are obtained and correlated. Unlike test-retest, however, the parallel- or equivalent-forms reliability measure is protected from the influence of memorization, as the same questions are not asked in the second of the two tests.

IV.            Split Half Reliability

•  In this method, a test is administered once to a group, divided into two equal halves after the students have returned it, and the two halves are then correlated. Because reliability is determined internally, within a single administration of the test, this method of computing reliability is considered an internal consistency measure. The halves are often determined by the number assigned to each item, with one half consisting of the odd-numbered items and the other half of the even-numbered items.
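The odd/even split described above can be sketched as follows. This is a minimal illustration with hypothetical right/wrong (1/0) item scores for six students; the half-test correlation is stepped up to a full-test estimate using the Spearman-Brown correction, a standard adjustment for split-half designs that the text does not name explicitly:

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

def split_half_reliability(item_scores):
    """Odd/even split-half reliability with Spearman-Brown correction.

    item_scores: one list per student, one score per item.
    """
    odd = [sum(student[::2]) for student in item_scores]    # items 1, 3, 5, ...
    even = [sum(student[1::2]) for student in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown: estimate the reliability of the full-length test
    # from the correlation between its two halves.
    return 2 * r_half / (1 + r_half)

# Hypothetical item scores (1 = correct, 0 = wrong) on an eight-item test
scores = [
    [1, 1, 1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
]
print(round(split_half_reliability(scores), 3))
```

The correction is needed because the half-test correlation understates the reliability of the full-length test: other things being equal, a longer test is more reliable than a shorter one.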

