Date of Award




Document Type


Degree Name

Doctor of Philosophy (PhD)


School of Criminal Justice

Content Description

1 online resource (xiii, 303 pages)

Dissertation/Thesis Chair

James R Acker

Committee Members

Graeme Newman, Barbara K. Schwartz, Hans Toch, Richard Wiebe


Keywords

inter-rater reliability, risk assessment, sex offender, Sex offenders, Recidivism

Subject Categories



This research is an experimental simulation that explores the interrater reliability of a multi-factor sex offender risk classification system, specifically the system outlined by the Massachusetts Sex Offender Registration Act (SORA). In Study 1, professionals holding the terminal degree in their field and qualified for appointment to the Sex Offender Registry Board (SORB) administered the Massachusetts Classification Worksheet to four sex offender cases. In Study 2, master's-level participants (professionals with master's degrees and graduate students) administered the Massachusetts Classification Worksheet to the same four cases. Participants assessed each offender's risk as low (Level 1), moderate (Level 2), or high (Level 3). The strength of interrater reliability among the participants was measured using intra-class correlation coefficients (ICCs). Two questionnaires were also administered to all participants to determine, first, the extent to which the 24 factors influenced the rater's classification decision and, second, the rater's perceived level of actual risk of reoffense for each offender. Study 1 results for the classification level seemed excellent (ICC = .8976) until examined in conjunction with the risk range base, which was poor (ICC = .35). These results fail to provide support for use of the worksheet with professionals. Study 2 results for classification level were only fair (ICC = .56) and also failed to provide support for use of the worksheet with "trainee" level staff. Both studies demonstrated that raters agreed with the SORA and the SORB that certain factors, such as repetitive and compulsive behavior and extravulnerable victims, influenced the classification decision. Although interrater reliability was not acceptable in most instances (ICCs were below .70), the results from Questionnaire 2 indicated that raters seemed to agree on what constituted low, moderate, and high risk.
These studies are first steps toward assessing the reliability of such instruments. Although their results do not allow an assessment of how the real systems themselves function, they do suggest that further research on reliability is warranted in an arena where little is known about the reliability and validity of such instruments.
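
The abstract does not state which form of the ICC was used. As a minimal sketch only, assuming the two-way random-effects, absolute-agreement, single-rater form commonly written ICC(2,1), the coefficient can be computed from a cases-by-raters table of classification levels as shown here. The function name `icc_2_1` and the example ratings are hypothetical and are not taken from the dissertation.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per case and one column per rater.
    This form is an assumption; the abstract does not name the ICC model.
    """
    n = len(ratings)  # number of cases being rated
    if n < 2:
        raise ValueError("need at least 2 cases")
    k = len(ratings[0])  # number of raters
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with at least 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Undefined (division by zero) if every rating in the table is identical.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical ratings: 4 cases (rows) scored Level 1-3 by 3 raters (columns).
example = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 2],
    [3, 2, 3],
]
print(round(icc_2_1(example), 3))
```

With only four cases and a three-level scale, as in the design described above, a coefficient of this kind is sensitive to a single disagreement, which is one reason to read the point estimates alongside the other measures reported.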

Included in

Criminology Commons