An Exploratory Study of Reliability of Ranking vs. Rating in Peer Assessment

Yang Song; Yifan Guo; Edward F. Gehringer

Abstract:

Fifty years of research has found great potential for peer assessment as a pedagogical approach. With peer assessment, students not only receive more copious assessments but also learn to become assessors. In recent decades, more educational peer assessments have been facilitated by online systems. These systems are designed differently to suit different class settings and student groups, but they fall into two basic categories: rating-based and ranking-based. Rating-based systems ask assessors to rate artifacts one by one, following a review rubric. Ranking-based systems ask assessors to review a set of artifacts and assign a rank to each. Although each category has several systems and a large number of users, there has been no comprehensive comparison of which design leads to higher reliability. In this paper, we design algorithms that evaluate assessors' reliability by comparing their ratings or rankings against the global ranks of the artifacts they reviewed. These algorithms are suitable for data from both rating-based and ranking-based peer assessment systems. The experiments used more than 15,000 peer assessments from multiple peer assessment systems. We found that assessors in ranking-based peer assessments are at least 10% more reliable than assessors in rating-based peer assessments. Further analysis showed that assessors in ranking-based assessments tend to assess the more differentiable artifacts correctly, whereas rating-based assessors show no such pattern.
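The abstract does not spell out the reliability algorithms, so the following is only a minimal sketch of one way an assessor's ratings or rankings could be compared against global ranks. It assumes a simple pairwise-agreement measure (the fraction of artifact pairs an assessor orders the same way as the global ranking); the function names `pairwise_agreement` and `ranks_to_scores`, the data layout, and the tie handling are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_agreement(assessor_scores, global_ranks):
    """Fraction of artifact pairs ordered the same way as the global ranking.

    assessor_scores: dict artifact -> score, where a higher score means better.
        Ratings can be passed directly; rankings go through ranks_to_scores().
    global_ranks: dict artifact -> global rank, where 1 is best.
    Pairs tied in either input are skipped (an assumption of this sketch).
    Returns None when no pair is comparable.
    """
    agree = 0
    comparable = 0
    for a, b in combinations(sorted(assessor_scores), 2):
        score_diff = assessor_scores[a] - assessor_scores[b]
        # Positive when artifact a is globally better (has the lower rank number).
        rank_diff = global_ranks[b] - global_ranks[a]
        if score_diff == 0 or rank_diff == 0:
            continue
        comparable += 1
        if (score_diff > 0) == (rank_diff > 0):
            agree += 1
    return agree / comparable if comparable else None


def ranks_to_scores(ranking):
    """Turn an assessor's ranking (1 = best) into scores where higher is better."""
    return {artifact: -rank for artifact, rank in ranking.items()}


# Hypothetical data: global ranks of four artifacts, one rating-based
# assessor and one ranking-based assessor who each reviewed three of them.
global_ranks = {"A": 1, "B": 2, "C": 3, "D": 4}
rating_assessor = {"A": 5, "B": 3, "C": 4}
ranking_assessor = {"A": 1, "C": 2, "D": 3}

rating_reliability = pairwise_agreement(rating_assessor, global_ranks)
ranking_reliability = pairwise_agreement(ranks_to_scores(ranking_assessor), global_ranks)
```

Because both kinds of input are reduced to pairwise orderings, the same measure applies to rating-based and ranking-based data, which is the property the abstract asks of its algorithms. How the global ranks themselves are obtained is left open here.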

Keywords: Peer assessment, peer rating, peer ranking, reliability.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1132399

