Integrating AI Into Assessments: From Policy to Practice


Conventional assessments, such as essays and multiple-choice questions, have long been the cornerstone of evaluating student performance. However, the widespread availability of generative AI (genAI) tools necessitates rethinking these methods. With genAI tools readily accessible and rapidly improving, it is crucial to develop assessment approaches that maintain academic integrity while leveraging the benefits of AI to engage students and prepare them for the modern workforce (Yu, 2023).

This piece provides actionable strategies for incorporating AI into assessments, aligning the level of genAI use with learning goals while helping students develop AI literacy. These strategies aim to prepare students not only to use AI effectively but also to evaluate its limitations.

Scaffolding AI Integration

Educators must first determine their institution's policies regarding AI use in teaching and assessment. If AI integration is permitted, the next step is to examine course-level learning objectives. Where these objectives can be modified, consider adding outcomes that explicitly address AI literacy and ethical tool use in your discipline. If course objectives cannot be changed, instructors can examine module-specific objectives to identify opportunities for AI integration that support existing course outcomes. Educators should then determine what they want students to demonstrate in each assignment—such as foundational skills, problem-solving, creativity, or critical analysis—and consider how AI can meaningfully support these goals (Markauskaite et al., 2022). Aligning desired outcomes with the appropriate level of AI integration will help ensure that AI is used to enhance learning rather than detract from it (Su & Yang, 2023).

While redesigning assessments can be challenging and time-consuming (Cram et al., 2022), existing frameworks can help guide this process without necessitating a complete overhaul of current practices. The AI Assessment Scale (AIAS) provides a scaffolded approach to assessment design in the age of AI (Perkins et al., 2024). It is designed to be flexible and adaptable across disciplines, supporting students in understanding how these tools can be effectively and ethically used. The following categorizations can also be useful when auditing existing assignments to consider how they may need to be adjusted in light of AI tools. As laid out in the AIAS, AI can be integrated into existing or new assessments at varying levels:

  • No AI: Assessments are completed entirely without AI assistance, ensuring students rely solely on their knowledge and skills. This level is suitable for supervised or low-stakes formative assessments, as well as tasks requiring personal skills or knowledge demonstration.
  • AI-assisted idea generation and structuring: AI can be used for brainstorming, creating structures, and generating ideas, but AI content is not allowed in the final submission. This level is useful for idea development and research assistance.
  • AI-assisted editing: Students can use AI to improve the clarity or quality of their work, but they must provide their original work in an appendix. This approach can particularly help non-native speakers and those with language difficulties. In multimedia assessments, genAI tools might be allowed for editing images or videos, but not for creating them from scratch.
  • AI task completion, human evaluation: AI is used to complete certain elements of the task, with students providing discussion or commentary on the AI-generated content. This level encourages critical engagement with and evaluation of AI output.
  • Full AI: AI is used throughout the assessment, allowing for a collaborative approach and enhancing creativity. Students may use AI without specifying which content is AI-generated.

You may choose to specify different levels of AI integration across assignments or to adopt a consistent AI policy in your course. For instance, a liberal approach might encourage AI tools as learning aids and research assistants, allowing students to use AI for brainstorming, drafting, and editing while requiring disclosure of AI use and critical engagement with AI-generated content. A moderate approach could permit AI use for specific, instructor-approved tasks such as grammar checking or preliminary research, with students required to provide the prompts they used or complete chat logs. In this case, critical assignments and exams would typically be completed without AI assistance. Alternatively, a conservative approach might prohibit the use of AI tools entirely, treating any unauthorized use as academic dishonesty. Regardless of the chosen approach, it's essential to provide clear guidance on appropriate AI use for each assignment and to ensure that students understand the rationale behind the policy.

It can also be valuable to provide real-world examples where students cannot or should not use AI, even though AI might be capable of completing the task. For example, if preparing students to succeed at closed-book professional examinations, make sure that your assignments help them learn the necessary skills and knowledge without becoming reliant on AI. If teaching future therapists, consider the downsides of having students rely on AI tools to generate a clinical diagnosis. For instance, would they be able to interpret whether the diagnosis made sense or have the tools to explain the rationale to the client? What risks or ethical issues arise when using AI with client histories or protected health information?

In determining the appropriate level of AI integration, it can also be useful to reflect on your core values as an educator (De Gagne, 2023; Gamage et al., 2021). For instance, if you highly value developing students’ metacognition, consider asking students to reflect on their process for completing coursework, including how they did or did not use AI tools. If you aim to build cultural competence and global citizenship, consider incorporating discussions on biases in AI models and potential voices that are not represented in AI output. If your main goal is to develop creative, critical thinkers, consider how you can help students engage in original thinking and evaluation with or without AI involvement.

As you update your assessments, be sure that associated grading rubrics explicitly highlight the human competencies you are trying to develop. For example, this sample rubric for AI-enhanced work (National Institute on Artificial Intelligence in Society, 2024) emphasizes originality, authenticity, critical and discerning thinking, and integration of course concepts. It can also be a useful model for evaluating assignments that students are expected to complete without AI assistance, as it requires that the text reflect a writer's voice, distinguishable from AI-generated content.

To encourage critical thinking, consider using this critical reflection learning tool (Cubero, 2024), which asks students to self-assess their use of genAI in assignments using the substitution augmentation modification redefinition (SAMR) model (Hamilton et al., 2016; Puentedura, 2006). The SAMR model considers how technology impacts teaching and learning through four levels: substitution, augmentation, modification, and redefinition. This tool can supplement an instructor's current assessments without requiring a redesign of existing courses (McDermott & Anselmo, 2024).

Integrating AI Into Authentic Assessments

Authentic assessment tasks have become increasingly crucial in the age of genAI, offering valuable opportunities to motivate students and evaluate their real-world competencies (Matheis & John, 2024; Rudolph et al., 2023; Saher et al., 2022; Sotiriadou et al., 2020). As AI increasingly automates basic tasks, it is essential to design assignments that develop students' higher-order thinking skills, particularly focusing on the highest levels of Bloom's taxonomy, which challenge students to innovate and create with AI assistance rather than just recall information (Crawford et al., 2023; Kim et al., 2019; Thanh et al., 2023). For example, an authentic assessment in a business course could require students to develop a comprehensive business plan using AI tools for market research analysis, financial modeling, and competitive analysis, while critically evaluating and refining the AI-generated insights. This approach helps students understand both the capabilities and limitations of AI in real-world business contexts.

Interactive oral assessments (IOAs) represent an innovative approach to authentic assessment that aligns well with the demands of an AI-mediated world. IOAs involve unscripted conversations between an assessor and student framed around workplace scenarios, providing a more authentic evaluation of a student's ability to apply knowledge in real-world contexts (Sotiriadou et al., 2020). This method not only helps prevent academic misconduct but also aids in developing students' professional identity and communication skills. IOAs can incorporate customized AI personas to guide students through conversations, evaluating learners' critical thinking and ensuring appropriate understanding has been achieved. Students can then reflect on their interaction with the AI system, analyzing its strengths and limitations in supporting their learning.

We need to move beyond viewing AI tools as mere agents-to-write or agents-to-answer questions, and instead see them as agents-to-support experiential learning for authentic assessment (Salinas-Navarro et al., 2024). Allowing or even requiring students to use genAI at various stages of the assessment process enhances authenticity by mirroring real-world professional practices (Moorhouse et al., 2023).

Here are some specific examples of authentic assessments with AI integration opportunities:

  • AI-augmented projects where students use genAI as they would in their current or future jobs to solve problems (Holmes & Miao, 2023; Sullivan et al., 2023; Wu & Chang, 2023)
  • Role-playing exercises with AI systems that simulate industry scenarios, such as through game-based learning experiences, branching scenarios, or real-time conversations with AI avatars (Carlson, 2024)
  • Project-based learning where AI serves as a research assistant and creative collaborator (Boughattas et al., 2024; Ogunleye et al., 2024)
  • Partnering with real organizations to provide internships or service learning opportunities in which students work to address immediate problems leveraging genAI technologies, using challenge-based (Leijon et al., 2022; Gallagher & Savage, 2023), inquiry-based (Friesen & Scott, 2013; Pedaste et al., 2015), or problem-based learning frameworks
  • Portfolio assessments that demonstrate students' ability to effectively leverage AI tools while maintaining critical thinking and originality (Moorhouse et al., 2023)

When designing AI-enhanced authentic assessments, several key principles should guide the process. First, tasks should reflect realistic professional scenarios where AI is actually used in the field, ensuring students gain practical experience with relevant tools and workflows. Critical evaluation skills must be developed, enabling students to effectively assess AI outputs and make informed decisions about their application. The assessment should also build AI literacy by helping students understand both the capabilities and limitations of AI tools, along with their ethical implications. Finally, assessments should help students develop their professional identity by recognizing their unique value proposition in an AI-augmented workplace, and understanding how their human skills complement and extend beyond AI capabilities.

Conclusion

By thoughtfully integrating genAI tools into assessments, educators can better prepare students to navigate the complexities of AI-enhanced workplaces with critical thinking, creativity, and ethical judgment. As AI technology continues to advance, ongoing reflection and adaptation of assessment practices will be crucial to ensure their effectiveness and relevance.

References

Boughattas, N., Neji, W., & Ziadi, F. (2024). Project based assessment in the era of generative AI: Challenges and opportunities. Proceedings of the 20th International CDIO Conference.

Carlson, C. (2024, November). Social-emotional, AI-powered avatar simulations: Improving communication and building empathy for all! [PowerPoint slides]. University of Colorado Anschutz Medical Campus.

Cram, A., Harris, L., Raduescu, C., Huber, E., Zeivots, S., Brodzeli, A., Wright, S., & White, A. (2022). Online assessment in Australian University Business Schools: A snapshot of usage and challenges. ASCILITE Publications.

Crawford, J., Cowling, M., & Allen, K.-A. (2023). Leadership is needed for ethical ChatGPT: Character, assessment, and learning using artificial intelligence (AI). Journal of University Teaching and Learning Practice, 20(3).

Cubero, V. (2024). Can I use AI on this assignment? Generative AI acceptable use scale. Canva.

De Gagne, J. C. (2023). Values clarification exercises to prepare nursing students for artificial intelligence integration. International Journal of Environmental Research and Public Health, 20(14), Article 6409.

Friesen, S., & Scott, D. (2013). Inquiry-based learning: A review of the research literature. Alberta Ministry of Education.

Gallagher, S. E., & Savage, T. (2023). Challenge-based learning in higher education: An exploratory literature review. Teaching in Higher Education, 28(6), 1135–1157.

Gamage, K. A. A., Dehideniya, D. M. S. C. P. K., & Ekanayake, S. Y. (2021). The role of personal values in learning approaches and student achievements. Behavioral Sciences, 11(7), Article 102.

Hamilton, E. R., Rosenberg, J. M., & Akcaoglu, M. (2016). The substitution augmentation modification redefinition (SAMR) model: A critical review and suggestions for its use. TechTrends, 60, 433–441.

Holmes, W., & Miao, F. (2023). Guidance for generative AI in education and research. UNESCO.

Kim, S., Raza, M., & Seidman, E. (2019). Improving 21st-century teaching skills: The key to effective 21st-century learners. Research in Comparative and International Education, 14(1), 99–117.

Leijon, M., Gudmundsson, P., Staaf, P., & Christersson, C. (2022). Challenge based learning in higher education: A systematic literature review. Innovations in Education and Teaching International, 59(5), 609–618.

Markauskaite, L., Marrone, R., Poquet, O., Knight, S., Martinez-Maldonado, R., Howard, S., Tondeur, J., De Laat, M., Buckingham Shum, S., Gašević, D., & Siemens, G. (2022). Rethinking the entwinement between artificial intelligence and human learning: What capabilities do learners need for a world with AI? Computers and Education: Artificial Intelligence, 3, Article 100056.

Matheis, P., & John, J. J. (2024). Reframing assessments: Designing authentic assessments in the age of generative AI. In S. Mahmud (Ed.), Academic Integrity in the Age of Artificial Intelligence (pp. 139–161). IGI Global Scientific Publishing.

McDermott, B., & Anselmo, L. (2024, May 21). Academic integrity reflection tool: Responsible GenAI use for students in assessments. Taylor Institute for Teaching and Learning, University of Calgary.

Moorhouse, B. L., Yeo, M. A., & Wan, Y. (2023). Generative AI tools and assessment: Guidelines of the world's top-ranking universities. Computers and Education Open, 5, Article 100151.

National Institute on Artificial Intelligence in Society. (2024, May 16). Sample cheating-resistant evaluation rubric. California State University, Sacramento.

Ogunleye, B., Zakariyyah, K. I., Ajao, O., Olayinka, O., & Sharma, H. (2024). Higher education assessment practice in the era of generative AI tools. Journal of Applied Learning & Teaching, 7(1), 46–56.

Pedaste, M., Mäeots, M., Siiman, L. A., de Jong, T., van Riesen, S. A. N., Kamp, E. T., Manoli, C. C., Zacharia, Z. C., & Tsourlidaki, E. (2015). Phases of inquiry-based learning: Definitions and the inquiry cycle. Educational Research Review, 14, 47–61.

Perkins, M., Furze, L., Roe, J., & MacVaugh, J. (2024). The Artificial Intelligence Assessment Scale (AIAS): A framework for ethical integration of generative AI in educational assessment. Journal of University Teaching and Learning Practice, 21(6).

Puentedura, R. R. (2006). Transformation, technology, and education [Presentation]. Strengthening Your District Through Technology, Maine, United States.

Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning & Teaching, 6(1), 342–363.

Saher, A.-S., Ali, A. M. J., Amani, D., & Najwan, F. (2022). Traditional versus authentic assessments in higher education. Pegem Journal of Education and Instruction, 12(1), 283–291.

Salinas-Navarro, D. E., Vilalta-Perdomo, E., Michel-Villarreal, R., & Montesinos, L. (2024). Using generative artificial intelligence tools to explain and enhance experiential learning for authentic assessment. Education Sciences, 14(1), Article 83.

Sotiriadou, P., Logan, D., Daly, A., & Guest, R. (2020). The role of authentic assessment to preserve academic integrity and promote skill development and employability. Studies in Higher Education, 45(11), 2132–2148.

Su, J., & Yang, W. (2023). Unlocking the power of ChatGPT: A framework for applying generative AI in education. ECNU Review of Education, 6(3), 355–366.

Sullivan, M., Kelly, A., & McLaughlan, P. (2023). ChatGPT in higher education: Considerations for academic integrity and student learning. Journal of Applied Learning & Teaching, 6(1), 31–40.

Thanh, B. N., Vo, D. T. H., Nhat, M. N., Pham, T. T. T., Trung, H. T., & Xuan, S. H. (2023). Race with the machines: Assessing the capability of generative AI in solving authentic assessments. Australasian Journal of Educational Technology, 39(5), 59–81.

Wu, T., & Chang, M. (2023). Application of generative artificial intelligence to assessment and curriculum design for project-based learning. 2023 International Conference on Engineering and Emerging Technologies (ICEET).

Yu, H. (2023). Reflection on whether Chat GPT should be banned by academia from the perspective of education and teaching. Frontiers in Psychology, 14, Article 1181712.