ExecExam: A Tool to Facilitate Effective Executable Examinations in Python

Pallas-Athena Cain, Hemani Alaparthi, and Gregory Kapfhammer

May 15, 2025

What is an Executable Examination?

Goal: Assess a student’s ability to program with real tools

  • A student writes, modifies, and runs code to solve a real problem
  • Graded via automated tests that use Pytest tests and assertions
  • Unlike static examinations an executable examination assesses:
    • Programming logic
    • Debugging ability
    • Tool use (e.g., text editor, terminal, IDE, and Git)

🎯 Like a take-home project — but precise, consistent, and scalable!

Reference: Chris Bourke, Yael Erez, and Orit Hazzan. 2023. “Executable Exams: Taxonomy, Implementation and Prospects”. In Proceedings of 54th SIGCSE.

Problems with Computing Assessments

Why do we need better assessments?

  • Manual grading is slow and inconsistent
  • Students often don’t know why their code fails
  • Feedback is shallow or missing altogether
  • Limited assessment of effective tool use
  • Pytest not a good fit for assessment

🚫 Test assertion failure is not enough! ExecExam is a compelling alternative to either manual assessment or running only Pytest.

What is ExecExam?

Scalable, feedback-rich assessment tool built in Python

  • Runs Pytest tests on student code
  • Reports all test failures and context
  • Clearly explains why a test failed
  • Suggests how to fix tested function
  • Uses LLMs for enhanced feedback

Next Step: Explore ExecExam’s features and how teachers can integrate them into the assessments for their programming courses!

Understanding ExecExam’s Output

Key Features of ExecExam

Why use ExecExam for your next assessment?

  • 🧪 Configured Pytest runs for streamlined assessment
  • 💻 Runs on student laptop through assessment process
  • 📜 Provides contextualized, detailed test failure reports
  • ⚙️ Integrates with GitHub and GitHub Actions for CI/CD
  • 🧠 Features flexible, democratized LLM-powered debugging
  • 🔁 Offers actionable insights to instructors and students!
  • 🛠️ Open-source tool collaboratively developed on GitHub

Getting Started with ExecExam

How instructors can adopt automated assessments

  • Create a solution repository
    • Design scaffolded coding tasks
    • Write test cases using Pytest
    • Add ExecExam as a dependency
    • Use GatorGrader to run all checks
  • Using solution ablation to create a starter repository
  • GitHub Classroom distributes and receives examinations

Conclusions and Future Work

  • 📊 Analytics and Instructor Features
    • Store test outcomes and feedback over time
    • Visualize student debugging and improvement paths
    • Log LLM interactions to evaluate effectiveness
    • Hold out hidden test cases for instructor-only grading
  • 🧠 Adaptive Feedback Loops
    • Tailor feedback complexity to student performance
    • Allow students to rate different types of LLM feedback