Detecting AI Writing Is Difficult: A Faculty Guide to Evidence, Conversation, and Fair Process

Generative AI has changed how students draft, revise, brainstorm, summarize, and polish their work. It has also made academic integrity conversations more complicated. Faculty may notice writing that feels unusual, overly polished, generic, or inconsistent with a student’s prior work, but detecting AI-generated writing is difficult, and no AI detector should be treated as definitive proof of misconduct.

Why AI Detection Tools Should Be Used Cautiously

Tools such as Turnitin’s AI indicator, GPTZero, Copyleaks, Originality.ai, Winston AI, and others may provide a useful starting point for review, but they are not reliable enough to stand alone as evidence. A major study of AI-text detection tools, including commercial systems used in academic settings, concluded that available tools were “neither accurate nor reliable” and that obfuscation or paraphrasing significantly worsened performance. (arXiv) Other research has found that detectors may be biased against non-native English writers, increasing the risk that multilingual students or students with more formulaic academic prose could be unfairly flagged. (arXiv)

Turnitin’s AI detector is widely available and familiar to many instructors, but it should be treated as an indicator, not a verdict. Turnitin has reported low false-positive rates in some contexts, but news coverage and independent accounts continue to document concerns about false positives, false negatives, and bias, and to stress that the tool should prompt further inquiry rather than serve as the sole basis for academic action. (WIRED)
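
To see why even a low false-positive rate calls for caution, it can help to work through the arithmetic. The sketch below is illustrative only: the function name `expected_flags` and every number in it (class size, share of AI-assisted submissions, detector rates) are hypothetical assumptions, not figures from Turnitin or any study cited here.

```python
def expected_flags(num_students, ai_use_rate, false_positive_rate, true_positive_rate):
    """Estimate how many flagged submissions are correct vs. mistaken.

    All inputs are hypothetical; real rates vary by tool, course, and population.
    """
    ai_assisted = num_students * ai_use_rate
    own_work = num_students - ai_assisted

    true_flags = ai_assisted * true_positive_rate    # AI-assisted work that is flagged
    false_flags = own_work * false_positive_rate     # students' own work wrongly flagged

    total_flags = true_flags + false_flags
    share_false = false_flags / total_flags if total_flags else 0.0
    return true_flags, false_flags, share_false


# Hypothetical scenario, chosen only for illustration.
true_flags, false_flags, share_false = expected_flags(
    num_students=1000,
    ai_use_rate=0.10,
    false_positive_rate=0.01,
    true_positive_rate=0.80,
)
print(f"Correct flags: {true_flags:.0f}")
print(f"Mistaken flags: {false_flags:.0f}")
print(f"Share of flags that are mistaken: {share_false:.0%}")
```

Under these assumed numbers, roughly one flag in ten would point at a student who did their own work, which is why a flag is better read as a reason to look further than as a finding.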

Classic Clues Still Matter

When faculty are concerned that submitted work may not reflect a student’s own learning, the strongest starting point is often not an AI detector. It is the instructor’s professional judgment combined with multiple points of evidence.

Helpful clues may include:

  1. Comparison to previously submitted work
    Look for significant shifts in vocabulary, sentence structure, tone, citation habits, complexity, organization, or disciplinary understanding. A sudden change does not prove misconduct, but it may justify a conversation.

  2. Mismatch with the assignment prompt
    AI-generated work may be polished but vague, may avoid the specific requirements of the prompt, or may fail to engage course materials, class discussion, required readings, local examples, or discipline-specific expectations.

  3. Inaccurate or fabricated sources
    Generative AI can invent citations, misrepresent sources, or include references that appear scholarly but do not exist.

  4. Lack of process evidence
    Consider whether the student can provide outlines, notes, drafts, revision history, annotated sources, lab notes, data files, or other artifacts that show how the work developed.

  5. Inability to explain the work
    A student who completed the work should usually be able to explain the main argument, define key terms, discuss sources, describe their process, and answer questions about choices they made.

  6. Unusual formatting or phrasing patterns
    Watch for generic transitions, repetitive structure, unsupported claims, oddly balanced paragraphs, fabricated quotations, or language that sounds authoritative but does not reflect the student’s demonstrated level of understanding.

None of these clues is proof by itself. Together, however, they can help faculty decide whether a supportive inquiry is appropriate.

Common AI Detectors

The following tools are among the more commonly referenced AI-writing detectors. They are listed here as possible reference points, not as endorsed sources of proof.

Tool | Useful for | Caution
Turnitin AI Writing Indicator | Integrated academic workflow; highlights portions of text that may be AI-generated | Should not be used as the sole basis for an academic integrity referral
GPTZero | Education-focused AI-writing review; sentence-level feedback | May produce false positives and should be corroborated
Copyleaks AI Detector | Institutional and multilingual detection options | Vendor claims should be interpreted cautiously and tested locally
Originality.ai | Often used in publishing and web-content contexts | Not designed specifically around Chatham course context or student process
Winston AI | Document-based reports and readability-style outputs | Still subject to the same limits as other probabilistic detectors
Scribbr / QuillBot AI Detector | Quick informal checks | Should be used only as a low-stakes signal, not evidence

The “best” detector is not the one with the highest confidence score. The best use of any detector is limited, careful, and contextual: one data point among several.

Recommended Process Before Academic Integrity Referral

Before referring a case to the Academic Integrity Office, faculty should gather multiple points of reference and, whenever appropriate, speak with the student. A fair process might look like this:

  1. Review the assignment expectations
    Confirm whether your syllabus, assignment sheet, or course policy clearly explained what AI use was allowed, prohibited, or required to be disclosed.

  2. Compare the work to prior student submissions
    Look at earlier assignments, discussion posts, drafts, emails, quizzes, in-class writing, or other examples of the student’s writing and thinking.

  3. Verify citations and claims
    Check whether sources exist, whether they say what the paper claims, and whether quoted or paraphrased material is accurate.

  4. Review process evidence
    Ask for drafts, outlines, notes, revision history, research logs, or document history when available.

  5. Use AI detectors only as one reference point
    A Turnitin AI score or another detector result may raise a question, but it should not be the only point of reference.

  6. Meet with the student
    Approach the conversation as an inquiry, not an accusation. Ask the student to explain their process, sources, choices, and understanding of the work.

  7. Document what you reviewed
    Keep notes on the assignment policy, prior work comparison, detector results if used, source checks, student conversation, and any process materials reviewed.

  8. Refer only when the concern remains evidence-based
    If, after reviewing multiple points of evidence and speaking with the student, there is still a reasonable concern that the work violates course policy or the Honor Code, then referral to the Academic Integrity Office is appropriate.
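
For faculty who keep their documentation in a script or spreadsheet-style log, the steps above can be sketched as a small record. This is a hypothetical illustration, not an official tool or policy: the class `ReviewRecord`, its field names, and the threshold of two non-detector evidence types are all assumptions made for the example, and it simplifies the guidance by treating the student conversation as required.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReviewRecord:
    """Hypothetical log of what was reviewed before a possible referral."""
    policy_was_clear: bool = False      # step 1: AI expectations were stated
    met_with_student: bool = False      # step 6: inquiry conversation held
    evidence: Dict[str, str] = field(default_factory=dict)  # evidence type -> note

    def add(self, kind: str, note: str) -> None:
        """Record one type of evidence, e.g. 'prior_work' or 'detector'."""
        self.evidence[kind] = note

    def non_detector_evidence(self) -> List[str]:
        """Evidence types other than an AI detector result."""
        return [kind for kind in self.evidence if kind != "detector"]

    def ready_for_referral(self) -> bool:
        # Assumed rule: a detector result never counts on its own, and at
        # least two other evidence types must support the concern.
        return (
            self.policy_was_clear
            and self.met_with_student
            and len(self.non_detector_evidence()) >= 2
        )


record = ReviewRecord(policy_was_clear=True, met_with_student=True)
record.add("detector", "AI indicator raised a question")
print(record.ready_for_referral())  # False: detector result alone

record.add("citation_check", "Two cited sources could not be located")
record.add("prior_work", "Marked shift from earlier in-class writing")
print(record.ready_for_referral())  # True under the assumed rule
```

The design choice worth noting is that the detector entry is stored but excluded from the count, mirroring the guidance that a detector result may raise a question without being the only point of reference.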

Bottom Line

AI detectors can be helpful signals, but they are not proof. Because all AI-writing detectors are imperfect, faculty should avoid relying on any single tool, including Turnitin’s AI detector, to make academic integrity decisions. The strongest practice is to combine classic instructional evidence, prior student work, citation review, process documentation, detector results when appropriate, and a direct conversation with the student before making a referral.

Details

Article ID: 28783
Created
Tue 5/19/26 1:37 PM
Modified
Tue 5/19/26 2:04 PM