
The working teacher's guide to IB English Paper 2 marking criteria 2026

By Steven Swanson, Founder of ClassLens

If you teach IB English A and you are reading this in May, you are probably sitting with a stack of mocks, a printed copy of the new criteria, and a quiet sense that the old internal calibration you built over five years no longer maps cleanly onto what the IB wants you to score. That is the right read.

If you are teaching at a school in one of the affected Middle East countries (Bahrain, Iran, Iraq, Israel, Jordan, Kuwait, Lebanon, Oman, Palestine, Qatar, Saudi Arabia, or the UAE), your May 2026 looks different from the rest of the cohort. The IB announced on 30 March 2026 that the May 2026 examination session was cancelled in the UAE, and subsequently in Bahrain, with the Non-Exam Contingency Measure (NECM) and other flexibilities (transfer, defer, withdraw with refund) available across the broader affected region on a country-by-country basis as national authorities decide. Where NECM applies, your students' grades are being determined under externally assessed coursework and teacher-predicted grades rather than the timed paper, and your marking workload has shifted accordingly. The criteria changes still affect what "good" looks like in the work you are reading; the timing and stakes are simply different. The rest of this post applies regardless of which path your school is on this cycle.

The May 2026 session is the first to be assessed under the revised Paper 2 criteria. The change is not cosmetic. Criterion A has been reduced from 10 marks to 5. The old single Criterion B for analysis and evaluation has been split into two sub-criteria, B1 (analysis of the individual works) and B2 (comparative analysis across the two works). Paper 2 is now scored on five criteria of 5 marks each (A, B1, B2, C, D) for a total of 25 marks, per the IBO's Summary of Changes for Teachers, September 2024. The same Paper 2 criteria apply identically to English A: Literature and English A: Language and Literature.
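The arithmetic behind that restructuring can be sketched in a few lines. This is only an illustration of the mark allocations described above; the old Criterion B figure of 10 marks and the old C and D figures of 5 each are assumptions drawn from the previous rubric, and the names `OLD_MARKS`, `NEW_MARKS`, and `share` are made up for the example.

```python
# Mark allocations as described in this post.
# Assumption: the old rubric was A=10, B=10, C=5, D=5.
OLD_MARKS = {"A": 10, "B": 10, "C": 5, "D": 5}
NEW_MARKS = {"A": 5, "B1": 5, "B2": 5, "C": 5, "D": 5}


def share(marks: dict[str, int], criterion: str) -> float:
    """Fraction of the paper total carried by one criterion."""
    return marks[criterion] / sum(marks.values())


old_total = sum(OLD_MARKS.values())
new_total = sum(NEW_MARKS.values())

print(f"Old total: {old_total}, new total: {new_total}")
print(f"Criterion A share: {share(OLD_MARKS, 'A'):.0%} -> {share(NEW_MARKS, 'A'):.0%}")
print(f"B2 (comparative) share: {share(NEW_MARKS, 'B2'):.0%}")
```

Under those assumptions, Criterion A drops from a third of the paper to a fifth, and the comparative line on its own now carries as much as Criterion A does.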

The practical effect is that you can no longer absorb a soft Criterion A score with strong analysis somewhere else. The interpretation work carries half the marks it used to, and the comparative work has been pulled out of the analysis bucket and made its own scored line. If you are marking the way you marked last year, the tighter band widths and the separate B2 line will likely cost your students marks that the old criteria would have let them keep, sometimes a full band on a single criterion.

This post is the criterion-by-criterion grading guide I wish someone had written for me before I started my own May mocks. It is not a vendor pitch. There is one section near the end where a grading-assistant tool is described, and one section after that on its limits. You can skip both and the rest of the post still earns its keep.

What changed and why

The IB published the revised Language A: Language and Literature subject guide for first teaching September 2024 and first assessment May 2026. The Paper 2 changes are part of a broader push to simplify and standardize the criterion bands across the studies-in-language-and-literature family, and to make the comparative requirement of Paper 2 visible in the rubric instead of buried inside Criterion B.

Three substantive changes are worth naming explicitly.

Criterion A halved. Knowledge, understanding and interpretation is now worth 5 marks, down from 10. The descriptors are tighter. A response that previously earned 7 or 8 out of 10 for solid contextual knowledge plus reasonable interpretation is now competing against responses where the interpretive work is sharper, because the band width is half what it was and the increments matter more. The new Band 5 descriptor for Criterion A raises the bar from thorough demonstration of knowledge to language emphasizing perceptive understanding and persuasive interpretation, which is a meaningfully higher target.

Comparative analysis broken out as B2. Under the old criteria, comparative work was a bullet inside the Analysis and Evaluation band. Markers handled it inconsistently. A response that analyzed both works deeply but never put them in conversation could still attract a 7 or 8 out of 10. That is no longer possible. B2 is its own scored line, and the comparative thesis, the comparative evidence, and the comparative reasoning each have to be visible.

B1 retained for individual-work analysis. This is the criterion most experienced markers will calibrate fastest, because it is closest to the old Criterion B. The trap is treating B1 as a 10-mark criterion in your head and double-counting comparative moves under it. Comparative reasoning belongs under B2 now, even when it is interleaved with single-work analysis in the prose.

Criterion C (Focus and organization) and Criterion D (Language) remain at 5 marks each, with the usual editorial tightening to the descriptors.

The IB's stated rationale, in the September 2024 Summary, is research-driven streamlining following Paper 2 sittings under the previous criteria, which were first assessed in May 2021. Pulling comparative analysis out of the old Criterion B and scoring it as B2 forces the comparative thinking to be load-bearing in the rubric, not decorative.

You have probably read three vendor blog posts this month claiming AI will save you ten hours a week. This is not that post. The change to Paper 2 is, in the first instance, a change to your own marking calibration. No tool fixes that. The point of a guide like this one is to walk through what the new bands actually mean for the work you do with your green pen.

What the change means for your marking workflow

The honest answer is that your average mark per essay will go down, your spread will widen, and your time per essay will go up the first time through. The drop in marks is not because your students got worse. It is because the band widths are tighter and the comparative line is now scored separately, so weaknesses that used to be absorbed are now visible. The extra time per essay resolves itself by the third or fourth batch, once you have built a new internal anchor for what a band 4 versus a band 5 looks like under the new descriptors.

Three concrete shifts in how you read.

Read Criterion A in one pass, not as a running tally. When Criterion A was worth 10, it made sense to underline knowledge claims throughout the essay and add them up at the end. With 5 marks, the band is decided in the first read of the introduction and the first body paragraph. If the student establishes a perceptive interpretive frame in the opening 200 words, you are looking at a 4 or a 5. If the interpretation is competent but not perceptive, you are looking at a 3. If the interpretation is implicit and you have to reconstruct it, you are at a 2. Spending fifteen minutes reading carefully for Criterion A is now a poor use of your time.

Look for the comparative thesis on your first read. Annotate it in the margin in green ink. If you cannot find a comparative thesis in the first 30 seconds of reading the introduction, the essay is already in the lower B2 band, regardless of how strong the rest is. The new B2 descriptors reward students who frame the comparison upfront and sustain it. They penalize students who introduce two works in parallel and only compare them in the conclusion. A response that "exceeds" under B2 commits to the comparison in its thesis. Take a pairing in active teacher circulation, for example Atwood's The Handmaid's Tale against Ibsen's A Doll's House. A workable thesis on this canonical pairing, drawn from teacher-facing comparative resources, frames it as: "Both writers expose women's subjugation under patriarchal systems through the confined domestic space, but Ibsen frames the response as individual rebellion against bourgeois marriage, while Atwood extends the same logic into systemic dystopian control where rebellion becomes collective and incomplete." That sets up a comparison that has to be sustained across every body paragraph (the ABABAB structure most IB tutors recommend), rather than relegated to the conclusion. A response that "approaches" under B2 mentions the second work in every paragraph but never returns to a unifying comparative argument. (Teacher-facing examples of the Atwood/Ibsen pairing as a circulating Paper 2 thesis are at IB English Tutor HK and RevisionDojo.)

Stop scoring B1 and B2 in the same pass. Pick one. I read through for B1 first (analysis of each work as itself), then do a second skim for B2 (the comparative line specifically). It feels slower but it is faster, because you stop second-guessing whether a moment of analysis "counts" comparatively. Marking with two passes through one essay is faster than marking with one indecisive pass.

The other practical implication is that your students need new rubric language internalized. The new Criterion A descriptors emphasize perceptive interpretation, and B2 rewards comparative analysis sustained throughout the essay rather than parked in the conclusion. Students who are still writing to the old descriptors will hedge their interpretations and stack their comparisons at the end. Spend one lesson before the next mock walking through the new rubric language verbatim with them. The vocabulary shift in the rubric is the vocabulary shift you want in their drafts.

A note on time per essay. Expect a noticeable bump on the first batch through, settling back as the new anchor takes hold by the third or fourth batch. The drag is real, but it is front-loaded. The first stack will feel slow because every band decision is being made fresh; by the second weekend of marking, the calibration is back. Recognizing the time cost ahead of the cycle is half the management.

Three habits that hold up under the new criteria

These are the three habits I would commit to before the next mock cycle. None of them require a tool. All three are teacher-to-teacher recommendations, not consultancy.

Annotate the comparison thesis in green ink at first read. This is the single highest-leverage habit because it forces your B2 score early and lets the rest of the read be confirmatory rather than exploratory. If the comparative thesis is fuzzy on first read, leave a margin note ("comparison?") and read on. By the end you will know whether the fuzziness was the introduction's failure or the whole essay's failure, and the B2 score is unambiguous either way.

Build a band-anchor sheet before you start a stack. Pick three essays from your stack, ideally ones you suspect will fall at band 2, band 3, and band 5. Mark those three first, slowly. Write a one-sentence note for each criterion explaining why that essay is at that band. Tape the sheet to your desk and mark the rest of the stack against it. Calibration drift is the largest source of unfairness across a stack of 100, and an anchor sheet is the cheapest fix. Doubly true under the new criteria, because your old anchors are calibrated to the old bands.
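If you would rather print the anchor sheet than rule it by hand, the layout is simple enough to generate. The following is a minimal sketch, assuming three anchor essays and the five criteria named above; the essay labels, the function names, and the sample note are placeholders made up for illustration, not real student work.

```python
CRITERIA = ["A", "B1", "B2", "C", "D"]


def blank_anchor_sheet(essay_bands: dict[str, int]) -> dict[str, dict[str, str]]:
    """Return one empty note slot per criterion for each anchor essay."""
    return {
        f"{essay} (expected band {band})": {c: "" for c in CRITERIA}
        for essay, band in essay_bands.items()
    }


def render(sheet: dict[str, dict[str, str]]) -> str:
    """Format the sheet as plain text, with blanks where no note exists yet."""
    lines = []
    for essay, notes in sheet.items():
        lines.append(essay)
        for criterion, note in notes.items():
            lines.append(f"  {criterion}: {note or '________'}")
    return "\n".join(lines)


# Hypothetical essay labels and expected bands.
sheet = blank_anchor_sheet({"Essay 14": 2, "Essay 03": 3, "Essay 27": 5})

# Placeholder one-sentence justification, filled in after the slow first read.
sheet["Essay 14 (expected band 2)"]["B2"] = "Second work appears only in the conclusion."

print(render(sheet))
```

The point of the structure is that every anchor essay gets a note on every criterion, so a blank slot is a visible reminder that the anchor is not finished.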

Separate the marking pass from the comments pass. Under the new criteria, the marking decision happens fast. The teaching decision (what comments to leave for the student) happens slowly. If you are finding the marking exhausting under the new criteria, try splitting them. Read once for marks. Read again, more selectively, for the two or three margin comments that will actually change what the student does next time. Most of us were trained to do both at once. Under tighter bands and split criteria, the cognitive load of doing both at once is what makes the marking feel exhausting. (Some experienced markers do prefer to mark and comment in a single pass; if that is you, the rest of this guide still applies.)

A fourth habit, optional and only for teachers marking very high volumes (120 plus essays per cycle): take a 90-second break between every essay, and a longer break after every batch of 10. Under the old wider bands, fatigue cost you accuracy you could absorb. Under the tighter 5-mark bands, the same fatigue costs you a full band.

Where ClassLens fits

ClassLens is an AI-assistive grading and teaching tool that drafts grades and feedback against a rubric the teacher configures. For a Paper 2 marking workflow, it slots in after the teacher has already hand-marked the first three or four essays in a stack to build a calibration anchor. Once the anchor is set, the rest of the stack can be uploaded. ClassLens drafts a band score for each criterion and a short justification per submission. The teacher reviews each draft in the Batch Review Dashboard, edits anything they disagree with, and clicks "Return Checked" to release the reviewed grades to students in a single batch. Auto-return was permanently removed from the product on 2026-04-12. There is no mode in which an AI-generated grade reaches a student without teacher review.

For Paper 2 specifically, the teacher pastes the new criteria and their own interpretation of the bands into the rubric configuration, picks a strictness level, and processes a batch. The Batch Review Dashboard also produces a knowledge gap report at the class level, showing which criterion bands are strongest and weakest. Under the new criteria, B2 typically surfaces as the lowest-scoring criterion in the first cycle, which can drive the next mini-lesson on comparative thesis construction. The AI's draft is most often overridden on Criterion A and Criterion B2, where the interpretive frame and the comparative line are the parts a teacher's judgment carries.
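The idea behind a class-level view is easy to reproduce by hand from your own mark sheet. The sketch below is not ClassLens's implementation, which is not described here; it simply averages marks per criterion and names the lowest one. The marks are invented for illustration, and `criterion_means` and `weakest_criterion` are hypothetical helper names.

```python
from statistics import mean

CRITERIA = ["A", "B1", "B2", "C", "D"]


def criterion_means(class_marks: list[dict[str, int]]) -> dict[str, float]:
    """Average mark per criterion across a set of marked essays."""
    return {c: mean(essay[c] for essay in class_marks) for c in CRITERIA}


def weakest_criterion(means: dict[str, float]) -> str:
    """Criterion with the lowest class average."""
    return min(means, key=means.get)


# Invented marks for three essays, each criterion out of 5.
class_marks = [
    {"A": 4, "B1": 4, "B2": 2, "C": 3, "D": 4},
    {"A": 3, "B1": 3, "B2": 2, "C": 3, "D": 3},
    {"A": 5, "B1": 4, "B2": 3, "C": 4, "D": 4},
]

means = criterion_means(class_marks)
for criterion, value in means.items():
    print(f"{criterion}: {value:.2f}")
print("Weakest criterion:", weakest_criterion(means))
```

A plain average hides who is pulling it down, which is why the class-level number is a prompt for your own judgment rather than a diagnosis.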

On data posture, AI inference runs on Google Cloud Vertex AI under the Google Cloud Data Processing Addendum, with Zero Data Retention enrolled, no model training on submissions, and transient processing of submissions. ClassLens is Google OAuth verified including the restricted Drive scope, CASA Tier 2 complete, SOC 2 Type I attested by Percilchofe CPA LLC (License No. 1188), and a CISA Secure by Design Pledge signatory. International school IT directors will recognize this as the Google Cloud Workspace stack with the standard no-training, no-retention posture. The free tier is 100 submissions per month, enough for a single section's Paper 2 mock cycle. Paid tiers are Pro at $10 per month (1,000 submissions) and Max at $20 per month (5,000 submissions). Sign in at classlens.com or install via the Google Workspace Marketplace.

Disclosure: I am the founder of ClassLens. I am also a working high school teacher, though not at an IB school. Take the prior paragraphs as a tool description, not a peer recommendation. The IB DP English context is not my classroom. If the description sounds useful, test the workflow against your own marked essays before trusting it on a live cycle.

What ClassLens won't do

The honest list. ClassLens drafts grades. It does not replace your judgment on Paper 2, and there are specific places where its draft is reliably worse than yours.

It does not catch the comparative essay that is technically present but argumentatively hollow. AI is reliable on whether two works are mentioned in each paragraph. It is not yet reliable on whether the comparison is doing real interpretive work or just performing comparison. B2 is the criterion where you should expect to override most often.

It does not handle pedagogical context. If a student in your stack has been working all year on getting from band 2 to band 3 in Criterion C, the AI does not know that, and its margin comments will not reflect it. Your handwritten "this is the structural improvement we have been working on, well done" is doing work the AI cannot do.

It does not grade oral commentary. Individual Oral assessment is a conversation, not a text. ClassLens reads text. The IO sits outside what the tool is good for, and I would not try to use it there.

It does not understand your specific class. The knowledge gap report will tell you the class is weak on B2, but it will not tell you that the weakness is because three of your strongest writers were absent for the comparative-thesis lesson and that a one-on-one with them will move the class average more than a whole-class reteach. That is your job, and the part of teaching that does not get automated.

It does not handle high-stakes grading well, and I would not use it for the actual May submissions. ClassLens is for mocks, drafts, formative assignments, and the high-volume rubric work that does not require your final professional judgment. Anything that goes to the IB or onto a transcript should be marked by you, with the AI draft as a second-opinion check at most.

The thing it does well is the mechanical, high-volume part of marking. The thing it does not do is the interpretive, contextual, or relational part. Teachers who try to use it for the second part get burned, and rightly.

Closing

If you are still calibrating, the highest-leverage thing you can do this week is the band-anchor sheet exercise. Pick three essays from a recent batch, mark them slowly against the new criteria, and write a one-sentence justification for each criterion at each band. That sheet will save you more time over the next month than any tool will.

If you have a stack of past Paper 2 mocks in a Google Drive folder and you want to see what an AI-drafted grade looks like under the new criteria before trusting it on a live cycle, that is a clean way to test the workflow. Sign in with any Google account at classlens.com, paste your interpretation of the new bands into the rubric configuration, and grade 100 submissions free. The places where you disagree with the AI are the places where the AI is wrong, and also the places where your professional judgment is most valuable.
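One way to make that test concrete is to count, per criterion, how often your own mark differs from the drafted one. The sketch below assumes you have both sets of marks keyed by criterion for the same essays in the same order; the data is invented, and `override_rates` is a hypothetical helper, not a ClassLens feature.

```python
from collections import Counter

CRITERIA = ["A", "B1", "B2", "C", "D"]


def override_rates(
    teacher: list[dict[str, int]], draft: list[dict[str, int]]
) -> dict[str, float]:
    """Fraction of essays, per criterion, where the teacher's mark differs from the draft."""
    if len(teacher) != len(draft):
        raise ValueError("teacher and draft lists must cover the same essays")
    if not teacher:
        return {c: 0.0 for c in CRITERIA}
    overrides = Counter()
    for t_marks, d_marks in zip(teacher, draft):
        for c in CRITERIA:
            if t_marks[c] != d_marks[c]:
                overrides[c] += 1
    return {c: overrides[c] / len(teacher) for c in CRITERIA}


# Invented marks for four essays: your marks, then the drafted marks.
teacher = [
    {"A": 4, "B1": 4, "B2": 2, "C": 3, "D": 4},
    {"A": 3, "B1": 3, "B2": 2, "C": 3, "D": 3},
    {"A": 5, "B1": 4, "B2": 3, "C": 4, "D": 4},
    {"A": 2, "B1": 3, "B2": 1, "C": 3, "D": 3},
]
draft = [
    {"A": 4, "B1": 4, "B2": 3, "C": 3, "D": 4},
    {"A": 4, "B1": 3, "B2": 3, "C": 3, "D": 3},
    {"A": 5, "B1": 4, "B2": 3, "C": 4, "D": 4},
    {"A": 2, "B1": 3, "B2": 3, "C": 3, "D": 4},
]

rates = override_rates(teacher, draft)
for criterion, rate in rates.items():
    print(f"{criterion}: overridden on {rate:.0%} of essays")
```

A criterion with a high override rate is one where the draft should be treated as a prompt to reread, not as a starting mark.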

If you also teach Paper 1, the companion guide to the May 2026 Paper 1 criteria walks through why Paper 1 did not change this cycle, what the four criteria still reward, and how to keep Paper 2 logic from contaminating Paper 1 marking.

Whether your cohort is sitting the timed paper this May or completing under NECM, the calibration work is the same. The first time through new criteria is hard. It gets easier by the second.


Steven Swanson is a high school engineering teacher with 22 years of classroom experience and the creator of ClassLens, an AI-powered grading tool built for Google Classroom. Try it free at classlens.com.
