Bassam Ismail
Building Press·Part 4 of 6

Building Press, Part 4: Review is where the human gates the irreversible

For Press, human-in-the-loop publishing is the boundary between AI-assisted drafting and the irreversible act of sending work into the world. I write Press, an editorial engine that drafts from my actual work and walks each piece to a published post. Part 1 covered ingestion, where redaction runs in code before anything leaves my machine. This part is about the other end of the pipe, the moment that decides whether the whole system is safe to run unattended: the gap between a finished draft and a live post.

The draft that made me take this seriously read beautifully. It was technically sharp, well-structured, and it casually named a client in a sentence the deterministic redactor had not caught because the name appeared in a form the glossary did not list. Nothing in the draft looked wrong. That is the problem with trusting output quality as your safety check. A good draft and a safe draft are different properties, and the model that produces the first has no reliable view of the second.

So Press is built on a narrow thesis: AI-assisted content is only safe if the irreversible, outward-facing steps are gated by a human or by deterministic code, never by the model's confidence. Publishing, posting to a platform, and deleting are the steps you cannot take back. Everything before them can be as wrong as it likes, as often as it likes, because it sits behind a gate the model cannot open on its own.

Human-in-the-loop publishing means nothing publishes itself

Between a draft and a live post sit three gates, and they do not all work the same way.

The first is a confidentiality deep-scan. The deterministic redactor from ingestion already ran at egress, replacing known secrets and glossary terms by exact match. The deep-scan is the second layer: an LLM pass that reads the draft looking for the things exact-match cannot catch, like a client described rather than named, or a detail that is individually harmless but identifying in combination. It exists precisely because the first layer is fast and certain but literal.

The second gate is a set of SEO and answerability checks. Does the piece actually answer the question it claims to? Is there a focus keyphrase, a title, a meta description? These are quality and reach concerns, not safety, but they live at the same boundary because the same principle applies: catch it before it ships, not after.
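The presence checks in that list are simple enough to sketch. What follows is a minimal illustration, not Press's actual code: the `PostMeta` fields and the `seo_notes` function are hypothetical names, and the rule that the keyphrase should appear in the title is an assumption added for the example. The answerability question is a judgment call and is left out.

```python
from dataclasses import dataclass


@dataclass
class PostMeta:
    # Hypothetical field names; the real data model may differ.
    title: str = ""
    meta_description: str = ""
    focus_keyphrase: str = ""


def seo_notes(meta: PostMeta) -> list[str]:
    """Return advisory notes. These are advice for the reviewer, never a hard block."""
    notes = []
    if not meta.title.strip():
        notes.append("missing title")
    if not meta.meta_description.strip():
        notes.append("missing meta description")
    if not meta.focus_keyphrase.strip():
        notes.append("missing focus keyphrase")
    elif meta.focus_keyphrase.lower() not in meta.title.lower():
        # Assumed rule for illustration: the keyphrase should appear in the title.
        notes.append("focus keyphrase does not appear in the title")
    return notes
```

An empty list means there is nothing to flag; a non-empty list goes to the reviewer alongside the draft.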

The third gate is a human approval. Me, reading the thing, clicking publish.

Two kinds of gate

The distinction that makes this work is between gates that catch a deterministic leak and gates that are a judgment call.

A deterministic gate blocks hard. If the confidentiality scan finds a literal secret or a denylisted term still present in the body, you cannot wave it through. There is no "publish anyway" button for that class of finding, because the cost of a false negative is unbounded and the check itself is certain: the term is either present or it is not. A hard block sends the post back to draft. You fix it, or it does not ship.
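A hard block of this kind can be sketched as an exact-match scan over the body. This is an illustration under stated assumptions, not the real implementation: `scan_denylist` and `send_to_review` are hypothetical names, matching is case-sensitive, and the returned state strings are placeholders for whatever the data model uses.

```python
import re


def scan_denylist(body: str, denylist: list[str]) -> list[tuple[str, int]]:
    """Return (term, offset) for every literal occurrence of a denylisted term."""
    hits = []
    for term in denylist:
        if not term:
            continue  # an empty term would match at every position
        for match in re.finditer(re.escape(term), body):
            hits.append((term, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


def send_to_review(body: str, denylist: list[str]) -> tuple[str, list[tuple[str, int]]]:
    """Return the next state plus the hits that justify it."""
    hits = scan_denylist(body, denylist)
    if hits:
        # Hard block: no override path exists, the post goes back to draft.
        return "draft", hits
    return "in_review", hits
```

There is deliberately no `force` parameter. The only way past the check is to remove the term from the body.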

A judgment gate flags for a person. The deep-scan's "this paragraph might identify a client" is a probability, not a fact. The SEO checks are advice. These surface to me with context, and I decide. The system's job there is not to be right; it is to make the decision easy and to make sure the decision gets made by something that can be held responsible.

Important

Gate the irreversible step, not the model's output. A confidence score on a draft tells you how sure the model is, which is not the same as how safe the post is. Put the hard checks and the human on the act of publishing, and the quality of any individual draft stops being a safety question.

The state machine

Every post moves through an explicit set of states, and the transitions are where the gates live. A post is born a draft. Sending it to review moves it to in_review and runs the gates. If they pass and I approve, it becomes approved, and from there it either publishes now or waits for a scheduled time. A blocked gate routes it straight back to draft with the reason attached. That same bias toward explicit structure shows up in "Building Press, Part 2: The data model is the spine", because the data model has to make the allowed paths hard to misread.

The shape matters more than it looks. There is no edge from draft to published. You cannot skip review, because the only transition into published comes from approved, and the only way to reach approved is through in_review with the gates satisfied and a human's yes. The irreversible state has exactly one narrow doorway, and that doorway is the gate. That is the point of human-in-the-loop publishing: the machine can prepare the work, but it cannot cross the boundary by itself.
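One way to make that shape concrete is a transition table. The sketch below is an illustration, not the actual code: the four state names come from the description above, while `ALLOWED`, `transition`, and `TransitionError` are hypothetical. Scheduling is modeled as an approved post waiting for its time, not as a separate state, so that approved stays the only source of published.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    PUBLISHED = "published"


# Every allowed edge. Anything not listed here is refused.
ALLOWED = {
    State.DRAFT: {State.IN_REVIEW},
    State.IN_REVIEW: {State.APPROVED, State.DRAFT},  # DRAFT is the blocked-gate path
    State.APPROVED: {State.PUBLISHED},
    State.PUBLISHED: set(),  # irreversible: no way back
}


class TransitionError(Exception):
    pass


def transition(current: State, target: State, *,
               gates_passed: bool = False, human_approved: bool = False) -> State:
    """Return the new state, or raise if the edge or its preconditions fail."""
    if target not in ALLOWED[current]:
        raise TransitionError(f"no edge from {current.value} to {target.value}")
    if current is State.IN_REVIEW and target is State.APPROVED:
        if not (gates_passed and human_approved):
            raise TransitionError("approval needs passing gates and a human yes")
    return target
```

Because the table is plain data, the claim "only approved leads to published" can be checked directly by listing every state whose targets include `State.PUBLISHED`.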

Deep-dive: deterministic blocks vs judgment flags

The implementation keeps these as two separate result types so they cannot be confused at the call site. A deterministic finding carries enough to identify the exact offending term and its location; the publish action checks for any such finding and refuses the transition if one exists, full stop, before a human is even consulted. A judgment finding carries a description and a severity, and it is rendered into the review surface as something to read, not something that blocks the button.
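A minimal sketch of that split, assuming nothing about the real code beyond what the paragraph above says: the class names, the `severity` scale, and the `request_approval` function with its `ask_human` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class DeterministicFinding:
    term: str    # the exact offending term
    offset: int  # where it sits in the body


@dataclass(frozen=True)
class JudgmentFinding:
    description: str
    severity: str  # assumed scale, e.g. "low", "medium", "high"


class PublishBlocked(Exception):
    pass


def request_approval(
    blocks: Sequence[DeterministicFinding],
    flags: Sequence[JudgmentFinding],
    ask_human: Callable[[Sequence[JudgmentFinding]], bool],
) -> bool:
    """Refuse on any deterministic finding before the human is consulted."""
    if blocks:
        terms = ", ".join(finding.term for finding in blocks)
        raise PublishBlocked(f"denylisted terms still present: {terms}")
    # Judgment findings never block; they are context for the person deciding.
    return ask_human(flags)
```

Keeping the two types in separate parameters means a judgment note cannot end up in the blocking list by accident, at least under a type checker.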

The reason for the split is that the two failure modes have opposite costs. For a literal secret, a false negative is catastrophic and a false positive is cheap (you reword one sentence), so you bias hard toward blocking. For a "this might be too promotional" note, a false positive is expensive (it nags you on every post until you stop reading the notes), so you bias toward advising. Mixing them, treating a judgment call as a hard block or a hard block as advice, breaks both: you either cannot publish anything or you train yourself to click through warnings that sometimes matter.

What I rejected

The tempting shortcut was to let the model grade itself. Ask the drafting model for a confidence score, or run a second model as a judge and auto-publish above some threshold. I built a version of the judge and kept it, but only as an advisory signal inside review, never as a gate. The moment a model's self-assessment can open the door to publishing, you have moved the irreversible decision back inside the model, which is exactly where the thesis says it must not live. A fluent model is most confident precisely when it is fluently wrong.
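The advisory-only role of the judge can be enforced by what the approval function accepts. In this sketch, which is an illustration and not the real code, `review_summary` and `approve` are hypothetical names, and the judge score is assumed to be a float between 0 and 1.

```python
def review_summary(judge_score: float) -> str:
    """Render the judge's score as a note for the reviewer to read."""
    return f"Judge score: {judge_score:.2f} (advisory only)"


def approve(human_said_yes: bool, hard_block_count: int) -> bool:
    """Decide approval from the human's answer and the deterministic checks.

    The judge score is deliberately not a parameter, so no threshold on it
    can ever open the door to publishing.
    """
    return human_said_yes and hard_block_count == 0
```

The score still reaches the person through `review_summary`, but it has no path into the decision itself.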

The other rejected option was making review purely advisory: surface everything, block nothing, trust the operator. That fails the opposite way. The deterministic leaks are the cases where I am least able to be the backstop, because they are easy to miss in a draft that otherwise reads clean. Those are the cases that must block in code, not flag for tired human eyes.

The limitation worth naming

These gates catch known failure classes. They do not catch everything. A draft that is confident, fluent, and simply wrong about a technical claim can pass every automated check, because none of them verify truth. That is not a gap I can close with another gate; it is the reason the human approval is non-optional and cannot be automated away. The honest design consequence is that Press optimizes for "easy to review" over "fully autonomous." Short, structured drafts with the answer up front are not only better to read; they are faster for a person to vet, which is the actual bottleneck. A system that drafts ten posts an hour but produces drafts I cannot quickly trust has not saved me anything.

The durable lesson is that automation earns its keep by changing where the human spends attention, not by removing the human from the decisions that cannot be undone. Press does not publish for me. It does the reading, the clustering, the drafting, and the catching, so that the one judgment only I can make is the one thing left to do.