Moderating LessWrong

Duncan A Sabien

Context: LessWrong is a combination web archive and discussion forum that’s attempting to connect, inform, and serve people with a strong interest in cognition, psychology, sociology, and artificial intelligence, and with a world-saving spirit. Its name is meant to embody both the optimism and the humility central to the project of human improvement: endeavoring to be “less wrong,” because in the near-term future that’s probably the highest target we can realistically hope to hit.

I’ve made a couple of attempts to become a part of the LW community, each ending in a fight, and usually one that includes strong disagreement with the mods and admins over how moderation and administration of a site called “Less Wrong” ought to work. Their philosophy is both discussed openly on the site and visible in action; this post is meant to lay out my hypothetical alternative.

Note that practically everything below is my own personal take, and that I’ve never been a part of LessWrong’s moderation team; nothing in what follows should be viewed as representative of the actual LessWrong.


Welcome to the mod team!

You’ve been selected as a moderator because we generally respect, approve of, and appreciate your style as a regular commenter. Your contributions to the site have been frequent and positive, and as your karma shows, you’re well-liked by the site’s members as a whole.

That means that, in general, you should trust your own instincts as you go about your moderator duties; we wouldn’t have given you mod powers in the first place if we didn’t think those instincts were sound. It’s better to be confidently yourself and let us correct you as needed than to self-censor and hesitate.

That being said, there are a few standards and guidelines that all mods and admins adhere to. Mostly, they center around setting, supporting, and signal-boosting the site’s unique culture—maintaining the “garden” of LessWrong so that it’s meaningfully different from the rest of the internet. This is a commitment we take very seriously, so please read the rest of this primer carefully (it takes maybe twenty minutes), and raise any questions you might have in the mod channel.


Axiom: a culture of clear thinking, accuracy, rational discourse, and collaborative truth-seeking is not the natural, default state of affairs.

By this, we mean that we believe the ideal LessWrong culture is endothermic—it requires a constant influx of attention and effort to maintain, or it will degrade back into a state more typical of the rest of the internet (cf. the quality of interaction on Facebook, Tumblr, Reddit, 4chan, and most blogs and their comment sections).

This is not actually a given; there are LW users who believe that active moderation is either irrelevant or overtly harmful. And obviously there’s a sense in which LW emerged from a bunch of people spontaneously desiring it and coming together to create it, and thus there’s an argument to be made against imposing any sort of hierarchical constraints upon it.

But between strawmanning, confirmation bias, the typical mind fallacy, evaporative cooling, identity politics, status games, Moloch, rounding off to stereotypes, inattention to detail, temporary lapses in judgment or self-control, tone-that-is-notoriously-difficult-to-interpret-through-text, et cetera, ad infinitum…

…there seem to us to be more than enough destabilizing forces to justify an active moderation philosophy, not to mention the fact that we have witnessed firsthand that things reliably spiral out of control when moderation is laissez-faire.

Therefore: mods and admins should consider it their personal responsibility to consistently and reliably take actions which promote clear thinking, accuracy, rational discourse, and truth-seeking collaboration, and which disincentivize, deter, or correct for anything else.

In short, what makes LessWrong different from everywhere else on the internet is that it’s a place where truth comes first, and that only remains true if we endeavor to keep it that way. More specifically, LessWrong exists to:

  • Foster the continued accumulation of true and correct information, especially about cognition, intelligence, and interactions between agents
  • Cultivate a culture that supports that accumulation by incentivizing curiosity, skepticism, clear communication, and collaboration
  • Cause readers and users to develop and adhere to strong personal norms of epistemic hygiene and truth-seeking behavior
  • Support and encourage attempts to make a positive impact on the future of humanity and the world
  • Be welcoming and inclusive of all who hold to the above ideals

Note that the order here matters—if truth comes into conflict with individual growth, truth wins (and it’s the job of the moderation team to ensure that truth wins). Similarly, if we have to choose between increasing inclusivity and increasing impact, impact wins; if we have to choose between increasing impact and improving the overall truth-seeking culture, culture wins. LessWrong is not a site for everything and everyone—it has a mission and a purpose, and its place in the broader ecosystem of rationalism, effective altruism, and the world at large is not served by compromising its core nature.


Two principles emerge from the above:

1. Every word on the site should be read by at least one moderator.
It’s not possible to keep a garden free of weeds unless the gardeners actually check for weeds. Thus, we’ve established a system for making sure that every thread and comment is read by at least one member of the mod team.

The next time you log on, you’ll notice that the site looks a little different than you’re used to. Every comment will either have a set of green initials to the right of the header (indicating which mod has claimed it and taken responsibility for reading it), or will be highlighted in light purple with a checkbox (indicating that it’s unclaimed; check the box to claim it).

If you’re not sure whether you’re claiming too many or too few threads, or if you have concerns or objections about the way that other mods are handling the threads that they have claimed, feel free to make a post in the mod channel. You can also tag a specific mod in the mod channel with the link to a given comment, if you suspect that their style is particularly well suited for that conversation.

2. No violation of site norms (epistemic or social) goes unaddressed.
A garden is a garden only to the extent that it is curated—LessWrong is defined by what it socially rewards, what it lets slide/tacitly endorses, and what it actively pumps against, just as a garden is defined by what gets cultivated and what gets weeded out.

Much of this curation happens without any intervention from the mod team. The LW userbase is generally competent and fairly highly motivated, and individual users do a lot to nudge one another toward better epistemics and more useful social norms.

However, each individual user comes to LW for their own reasons, and has their own personal set of opinions, priorities, and preconceptions. That means that user enforcement is spotty, especially deep within subthreads or on high-context, subtle, or socially difficult posts. It’s the job of the moderation team to fill in the cracks—to be the ones who explain and enforce the norms and standards of the site where everyday users won’t (either because they or their argument benefits from the violation, or because they lack the time, energy, or social capital to do anything about it).

The bar that we’re reaching for is perfection. We won’t make it all the way, and no one expects any individual user—including mods—to be perfectly rational all the time.

But in order for the site to fulfill its purpose—in order for us to in fact make progress on this whole being-less-wrong thing—we need a uniform policy of actually noticing shortfalls and consistently taking action in response.


Does that mean every violation should be met with a visible, heavy-handed response? Not in the ideal, no. “Taking action” can mean a wide variety of things; we’re not saying that every less-than-perfect comment needs to be openly flagged and discussed at length. As a moderator, you have the following tools at your disposal:

  • Upvote and downvote power (to signal-boost and disendorse posts)
  • Private messages (to offer corrections and suggestions that would be hard to swallow in front of an audience, or to ask questions and make requests, or to send a private warning)
  • Your own posts and comments (which not only respond directly but also set tone and culture for those who look to the mods for leadership)
  • The ability to summon other mods to the scene (whether you’re uncertain how to proceed or simply need more voices in the mix)
  • Temporarily locking comment threads or subthreads (usually for 24h, which often helps to defuse a heated back-and-forth)
  • Permanently locking comment threads or subthreads (if all the value has already been gotten and all that’s left is demons)
  • The ability to sticky or hide posts and comments (stickies if they are one-in-a-hundred awesome or excellently display some particular rationalist virtue, and hides if you want to consider deletion but want to think on it first, or speak with the poster privately or with other mods)
  • The ability to edit other users’ posts and comments (almost exclusively used for de facto deletion, replacing the content with an explanation of why it was removed)
  • The ability to delete other users’ posts and comments (if you don’t even want to leave a trace)
  • Temporary bans (usually for 3d, if a user has not responded to three or more consecutive attempts to redirect their behavior)
  • Permanent bans (if a user has received two or more temporary bans in the same six month period without changing their behavior, or if a user has willfully and egregiously violated the site’s norms)

In many cases, you’ll be using one of your “silent” tools to solve the problem, especially if you and other mods have previously attempted to solve things through direct discussion and found that to be ineffective or insufficient. For instance, if it seems like the flaw in a given comment will be obvious even to newcomers (it’s overtly an ad hominem attack, for instance), often a downvote will be enough, especially if other users seem to be downvoting as well.

And in cases where you are posting a public comment in response to what you perceive to be a norm violation, you have a number of non-aggressive, non-escalating, non-threatening tactics available to you, such as:

  • Asking neutrally-phrased clarifying questions
  • Stating your own cruxes first/leading by example
  • NVC-ish (Nonviolent Communication–style) language (I-statements, requests, openly acknowledging your own desires, emotions, and likely biases)
  • Explicitly acknowledging an opponent’s good intentions prior to criticizing their models or methods
  • Steelmanning (or at least avoiding strawmanning/rounding off)
  • Explicitly underlining your own uncertainty/credences/places where you are aware you might be incorrect

… just to name a few. Callout culture almost always eventually devolves into public accusation and status warfare, with each party performing for the audience rather than engaging with one another.

In contrast, the ideal LessWrong runs off something you might call collegial culture, in which a response to a perceived violation looks something like:

“This appears to me to be rounding your interlocutor’s position off to a stereotype, which is a thing we don’t do around here. If I’m understanding you correctly, your actual objection is to their central claim—in which case, maybe try steelmanning it, or setting a five-minute timer to look for evidence that might change your mind, before responding? Thoughts?”

Collegial culture includes norms and standards and fences every bit as much as callout culture—colleagues are, almost by definition, people who share responsibility for a common endeavor, and thus can either be pulling their weight or falling short.

But collegial culture also includes a fundamental assumption of good faith—that the other person probably cares approximately as much as you do, that their values probably strongly overlap with yours, and that they’re probably intrinsically motivated to live up to something very similar to the standard you think is correct. It means that norm-violating behavior, instead of being seen as something evil that must be guarded against, is treated as a forgivable slipup that the other person probably wants to have brought to their attention, so long as the pointing-out isn’t just cover for an assault.

(cf. the difference between honor culture and dignity culture)

Notice the details in the example above—they’re not random; most of them were put there deliberately and are doing important work. The phrase “appears to me to be” serves to highlight the critic’s awareness of uncertainty—that they may have misinterpreted things or missed a detail. The framing of “thing we don’t do around here” is boundary-enforcing but not morally charged—it’s not that the behavior is fundamentally bad or wrong, just that it’s not part of our specific subcultural palette. The phrase “if I’m understanding you correctly” foreshadows a crux: if I’m understanding you, then I believe X, but if my understanding changes, I might not believe X anymore. The invocation of a “standard rationality move” (such as applying reductionism, checking the inverse of the hypothesis, or setting a five-minute timer) reinforces the shared culture of the site and models better behavior for newcomers. And the “thoughts?” at the end—especially if it occurs within a context where such invitations are demonstrably genuine, and not just lip service—actively draws the other person back into the conversation. The sum total of all of these little touches turns what might otherwise be the beginning of a fight into a cooperative, collaborative dynamic.


All of the above is meant to ensure that active moderation does not itself erode the social and epistemic norms of LessWrong—that the cure to a given instance of norm violation doesn’t end up worse than the disease, and that moderators don’t forget to hold themselves to the highest standard, too. It’s all too easy, in the midst of a righteous crusade to uphold some sacred value, to let oneself think that noble ends justify ignoble means, and that intentions matter more than consequences.

That being said, if gentle methods prove insufficient to solve a problem, an escalation toward more serious interventions is not only appropriate, but essential. A user willing to violate site norms has more weapons at their disposal than a user following the rules; if the moderators do not take decisive action to defend individuals engaging in [clear thinking and rational discourse] from those engaging in [run-of-the-mill internet fuckery], then the former group will self-protectively leave or go silent, which then worsens the situation in a spiral of evaporative cooling.

Two long quotes from Harry Potter and the Methods of Rationality serve to highlight this point fairly well:

“There was a Muggle once named Mohandas Gandhi,” Harry said. “He thought the government of Muggle Britain shouldn’t rule over his country. And he refused to fight. He convinced his whole country not to fight. Instead he told his people to walk up to the British soldiers and let themselves be struck down, without resisting, and when Britain couldn’t stand doing that any more, we freed his country. I thought it was a very beautiful thing, when I read about it, I thought it was something higher than all the wars that anyone had ever fought with guns or swords…Only then I found out that Gandhi told his people, during World War II, that if the Nazis invaded they should use nonviolent resistance against them, too. But the Nazis would’ve just shot everyone in sight…

The point is, saying violence is evil isn’t an answer. It doesn’t say when to fight and when not to fight. It’s a hard question and Gandhi refused to deal with it, and that’s why I lost some of my respect for him…One answer is that you shouldn’t ever use violence except to stop violence. You shouldn’t risk anyone’s life except to save even more lives. It sounds good when you say it like that. Only the problem is that if a police officer sees a burglar robbing a house, the police officer should try to stop the burglar, even though the burglar might fight back and someone might get hurt or even killed. Even if the burglar is only trying to steal jewelry, which is just a thing. Because if nobody so much as inconveniences burglars, there will be more burglars, and more burglars. And even if they only ever stole things each time, it would — the fabric of society — ”

Harry stopped. His thoughts weren’t as ordered as they usually pretended to be, in this room. He should have been able to give some perfectly logical exposition in terms of game theory, should have at least been able to see it that way, but it was eluding him. Hawks and doves —

“Don’t you see, if evil people are willing to risk violence to get what they want, and good people always back down because violence is too terrible to risk, it’s — it’s not a good society to live in, Headmaster! Don’t you realize what all this bullying is doing to Hogwarts, to Slytherin House most of all?”


“Wolves, dogs, even chickens, fight for dominance among themselves. What I finally understood, from that clerk’s mind, was that to him Lucius Malfoy had dominance, Lord Voldemort had dominance, and David Monroe and Albus Dumbledore did not. By taking the side of good, by professing to abide in the light, we had made ourselves unthreatening. In Britain, Lucius Malfoy has dominance, for he can call in your loans, or send Ministry bureaucrats against your shop, or crucify you in the Daily Prophet, if you go openly against his will. And the most powerful wizard in the world has no dominance, because everyone knows that he is a hero out of stories, relentlessly self-effacing and too humble for vengeance…

In Hogwarts, Dumbledore does punish certain transgressions against his will, so he is feared to some degree — though the students still make free to mock him in more than whispers. Outside this castle, Dumbledore is sneered at; they began to call him mad, and he aped the part like a fool. Step into the role of a savior out of plays, and people see you as a slave to whose services they are entitled and whom it is their enjoyment to criticize; for it is the privilege of masters to sit back and call forth helpful corrections while the slaves labor…

I understood that day in the Ministry that by envying Dumbledore, I had shown myself as deluded as Dumbledore himself. I understood that I had been trying for the wrong place all along. You should know this to be true, boy, for you have made freer to speak ill of Dumbledore than you ever dared speak ill of me. Even in your own thoughts, I wager, for instinct runs deep. You knew that it might be to your cost to mock the strong and vengeful Professor Quirrell, but that there was no cost in disrespecting the weak and harmless Dumbledore.”

LessWrong is a site founded on fairly liberal ideals. No one wants to have to do things like get into fights, or delete posts, or lock threads; in general, we’re all on board with the idea that sunlight is sanitizing and the marketplace of ideas should be as free as we can sustainably make it.

But that can’t translate into an unwillingness to draw clear lines or defend important principles. Well-kept gardens die by pacifism—it’s crucial that the moderators not all be Dumbledores; that there be at least some Luciuses and Quirrells on the side of the light, people who are not only willing but known to be willing to step into the fray and unashamedly use their power to edit, lock, and delete. In no small part, the duty of the moderation team is to ensure that no LessWronger who’s trying to adhere to the site’s principles is ever alone, when standing their ground against another user (or a mob of users) who isn’t—and sometimes, that means openly taking sides against certain behavior.

When should moderation switch from gentle/charitable/correction-oriented to direct/skeptical/enforcement-oriented? There’s no clear answer, and each mod will have their own personal standard—some may never take the gloves off, while others may be quick to theorize hostile intent or slippery slopes. There’s open discussion on the mod channel of specific instances and whether the changeover was too early or too late; feel free to click back through the archives to see what you think of various examples.

In general, though, if you’re looking for a simple, paint-by-numbers answer, a good rule is something like three strikes within a given conversation. In other words, a first violation should almost always be treated as no-fault, and responded to in good faith. A second violation on the heels of the first is concerning, but could still easily be the result of miscommunication or a temporary flare in temper, and is often solved by an even greater dose of patience/kindness/charity that creates space for the user to catch their breath and not feel like they have to have shields up and weapons charged.

After a third violation, though, we usually recommend a) calling in another moderator for a second opinion, and b) if they concur, switching toward more no-nonsense action like posting a clarifying comment and then locking the thread.


It helps to get a few concrete examples, so for this next section we’re going to zero in on a string of comments from a recent thread in response to the linkpost In Defense of Punch Bug.

(Author’s note: In Defense of Punch Bug was written by me, and crossposted to LessWrong with my implicit permission. I no longer post on LessWrong, but I do occasionally lurk there, and I noticed the string of comments below and was somewhat dismayed by the lack of objection to what I believe to be clear norm violations. In many ways, that was a motivating factor for finishing this essay sooner rather than later, and the section below contains my own, real, human response to what I saw as some unfair, unanswered attacks. I’m doing my best to keep my defense principled and truth-oriented, but I own the fact that this is in no small part a direct response to a specific individual.)

In response to the post, user benquo writes, as a top-level comment:

I’m very worried about people unilaterally claiming the right to initiate physical violence against me with impunity. (It’s additionally worrying when the occasion for this is being reminded of a prominent German brand; ironically, I have the exact same worry about the “punch Nazis” advocacy I was seeing on the internet a few months ago, given the general unwillingness to work out a legible standard by which Nazis might be identified.)

Asymmetric “no punch back” rules are special, and there’s a long history of such things being used to build momentum towards mass murder. The Jewish holiday of Purim specifically commemorates the political maneuvering required to persuade the Great King of Persia to specifically disavow a “no punch back” rule implied by a prior edict authorizing a day of pogroms. The prior edict was not repealed, but the repeal of the “no punch back” rule was considered sufficient cause for repeating the celebration annually.

The casual acceptance and signal-boosting of a post like this is a good example of why I don’t feel safe in the San Francisco Bay Area Rationalist community.

From the perspective of a moderator, there are several yellow flags in the above—things which probably ought to live outside the Overton window of the site, and which, if left unaddressed, set an unpleasant precedent for future conversations. For instance:

  • The summary of the post in the first line reads as fairly uncharitable, and in fact is not justified by the content of the piece it’s referring to. Additionally, the phrase ‘physical violence’ is quite broad, and may be the beginnings of a motte-and-bailey maneuver, whether intentionally or otherwise.
  • While there’s nothing wrong with having a knee-jerk response to the word ‘Volkswagen,’ and indeed benquo is demonstrating upper-quartile self-awareness and candor in mentioning it, there’s also no corresponding visible self-doubt or metacognitive suspicion of the effect that response might have on his later reasoning and conclusions.
  • The implication that the rules of punch bug create a persistent and dangerous asymmetry is not sufficiently justified in context, and the attempt to draw a line from that to pogroms is naturally inflammatory and the sort of thing that deserves a lot more care/attention/explanation.
  • The overall extremity of both the chosen frame and the ultimate conclusion (fear for one’s physical safety) seems incongruous given the context of a blog post defending a game played by schoolchildren, and in a way that might either intentionally or accidentally make an epistemic conversation much more difficult to have.

The expectation in a case like this, where a reasonably well-known and well-respected user is setting the frame for what will likely be a large conversation, is that one or more mods would respond to each of these threads*, seeking to either a) draw out more of the thinking behind them that makes them ultimately reasonable, b) zero in on cruxes that imbue the beliefs with appropriate uncertainty and could theoretically cause them to pivot rather than be firmly held, or c) clearly identify the point at which bias or flawed reasoning caused benquo to go astray.

In short, to encourage benquo to elaborate if true, make falsifiable if uncertain, or post-mortem if false.

*In the actual LW discussion, multiple moderators did in fact respond to this comment, but several of the listed flags went entirely unaddressed. This is where coordination in the mod channel comes in handy—if you think you’re seeing violations of a type that other mods aren’t noticing and helping with, start a thread and we’ll try to arrive at an understanding of where the ‘definitely respond or intervene’ line should be. Some of the more common gray-area violations that have been discussed recently as possibly being under-addressed include:

  • A user clearly missed or ignored relevant details of the post or comment they are responding to, and is reacting to a preconceived stereotype rather than what’s actually there.
  • A user is summarizing another user’s position with specifically charged language that that user would not agree adequately describes their views.
  • A user is ignoring explicit, within-norms requests from another user (e.g. to restate something in greater detail, or offer cruxes, or to attempt to pass an ideological Turing test) without explanation and while selectively carrying on the parts of the conversation where their argument is strongest.
  • A user is attacking the style or delivery with which another user attempted to convey a point, without ever attempting to address the point itself.
  • A user is stating conclusions as if they are fact without connecting the dots/revealing the reasoning behind them, or is conflating their personal beliefs, opinions, and sense-of-things with reality and neglecting to highlight credences or uncertainty.

For example, a moderator seeking to engage in good faith with benquo’s comment might write:

“It makes a lot of sense that the word ‘Volkswagen’ would produce a visceral reaction, given the context of your Jewish heritage [that you referred to explicitly elsewhere in the thread]. But on reflection, do you endorse that as relevant to your ultimate sense of whether-or-not-you’re-under-threat? I’m not sure how to distinguish between trustworthy subconscious intuition and bias, in this case—do you have a heuristic that helps you decide?”

or

“I’m not sure about your use of the word ‘asymmetric’ given the context of the post. It seems to me to evoke things like ‘dalits can’t punch brahmans,’ or ‘women can’t defend themselves against men,’ or other systemic inequalities that delegitimize and undermine a whole group. Are you making the claim that a society where people play punch bug sets up a similar sort of dichotomy between two groups? If so, I’m curious what the groups would be; if not, I’m wondering if you’re willing to elaborate.”

or, as an actual LW moderator posted:

“It might help if you pointed at the groups you think the asymmetry is between, as I suspect you and SilentCal are imagining different lines here.

I think you see the asymmetry as being between ‘people who want to punch others’ and ‘people who don’t want to punch others,’ as only the first group sees any possible value from punch bug (in the short term), and SilentCal sees the two people as ‘the person who saw the bug first’ and ‘the person who didn’t see it,’ where the only asymmetries are related to people’s abilities to spot bugs (and thus playing punch bug with the blind would raise these sorts of symmetry concerns).”

Responses like these serve multiple purposes. They reinforce the fundamentally truth-seeking nature of the conversation, steering it away from a slippery slope toward ideological grandstanding. They provide benquo with a chance to say ‘yes, that,’ or ‘no, you’ve missed me.’ They ground the discussion back in facts and cruxes instead of in large conclusions drawn ineffably from personal feelings. And they give moderators the grounding for later objection to continued violations of epistemic hygiene—if early and small violations are not noted and explicitly addressed, then it’s hard to blame the user for executing similar motions later on, especially if they’re being upvoted.

It’s worth noting that this is one place where wearing your “moderator hat” might cause you to respond differently than you would if you were simply engaging as an interested user. As a user, you’re free to tug on whichever thread you find most engaging, and ignore the rest. As a moderator, though, you’re holding down the cultural pole, and if you raise one objection to a comment and say nothing of the rest, that will be taken as a tacit endorsement and will consequently legitimize whatever tactics were in play (such as strawmanning). Indeed, one senior moderator in the LessWrong discussion, after objecting to a few words that benquo later retracted, went on to say of that first comment:

To be clear, I think the substance of the comment is perfectly fine, and think the subsequent discussion has been well within bounds as-of-the-time-I-write this

… which almost certainly caused others who might otherwise have raised valid objections to pause and reconsider, and which plausibly gave benquo moral authority to continue in the same vein.


Here it’s worth pausing to note that, while the bulk of the learning opportunities come from looking at benquo’s more problematic comments, there were several places where he both provided unique insight and held very close to ideal discussion norms, and it’s important to recognize and applaud those, as well as noting that his overall intention is clearly philanthropic and altruistic. For example, one user posted the following:

The only thing that aligns [punch bug] with the pogroms is the involvement of physical violence — and even then, I’d suspect most people would plot ‘punch in the arm’ closer to ‘annoyingly loud music’ than to ‘mass murder’ on the scale of harms.

… to which benquo replied:

A friend of mine recently suffered a concussion after being punched on the street. It was cognitively compromising for a couple of weeks. Maybe you just think he’s oversensitive and that if he got concussed more often he’d learn to just roll with it, but if you’re willing to accept for the sake of argument that perhaps a particularly hard punch can cause substantial physical injury worth worrying about, it seems pretty bad to play a game that trains people not to react to street assault.

… this seems to us to be a model response; it highlights a tail risk that’s certainly at least worth considering, makes room for a possible objection in a reasonably neutral (if somewhat snarky) manner, and provides the outline of a causal model that convincingly explains the difference between his reaction and the reaction of the previous commenter.

However, it’s important to remember that we’re not checking the user; we’re checking the content. It’s easy to fall into the trap of letting things slide for solid, prolific commenters (or ignoring the rare wisdom of people on the trollish, uncooperative end of the spectrum). But weeds and flowers don’t cancel one another out—regardless of whether someone’s made a thousand excellent comments or none, the standard of perfection requires us to try to notice and respond to their non-excellent comments in a consistent and principled manner.


Moving downstream in the discussion, another user reiterated confusion about the word ‘asymmetry’ by saying “I don’t see what’s asymmetric about the ‘no punch back’ rule at all — the punchee is free to spot the next bug, in which case they will become the beneficiary of the ‘no punch back’ rule.”

To which benquo responded with:

Is it hard for you to imagine that some people might not be violent sadists?

This is one of the cases in which an explicit, verbal response might not be necessary or productive; the comment’s misstep is clear, and it was quickly downvoted into negative territory. However, if you did wish to respond, for the sake of publicly clarifying LW’s discourse norms, you might do so with something like the following:

“I notice that you’ve probably got a useful point, here, but I’m having to do work to draw it out of what otherwise seems like a pretty inflammatory statement. Like, it seems that you’re responding to a request for clarification with the implication ‘anyone who doesn’t already see this is a violent sadist.’ I imagine that’s not the conclusion that you want me to draw—would you be willing to write a paragraph instead of a sentence, and maybe note that other people might be reading you as making something like the fundamental attribution error?”

Hopefully, the theme is becoming clear—holding a firm line against all of the ways in which ordinary everyday anti-epistemic bullshit threatens to gum up the unique value of a LessWrong discussion. The point is not to attack or alienate a user who is using those tactics, but rather to guide them toward expressions of their contribution that don’t lean on instrumentally viable but epistemically unsound tactics like hyperbole, pearl-clutching, strawmanning, and various other gotchas and social tricks.

And it is the moderator’s job to make sure that line is held everywhere. To be present and attentive enough to be the one to catch the slipped-in hidden assumption in the middle of the giant paragraph, and to point out the uncharitable summary even when it’s carefully and politely phrased, and to follow the subthread all the way down to its twentieth reply—

—or to lock it, if you believe that continuing the conversation is net negative. What you cannot do, as a moderator, is just shrug and let the weeds grow as they will. Elsewhere in the thread, benquo wrote:

I want to note that one and only one side of this debate has argued for initiating physical violence here. The side doing that is not mine. (emphasis in original)

and

Duncan makes it clear that the reason he doesn’t punch people hard is because he doesn’t think he can get away with that right now, not because he wouldn’t prefer that norm.

and

To be fair, he does suggest that people who don’t want to play Punch Bug be accommodated with permission to live in a ghetto instead.

… each of which went unremarked and unchallenged, and each of which is the sort of thing which cannot go unremarked and unchallenged if LessWrong is to become what it is trying to be. In the hypothetical world where benquo received all of the gentle nudges above and still wrote these three lines (in other words, in the world where he’d already implicitly rejected multiple requests for a change in tone and method), responses from the mod team should become less cooperative and more direct in turn:

“It’s also worth noting that typing the phrase ‘initiating physical violence’ in bolded and italicized text to refer to acts that don’t even merit a middle school detention is exactly analogous to people who put catcalls and rape in the same bucket. I’m not sure why you would elide those differences when they’re extremely important; if you’re trying to say something more complex like ‘this is the start of a slippery slope that ends with serious violations,’ then say that, and explain the lines of causality that make it so. As it is, this reads as disingenuous and undermines your credibility as a clear thinker.”

and

“While it seems justifiable (though uncharitable) to hold that as a hypothesis about Duncan’s motives, given the text of his essay, the phrase ‘makes it clear’ is a definite overreach. There’s plenty of ambiguity around whether or not that’s his actual position, and plenty of alternative hypotheses that fit the data equally well—did you actually bother to check?”

and

“This comment has been deleted for violating the LessWrong standards of discourse. A warning has been issued to benquo and further violations will result in a temporary ban.”


One thing that has been mostly implicit thus far is the degree to which moderation is not just about the person in front of you. On the hierarchy of LessWrong’s goals, culture supersedes individual growth. That’s in part because the endeavor of actually trying to figure things out is larger and more important than any one person, and in part because the individual updates are—well, up to the individual.

Individual users will do what they want to do, within the bounds of what’s allowed. They’ll say what they want to say, vote as they want to vote, endorse or decry or ignore at will.

But as a moderator, you speak with a louder voice, on behalf of a larger organism. It’s your job to be principled in your distinction between what counts as a flower and what counts as a weed. Each time you upvote or downvote, each time you comment, each time you lock or edit or delete a thread, you’re curating LessWrong’s culture. Even if you fail to change the mind of the person you’re directly responding to, your words will be read and weighed by dozens or hundreds of other people, who will take them as a model and a template.

There’s a way to write those words well, and there’s a way to do it badly. Remember, you’re new at this, and we know that you’re new at this. No written guide could prepare you to get it right every time; if such a guide existed, we’d be much less desperately in need of a project like LessWrong in the first place. No one expects you to be perfect, ever—much less right away.

But right from the start, there’s one thing you can do, and that’s to keep this question always at the front of your mind:

“What sorts of conclusions will the userbase draw—about what LessWrong likes and doesn’t like, permits and forbids, and ultimately is ‘for’—from the action I’m about to take? What am I tacitly encouraging, and what am I discouraging, and what sorts of people will that tend to repel or attract, over time? Is the resulting version of LessWrong the one I want to be a part of?”

Upvote the unfairly downvoted, and signal-boost what they were doing right.

Downvote the erroneously upvoted, and take a stand against tribalism or goalpost-shifting or whatever kind of point-scoring caused people to overlook the truth.

Take the comment that has both, and unpack it so that people remember that things can be complicated, and that everything doesn’t have to resolve to a single karma score.

Take a stand to defend yourself—from error, from falsehood, from strategic disingenuousness—to weaken the stigma around defensiveness and invalidate the totally batshit crazy norm that it’s somehow gauche or distasteful or low-status to refuse to let yourself be unfairly steamrolled by people wielding weaponized bad epistemics.

Spend your social points—there’s no point in having them if you don’t do good work with them. If every important principle were popular, we wouldn’t need mods; sometimes it’s going to be rough because you’re pushing in the right direction.

And above all, remember that you’re not alone. We’re a team, and we’ve got your back—like we said at the beginning, we wouldn’t have put you in this role if we didn’t already think you have what it takes.

Welcome, and good luck.


REMINDER: all of this is hypothetical; I am not and have never been a part of the LessWrong team. To learn more about the current moderation policies of the actual LessWrong, go take a look at their front page.

Written by Duncan A Sabien

Duncan Sabien is a writer, teacher, and maker of things. He loves parkour, LEGOs, and MTG, and is easily manipulated by people quoting Ender’s Game.
