Here's an easy, if not always precise, way to remember:
* Hyphens connect things, such as compound words: double-decker, cut-and-dried, 212-555-5555.
* EN dashes make a range between things: Boston–San Francisco flight, 10–20 years: both connect not only the endpoints, but define that all the space between is included. (Compare the last usage with the phone number example under Hyphens.)
* EM dashes break things, such as sentences or thoughts: 'What the—!'; A paragraph should express one idea—but rules are made to be broken.
Unicode has the original ASCII hyphen-minus (U+002d), as well as a dedicated hyphen (U+2010), other functional hyphens such as soft and non-breaking hyphens, and a dedicated minus sign (U+2212), and some variations of minus such as subscript, superscript, etc.
There's also the figure dash "‒" (U+2012), essentially a hyphen-minus that's the same width as numbers and used aesthetically for typesetting, afaik. And don't overlook two-em-dashes "⸺" and three-em-dashes "⸻" and horizontal bars "―", the latter used like quotation marks!
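Since several of these characters look nearly identical in many fonts, a small script can tell them apart by code point and official name. This is only a sketch: the list `DASH_LIKE` and the helper `describe` are names made up for illustration, and the selection of characters is simply the ones mentioned in the comments above.

```python
import unicodedata

# Code points mentioned above, written as escapes so they survive
# copy and paste through fonts that render them alike.
DASH_LIKE = [
    "\u002D",  # the ASCII key on the keyboard
    "\u2010", "\u2011", "\u00AD",  # dedicated, non-breaking, soft hyphen
    "\u2012", "\u2013", "\u2014", "\u2015",  # figure, en, em, bar
    "\u2212",  # dedicated minus sign
    "\u2E3A", "\u2E3B",  # two-em and three-em dashes
]

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

for ch in DASH_LIKE:
    print(describe(ch))
```

The four-digit hex value it prints is also what you would type after Ctrl-Shift-U on Linux setups that support that input method.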
AFAIK most computer keyboards don't have em dashes. Rather than hit ALT+0151 every time, I've always just strung along two hyphens, like: --
Absolutely proper and correct use of em dashes, en dashes, and hyphens is, to me, the most obvious tell of the LLM writer. In fact, I think that you can use it to date internet writing in general. For it seems to me that real em dashes were uncommon pre-2022.
Robert Bringhurst¹ prefers the en dash in the context of setting off phrases:
"The em dash is the nineteenth-century standard, still prescribed in many editorial style books, but the em dash is too long for use with the best text faces. Like the oversized space between sentences, it belongs to the padded and corseted aesthetic of Victorian typography.
"Used as a phrase marker – thus – the en dash is set with a normal word space either side."
* Use the minus sign /−/ (U+2212) when formatting numbers, because the default hyphen-minus /-/ (U+002D) just looks wrong: "It is −1 °C vs. -1 °C." Moreover, the correct minus has the same width as plus (− vs. +).
* Rare, but use the figure dash /‒/ (U+2012) or figure space / / (U+2007) if you need a placeholder character that is the same width as a single digit. For example, "Guess the PIN: 1‒34."
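For anyone generating such numbers programmatically, here is a minimal sketch of the first bullet. The function name `format_signed` and its `digits` parameter are hypothetical, and the approach (format normally, then swap the sign character) is just one simple way to do it.

```python
MINUS_SIGN = "\u2212"  # the dedicated minus sign, not the ASCII hyphen-minus

def format_signed(value: float, digits: int = 0) -> str:
    """Format a number for display, using U+2212 for negative values."""
    text = f"{value:.{digits}f}"
    # The only "-" that fixed-point formatting can produce is the sign.
    return text.replace("-", MINUS_SIGN)

print(f"It is {format_signed(-1)} °C")
```

This is meant for display only; code that parses numbers back from text usually expects the ASCII character, so keep the substitution at the presentation layer.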
Somewhat off topic, however, I'm thoroughly convinced that there is a very high probability something is AI generated when I see Em dashes. Anyone else noticing this?
ChatGPT for example almost always uses them. I'm sure they are more common in academic writing, but it's now super common on boards like Reddit.
It is one of those things that doesn’t really matter for readability, but although they can’t necessarily put a finger on why, people may still notice that some documents or pages appear to be set with more care for details than others.
(edit: I guess if you don’t have to search on Google what the hell a ‘Microsoft Word’ is, then you’re officially old)
I read Butterick's Hyphens and dashes some years ago and it stuck with me. Now I regularly use hyphens, en dashes, and em dashes correctly—I even memorized the Unicode sequences and enter them seamlessly on Linux with Ctrl-Shift-U!
Em dashes without surrounding spaces are such an ugly relic; it triggers me to no end and is objectively wrong. The dash object is part of the sentence — not the two words it's separating.
We need a blog post documenting the ironic trend of people—themselves NPCs, actual human bots, just now realizing the em dash exists despite seeing it hundreds if not thousands of times before LLMs—flattering themselves by suggesting that anyone who understands the language at above a 5th grade level must be an LLM.
One point that is very rarely mentioned is how to place em dashes around quotation marks.
If the em dash indicates an interruption (not a planned pause) of the actual speech, the em dashes go inside the quotes (often just one, before the closing quote).
If the em dash is the narrator interjecting with additional information, the em dashes go outside the quotes.
Besides this, the question of where to put spaces when multiple forms of punctuation are combined can be quite a complex topic.
If you are looking for an alternative to kebab case for writing identifiers in a programming language that reserves - (U+002D) as an operator, chances are good you can use · (U+00B7 MIDDLE DOT), which we use in middot case.
So isMorePleasantToRead, is_more_pleasant_to_read or is·more·pleasant·to·read is up to you.
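Whether a given language accepts the middle dot depends on its identifier rules, so it is worth checking rather than assuming. As a rough sketch for Python specifically, `str.isidentifier()` reports whether a string is a valid identifier; the helper name `is_valid_name` and the candidate list are made up for illustration. Python's rules are based on Unicode identifier properties, which as far as I know cover U+00B7 as a continuation character, but the script prints the answer instead of taking that on faith.

```python
CANDIDATES = [
    "isMorePleasantToRead",
    "is_more_pleasant_to_read",
    "is-more-pleasant-to-read",  # kebab case: "-" is an operator
    "is\u00B7more\u00B7pleasant\u00B7to\u00B7read",  # middot case
]

def is_valid_name(name: str) -> bool:
    """Report whether Python would accept `name` as an identifier."""
    return name.isidentifier()

for name in CANDIDATES:
    print(f"{name!r}: {is_valid_name(name)}")
```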
I like em dashes and use “Option Shift -” to summon them on macOS. However, LLMs tend to overuse them and compose absurdly long sentences. While proofreading a draft, I often instruct an LLM to “keep the original tone intact and don’t create overly complex sentences by fusing together simple ones.” That usually gets the job done.
Writers adore their em dashes. While they can sometimes clarify a concept by adding more context, overusing them can hurt readability. I prefer to read Hemingway-esque sentences that just say what they want to say and end sharply. So that’s how I write too—and sometimes the overuse of em dashes directly conflicts with that, making the content sound as if the author is confused about what they wanted to convey.
I use em dashes all the time in writing, but unfortunately ChatGPT and co. use the em dash frequently—and most people use the em dash infrequently, not knowing how to type it on a keyboard—so it's starting to make my writing look AI-generated sometimes. I fear it'll have to go the way of words like "tapestry."
FWIW, you can type an em dash on Mac with shift + option + hyphen.
> The en dash is the least loved of all; it’s not easily rendered by the average keyboard user (one has to select it as a special character, whereas the em dash can be conjured with two hyphens)
on macOS:
- - => - (hyphen/minus)
- ⌥ - => – (en dash)
- ⇧ ⌥ - => — (em dash)
There are so many of these convenient typographical shortcuts that a long time ago I made Apple layouts for Windows and Linux.
And many are mnemonic too, like:
- of course ÷ (division) is ⌥ / (slash, which is poor man's division)
- of course ¿ is ⇧ ⌥ / because ⇧ / is ? so logically ⇧ ⌥ / is ⌥ ? which is ¿
- guess what ≤ ≥ ± ≠ are
- ¬ (logical negation) is ⌥ L because it's an L sideways
- £ (pound) is ⌥ 3 because ⇧ 3 is # (octothorpe, abused as sharp or pound - the other kind)
I use em-dashes correctly because a reader emailed me, and I was dreadfully embarrassed. You can actually see them become correct in my writing after the "I will pile drive you" AI thing.
It never occurred to me that doing this correctly might make people think I use LLMs in my writing.
Edit: I'm sure the many typos protect me from that, actually.
> If you want to be official about things, use the en dash to replace a hyphen in compound adjectives when at least one of the elements is a two-word compound.
How is a literal dictionary making fun of people who "wanna be official about things" lol. That's the entire basis for dictionaries themselves
I had one minor quarrel with this article: The use of spaces (of any kind) before and after the em dash or any dashes.
Personally, I am fond of using either a hair space or a thin space before and after the em dash. Not a full space!
To explore the various options, I wrote a little program to print the various combinations of dashes and spaces. I think what looks best depends a lot on what typeface you're using. But let's see how they look in the Verdana font used here. You should be able to paste this into your favorite word processor to see it in other fonts:
ASCII 0x2D hyphen-with no spaces
ASCII 0x2D hyphen - with U+200A hair spaces
ASCII 0x2D hyphen - with U+2009 thin spaces
ASCII 0x2D hyphen - with 0x20 full spaces
Unicode U+2010 hyphen‐with no spaces
Unicode U+2010 hyphen ‐ with U+200A hair spaces
Unicode U+2010 hyphen ‐ with U+2009 thin spaces
Unicode U+2010 hyphen ‐ with 0x20 full spaces
Unicode U+2013 en dash–with no spaces
Unicode U+2013 en dash – with U+200A hair spaces
Unicode U+2013 en dash – with U+2009 thin spaces
Unicode U+2013 en dash – with 0x20 full spaces
Unicode U+2014 em dash—with no spaces
Unicode U+2014 em dash — with U+200A hair spaces
Unicode U+2014 em dash — with U+2009 thin spaces
Unicode U+2014 em dash — with 0x20 full spaces
It looks like HN is really mangling this. Hair spaces are rendered wider than thin spaces?
If anyone wants to experiment, here is the Python code:
from dataclasses import dataclass

@dataclass
class Character:
    char: str
    name: str

# The dash characters to compare.
DASHES = [
    Character( "-", "ASCII 0x2D hyphen" ),
    Character( "\u2010", "Unicode U+2010 hyphen" ),
    Character( "\u2013", "Unicode U+2013 en dash" ),
    Character( "\u2014", "Unicode U+2014 em dash" ),
]

# The spacing placed on both sides of each dash.
SPACES = [
    Character( "", "no" ),
    Character( "\u200A", "U+200A hair" ),
    Character( "\u2009", "U+2009 thin" ),
    Character( "\x20", "0x20 full" ),
]

# Print every dash/space combination on its own line.
for dash in DASHES:
    for space in SPACES:
        print( f"{dash.name}{space.char}{dash.char}{space.char}with {space.name} spaces\n" )
If you're on Windows, install PowerToys and check out the Keyboard Manager. It lets you set up shortcuts. I overload my keys using right Alt for Greek letters (science stuff). Could do it for these dashes as well.
I used a lot of these, but actually stopped due to my text sometimes being called out as chatgpt output. I also throw in the occasional spelling mistake. If a piece of text on reddit/x has "–" (not "-") in it, you can be 95% sure it's an LLM.
For Windows users, PowerToys has a Quick Accent tool, that lets you type in an em dash or figure dash by holding down the hyphen (-) and then toggling the space bar. Interestingly, the en dash is not available.
I genuinely do not care one tiny bit about doing this right. At all. I will use the minus key for all of these like I always have and nothing bad will ever come of it. Find a better way to channel your limited energy.
0 0 000048 48 H LATIN CAPITAL LETTER H
1 1 00006F 6F o LATIN SMALL LETTER O
2 2 000077 77 w LATIN SMALL LETTER W
3 3 000020 20 SPACE
4 4 000074 74 t LATIN SMALL LETTER T
5 5 00006F 6F o LATIN SMALL LETTER O
6 6 000020 20 SPACE
7 7 000055 55 U LATIN CAPITAL LETTER U
8 8 000073 73 s LATIN SMALL LETTER S
9 9 000065 65 e LATIN SMALL LETTER E
10 10 000020 20 SPACE
11 11 000045 45 E LATIN CAPITAL LETTER E
12 12 00006D 6D m LATIN SMALL LETTER M
13 13 000020 20 SPACE
14 14 000044 44 D LATIN CAPITAL LETTER D
15 15 000061 61 a LATIN SMALL LETTER A
16 16 000073 73 s LATIN SMALL LETTER S
17 17 000068 68 h LATIN SMALL LETTER H
18 18 000065 65 e LATIN SMALL LETTER E
19 19 000073 73 s LATIN SMALL LETTER S
20 20 000020 20 SPACE
21 21 000028 28 ( LEFT PARENTHESIS
22 22 002013 E2 80 93 – EN DASH
23 25 000029 29 ) RIGHT PARENTHESIS
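A dump like the one above can be reproduced with a few lines of Python. This is a sketch that guesses at the column layout from the listing (character index, UTF-8 byte offset, code point, UTF-8 bytes, the character, its Unicode name); the function name `dump` is made up, and the original was presumably produced by some other tool.

```python
import unicodedata

def dump(text: str) -> list:
    """Return one row per character, mimicking the listing above."""
    rows = []
    byte_offset = 0
    for index, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        hex_bytes = " ".join(f"{b:02X}" for b in encoded)
        name = unicodedata.name(ch, "<unnamed>")
        rows.append(
            f"{index} {byte_offset} {ord(ch):06X} {hex_bytes} {ch} {name}"
        )
        # Multi-byte characters push later byte offsets ahead of the index.
        byte_offset += len(encoded)
    return rows

for row in dump("(\u2013)"):
    print(row)
```

The byte offset jumping from 22 to 25 in the listing is the three-byte UTF-8 encoding of the en dash at work.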
I'm just gonna say it: this does not matter. Just use whatever you want. If you're afraid that someone is going to think less of you for it: the people who matter won't.
I could never remember which was the longer dash. Now it's easy, because the en dash – is the approximate length of a capital N, and an em dash — is the approximate length of a capital M. Today I Learned!
I use the hyphen key, and hit it once for a hyphen or for a minus sign, and I use it twice for an em dash.
At some point, many things I type into started replacing "--" with an em dash, but my precambrian computer typing muscle memory is fine with "hyphenhyphen" meaning "em dash".
I will admit right here in front of god & everybody that I'm pretty sure I've never typed an en dash at all.
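The automatic replacement described here is easy to approximate. The following is a minimal sketch, not how any particular editor does it: the name `smarten_dashes` is hypothetical, and the rule (exactly two hyphens become one em dash, longer runs are left alone) is an assumption.

```python
import re

EM_DASH = "\u2014"

# Exactly two hyphens, not part of a longer run such as "---".
_DOUBLE_HYPHEN = re.compile(r"(?<!-)--(?!-)")

def smarten_dashes(text: str) -> str:
    """Replace each standalone double hyphen with an em dash."""
    return _DOUBLE_HYPHEN.sub(EM_DASH, text)

print(smarten_dashes("I typed it twice--like always."))
```

A rule this simple would also rewrite things like command-line flags (`--verbose`), which is one reason to apply it only to prose.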
Fun fact: In Portuguese, the em dash is often used to introduce direct discourse, much like double quotes are used in English, but only when the direct discourse opens the paragraph. So instead of:
I’m all about spelling things correctly. To, too, two or their, there, they’re matter. But using the correct dash/hyphen is way too pedantic to me. In isolation, I can’t tell the difference between them.
I simply do not care. I will just use - (the one next to zero on the keyboard) everywhere. There are a grand total of zero situations where using one in place of the other hampers information reconstruction or reading comprehension (although the latter is subjective, I suppose)
Simple reminder for those who don't know this: the easiest way to insert an em dash in a Vim-style editor (or Evil mode in Emacs) is to use the digraphs feature. In insert mode you'd press Control+k, then type a digraph sequence. For the em dash it is `C-k -M` — you literally type: "Control+K minus capital M".
For vanilla Emacs (without evil-mode), you can always do — "C-x 8 RET EM DASH" or "C-x 8 RET 2014". That's what "M-x describe-char" would tell you.
Most people don't use the em dash. It's too hard to type and looks too similar to a hyphen.
As a result, a hallmark of GPT-generated text is its (over)use of the em dash--I have stopped using it for this reason and just use two hyphens now instead.
How to Use Em Dashes (—), En Dashes (–), and Hyphens (-)
(merriam-webster.com) 646 points by Stratoscope, 27 March 2025 | 462 comments
Comments
¹https://archive.org/details/isbn_9780881791327/page/80/mode/...
At least we have dedicated O/0, and l/1 keys now. But we still see a lot of "straight" quotes instead of “those smart quotes Microsoft Word likes to generate”. And dashes. Did you know there is a dedicated ellipsis character? This is often set with slightly more space between dots than ..., and it by definition never wraps across a line between those dots. You still see (C) instead of ©.
https://practicaltypography.com/hyphens-and-dashes.html
Who omits the 1 from the second number?! That is awful!
Also Merriam-Webster:
1) they are too hard to type.
2) using them without surrounding thin space or hairspace breaks the horizontal rhythm and draws unnecessary attention to the punctuation; but thin and hair spaces are equally hard to type
3) Most people write markdown with mono space fonts, making these dashes and spaces indistinguishable.
There's room for both: when presentation matters I use them; when it doesn't, I don't.
Do not use the Unicode characters, or people will think you are an AI bot.
"Hello," said John, "how are you today?"
You'd see:
— Hello — said John — how are you today?
* em dash: ⌥ + ⇧ + - (alt + shift + hyphen)
* en dash: ⌥ + - (alt + hyphen)
> comma, a colon, or parenthesis
They're all different. There is a difference between clear writing and typesetting. Why mix them up? A narcissism of small differences?