Yet the article goes about the most ass backward way of explaining 8086 segments and constructs a convoluted mental picture of dividing memory into overlapping chunks.
It's really, really simple: segments on the 8086/88 are 64k sliding windows into a 1M address space. You can move them around at 16-byte granularity.
You need more than 64k for code + data? No problem, the CPU knows when it's fetching an instruction vs when it's fetching data, you can have two sliding windows: code (CS) and data (DS). Split them apart, and it's not much different than a Harvard-style machine and gives you access to more than 64k at a time.
Still need more? No problem, the CPU has a hardware stack with dedicated push/pop/call/ret instructions and a base pointer for stack indexing. It knows when it's accessing the stack, so we can split the data window into regular data (DS) and stack data (SS). Oh, you occasionally want to copy stuff between segments or somewhere else in memory? Well, to encode 3 segments we need 2 bits anyway, let's throw in an extra data window (ES) and some DS-to-ES copy instructions.
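The sliding-window picture in this comment can be sketched in a few lines. This is only an illustration of the real-mode address arithmetic (segment times 16 plus offset, truncated to 20 bits); the names `physical_address` and `window` are made up for the example and are not from the article.

```python
ADDRESS_MASK = 0xFFFFF  # the 8086/8088 has 20 address lines (1M)


def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment:offset pair."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & ADDRESS_MASK


def window(segment: int) -> tuple:
    """Return the first and last physical address reachable through a segment.

    For segments near the top of memory the last address wraps past 1M,
    because the sum is truncated to 20 bits.
    """
    return physical_address(segment, 0x0000), physical_address(segment, 0xFFFF)


if __name__ == "__main__":
    first, last = window(0x1000)
    print(hex(first), hex(last))
    # Moving the segment register by 1 slides the window by 16 bytes.
    print(window(0x1001)[0] - window(0x1000)[0])
```

Loading a different value into CS, DS, SS or ES just picks a different starting point for the same kind of 64k window.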
I seem to recall at the time that flat memory was self-evidently a better idea. It's not like people were sitting around going "gee I can't think of any better way to do memory addressing than this" until some genius suggests "how about flat?!?!?" Everyone knew flat was best but were stuck with 8086 crap.
1992-me hates the author. Coming from 68k assembly, x86 was a nightmare. And together with the ridiculous number of registers, segments made up a huge chunk of that horrible experience.
The segment model seems clever if you assume that you never have an object that is larger than 64kb. And once you have that you need to care about segment overflow, pointer comparisons no longer work, everything now has to carry around segment+offset instead of just offset, and so on. And if you want an example of a >=64kb object - the html alone for that page is one.
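To make the pointer-comparison and segment-overflow complaints concrete, here is a rough sketch. It assumes plain real-mode addressing; the helper names (`linear`, `naive_equal`, `same_byte`, `near_increment`) are hypothetical and only there for illustration.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode physical address: segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def naive_equal(a: tuple, b: tuple) -> bool:
    """Compare two (segment, offset) pairs field by field."""
    return a == b


def same_byte(a: tuple, b: tuple) -> bool:
    """Compare two (segment, offset) pairs by the byte they actually address."""
    return linear(*a) == linear(*b)


def near_increment(ptr: tuple, delta: int) -> tuple:
    """Advance only the 16-bit offset, the way 16-bit offset arithmetic does."""
    segment, offset = ptr
    return segment, (offset + delta) & 0xFFFF


if __name__ == "__main__":
    a = (0x1234, 0x0010)
    b = (0x1235, 0x0000)
    # Different bit patterns, same byte in memory.
    print(naive_equal(a, b), same_byte(a, b))

    # Stepping past the end of a segment wraps to its start instead of
    # moving on to the next 64k.
    print(near_increment((0x2000, 0xFFFF), 1))
```

Any object that crosses a 64k boundary needs code that also adjusts the segment, which is the extra bookkeeping the comment is describing.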
What is the difference between the segmentation model used by Intel and the banking model used by a lot of consoles? I've worked with the code of a couple of NES and GBC games, and while banking could be annoying, I never saw it as a particularly difficult model to follow and use. It did require more planning for the various functionality, but it wasn't even the most complex or difficult thing about developing for consoles.
The author assumes the way to keep segments going with larger memory would have been to change the amount of overlap, but it would have also been possible to make an 80286 where the segment registers were > 16bits, and everything else is 16 bits like before. Now you have extra segments that are still paragraphs apart and existing software could still function (you'd need new instruction variants to move data into and out of the enlarged portion of the segment registers.. call them ECS, EDS, or whatever). Anyway, just a thought.
It might have worked better if x86 had general-purpose registers where every register could work as a segment. Or maybe just many more segment registers. But with only two data segment registers to play with and quite cumbersome (and slow!) loads, most software just chose not to bother.
They could have used 16 bit segments with no overlap. It would have a 16 bit offset register + a 16 bit segment selector register with the top 12 bits reserved (always 0). 16 bit software would run as usual in a single segment, while larger programs would use both registers for 20 bit addresses.
286 could then use the next 4 bits from the segment register to allow 16 MB address space and 386 could use all of them for 4GB. And wouldn't it be nice if 386 had 64KB pages (1 segment)?
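A quick sketch of how this hypothetical non-overlapping scheme might compute addresses. Nothing like this ever shipped: `flat_address` and `selector_bits` are invented for the example, and masking off the reserved bits is an assumption about how such a CPU would treat them.

```python
def flat_address(selector: int, offset: int, selector_bits: int) -> int:
    """Concatenate the usable selector bits with a 16-bit offset.

    selector_bits is the number of low selector bits the CPU honours:
    4 for the imagined 8086 (20-bit addresses), 8 for the imagined 286
    (24-bit), 16 for the imagined 386 (32-bit). Reserved upper bits are
    assumed to be ignored here.
    """
    usable = selector & ((1 << selector_bits) - 1)
    return (usable << 16) | (offset & 0xFFFF)


if __name__ == "__main__":
    print(hex(flat_address(0x000F, 0xFFFF, 4)))   # top of a 1 MB space
    print(hex(flat_address(0x00FF, 0xFFFF, 8)))   # top of a 16 MB space
    print(hex(flat_address(0xFFFF, 0xFFFF, 16)))  # top of a 4 GB space
```

Because segments never overlap in this scheme, every byte has exactly one selector:offset name, and the pair is just a flat address split across two registers.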
Did anyone else find the AI written style of this offputting?
The original 20 bit vision of the 8086 was when memory was very expensive and they expected typical high end machines to have 128K of memory.
Intel’s assembler was designed so you could have up to 128K of code with a “shared” segment in the middle that either side could reach with near (16 bit only) pointers to call commonly shared routines, and more rarely executed code existed on either end.
In addition data could be its own segment, and/or memory mapped I/O outside of the 128K space.
But memory got so cheap that nobody bothered with this, and the performance gains of writing code that way wasn’t worth the effort. X86 code was compact enough most programs could cram their code into 64k anyway, or 64k per functional unit with calls between them being rare.
The real tragedy is they went for 20 bit instead of 24 bit. 8086 with 16MB of addressable space would have been a very different world and would have made little difference in their use. (Paragraphs would have been 256 bytes, the same size as a page; most data structures would have been fine with that.)
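For comparison, the arithmetic behind the 20-bit versus 24-bit point, as a small sketch. The `shift` parameter is an assumption used to model both cases: 4 is what the real 8086 does, 8 is the what-if with 256-byte paragraphs.

```python
def physical_address(segment: int, offset: int, shift: int) -> int:
    """Return (segment << shift) + offset, truncated to 16 + shift address bits."""
    address_bits = 16 + shift
    return ((segment << shift) + offset) & ((1 << address_bits) - 1)


def address_space_bytes(shift: int) -> int:
    """Total bytes reachable with a 16-bit segment shifted left by `shift`."""
    return 1 << (16 + shift)


if __name__ == "__main__":
    for shift in (4, 8):
        paragraph = physical_address(1, 0, shift)  # distance between adjacent segments
        print(shift, paragraph, address_space_bytes(shift))
```

The instruction set and the 16-bit registers stay the same in both cases; only the segment granularity and the total reach change.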
For its time it was a decent idea. Software was smaller and simpler. But today (and even before 64-bit) software is larger, more complex, we also need memory protection / isolation and more flexible memory allocation / sharing, so paging memory was not introduced for nothing.
I had to use it to do image processing on a 256MB image buffer back in the 1980s in assembly language. It was absolutely hideous. Give me a flat 32 bit memory address space any day (e.g. MC68000 around the same time.)
Could have been fixed with an ADC-type instruction that operated on segments.
Imagine if you could have done something like this:
    add si, some-delta
    adsc es, 0
in order to move a seg:ofs ptr forward by 'some-delta' bytes.
ADSC (add with segment carry) would do:
    segreg := segreg + imm + 1000h (if carry)
or:
    segreg := segreg + imm (no carry)
Maybe there should also have been an instruction to normalize a seg:ofs ptr (so the new offset was in the 0-15 range).
ADSC could have been adapted for the 286 with ease, as long as a specific layout of the segment descriptor tables was mandated (probably with 10h instead of 1000h in protected mode).
Edited slightly for clarity (ofs => imm).
A normalizing instruction would be harder to do right for the 286 because you don't want to spend too many slots in the descriptor table(s) for a single memory object.
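Here is a rough emulation of what the proposed ADSC and normalize instructions would do in real mode. ADSC is hypothetical (it was never an x86 instruction); `add_far`, `normalize` and the `seg_step` parameter are names invented for this sketch, with `seg_step` defaulting to the 1000h mentioned above.

```python
def add_far(segment: int, offset: int, delta: int, seg_step: int = 0x1000) -> tuple:
    """Emulate `add si, delta` followed by the hypothetical `adsc es, 0`.

    delta is assumed to be an unsigned 16-bit value. If the offset addition
    carries out of 16 bits, the segment is bumped by seg_step (1000h
    paragraphs = 64k bytes in real mode).
    """
    total = offset + delta
    carry = total > 0xFFFF
    new_offset = total & 0xFFFF
    new_segment = (segment + (seg_step if carry else 0)) & 0xFFFF
    return new_segment, new_offset


def normalize(segment: int, offset: int) -> tuple:
    """Fold the offset into the segment so the new offset is in the 0-15 range."""
    return (segment + (offset >> 4)) & 0xFFFF, offset & 0xF


if __name__ == "__main__":
    print(add_far(0x2000, 0xFFF0, 0x0020))  # crosses a 64k boundary
    print(add_far(0x2000, 0x0010, 0x0020))  # stays inside the segment
    print(normalize(0x2000, 0xFFF0))
```

In protected mode the same idea would need a different `seg_step` and a fixed descriptor table layout, as the comment notes.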
The author starts from the weird assumption that software is some annoying byproduct of making hardware, rather than the fact that hardware is made to run software and making that easier is a goal.
It was just a hack. A hack to delay migration to a 32-bit architecture. An effective one, but a hack nonetheless.
Wow.... I remember writing 8086 assembly on MASM and another assembler I've forgotten the name of, and then also doing inline ASM in Turbo C++
The segment thing and the convoluted different pointer math caused real gymnastics if you ever had data bigger than 64k... such as images.
I always thought of the segments as windows of 64k but moving between those windows, esp with the limited register set, required some real mental gymnastics.
I blame 8086 segmented memory and the rest of its horrid architecture for why no one liked programming in assembly language. There were other elegant RISC machines with flat memory models and large general register sets that were a complete joy to program. Memory paging allowed you to do everything that segmented memory provided and left the programmer unbothered for the most part.
Nope. It was bad. It left computers of the 286/386 era with RAM above 1MB sitting there and doing nothing. It took years to transition to DOS/4G and then finally to a 32-bit OS with Windows 95.
> What we needed, in hindsight, was to treat segments as true selectors — opaque handles with no arithmetic meaning. If you can’t assume the next segment is 16 bytes ahead, you’re forced to use segmentation as intended.
Except we couldn't.
If we made each segment isolated from the others, we would waste a lot of memory, because memory would be allocated a whole segment at a time.
If we made each segment dynamic, we would need something to manage them.
8086 Segmented Memory was a good idea
(owl.billpg.com) 63 points by billpg 21 June 2026 | 140 comments
Comments
How is that compatible with an array and a simple implementation of the index operator?
Segmented memory (on hardware that supported segment permissions) was used to good effect in Multics as well.
No, it wasn't
It's the "great idea" that sounds great 5 min in and horrible 10min afterwards
You know, kinda like using null as a string end character
But more importantly it kept the x86 world for too long in that dead end that was 8086 mode programming
"Oh if developers would just..." They won't. They haven't. And they will not ever.
In hindsight maybe a binary-level translator from 8080 to 8086 would have worked better (and been simple enough).
Except we couldn't. If we made each segment isolated from the others, we would waste a lot of memory, because memory would be allocated a whole segment at a time.
If we made each segment dynamic, we would need something to manage them.
This "hindsight" is just an MMU in disguise.