
hackclub / putting-the-you-in-cpu


A technical explainer by @kognise of how your computer runs programs, from start to finish.

Home Page: https://cpu.land

License: MIT License

Languages: MDX 74.42%, Astro 15.64%, CSS 8.51%, JavaScript 1.11%, TypeScript 0.32%
Topics: cpu, elf, linux, linux-kernel

putting-the-you-in-cpu's Introduction

Putting the "You" in CPU

A technical explainer of how your computer runs programs, from start to finish.

by @kognise and @hackclub


From the beginning...

I've done a lot of things with computers, but I've always had a gap in my knowledge: what exactly happens when you run a program on your computer? I thought about this gap — I had most of the requisite low-level knowledge, but I was struggling to piece everything together. Are programs really executing directly on the CPU, or is something else going on? I've used syscalls, but how do they work? What are they, really? How do multiple programs run at the same time?

[Illustration: A scrawled digital drawing. Someone with long hair is confused as they peer down at a computer ingesting binary. Suddenly, they have an idea! They start researching on a desktop computer with bad posture.]

I cracked and started figuring as much out as possible. There aren't many comprehensive systems resources if you aren't going to college, so I had to sift through tons of different sources of varying quality and sometimes conflicting information. A couple weeks of research and almost 40 pages of notes later, I think I have a much better idea of how computers work from startup to program execution. I would've killed for one solid article explaining what I learned, so I'm writing the article that I wished I had.

And you know what they say... you only truly understand something if you can explain it to someone else.

In a hurry? Feel like you know this stuff already?

Read chapter 3 and I guarantee you will learn something new. Unless you're like, Linus Torvalds himself.


Continue to Chapter 1: The "Basics" »
(cpu.land)

putting-the-you-in-cpu's People

Contributors

chocorho, davidwalschots, dependabot[bot], kognise, nklymok, omerbaddour, remicmacs, terjewiigmathisen, ulfsauer0815, volker-weissmann, wkhere, zachlatta


putting-the-you-in-cpu's Issues

Coop multitasking correction/clarification

I just read chapter 2 and I really like the article so far.

I wanted to add a bit of clarification regarding the coop multitasking:

Rather than the OS deciding when to preempt programs, the programs themselves would choose to yield to the OS. They would trigger a software interrupt to say, “hey, you can let another program run now.” These explicit yields were the only way for the OS to regain control and switch to the next scheduled process.

The OS would use any system call as an opportunity to check whether the process had consumed its time slice and to switch to another process. The yield system call was there for the case where your program doesn't need to make any system calls (e.g. it's doing some number crunching and all the data it needs is already in memory), but you still want to be a good citizen and give other processes a chance to run.
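
For illustration, a minimal sketch of that "good citizen" pattern using sched_yield(), the modern POSIX descendant of an explicit cooperative yield (the function and loop are illustrative, not from the article):

```c
#include <sched.h>

/* Pure in-memory number crunching that makes no system calls on its own,
   but periodically offers the CPU back to the scheduler anyway. */
void crunch_numbers(double *data, long n) {
    for (long i = 0; i < n; i++) {
        data[i] = data[i] * data[i];
        if (i % 1000000 == 0)
            sched_yield();   /* voluntary yield; optional on a preemptive OS */
    }
}
```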

Potentially unclear explanation of register usage in chapter 4 - Becoming an elf lord

The kernel is almost ready to return from the syscall (remember, we’re still in execve). It pushes the argc, argv, and environment variables to the stack for the program to read when it begins.

The registers are now cleared. Before handling a syscall, the kernel stores the current value of registers to the stack to be restored when switching back to user space. Before returning to user space, the kernel zeroes this part of the stack.

Finally, the syscall is over and the kernel returns to userland. It restores the registers, which are now zeroed, and jumps to the stored instruction pointer. That instruction pointer is now the starting point of the new program (or the ELF interpreter) and the current process has been replaced!

When I first read this I was confused as to the order of operations. After a few reads I thought maybe it went like this

  1. execve starts
  2. register values copied onto stack
  3. execve almost finishing up
  4. register values copied back into registers
  5. memory that held those values is zeroed

I was going to open a PR correcting this but then realised I wasn't sure if I was right. Is it actually that the memory is zeroed and then zeroes are copied back into the registers? That didn't seem right to me ("It restores the registers, which are now zeroed").
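
(For context, a minimal sketch of the userland side of the call under discussion; the path and arguments are just illustrative. On success execve never returns, because the calling program has been replaced.)

```c
#include <unistd.h>

int main(void) {
    char *argv[] = { "/bin/echo", "hello from the new program", (char *) 0 };
    char *envp[] = { (char *) 0 };

    execve("/bin/echo", argv, envp);  /* only returns if the call failed */
    return 1;
}
```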

Anyway, would love to be corrected!

P.S. thanks so much for these blog posts, they're awesome

Dark mode

Adding a dark mode feature for a site intended for reading can be useful. I would be happy to take it up and contribute here if needed.

EPUB Version

If possible, I would appreciate it if you could provide an EPUB version of this text as well. I think this format might be more suitable for reading on electronic devices. Thank you for taking the time to make this useful information available to me. I hope you can provide an EPUB copy too.

Should I mention PC?

From dreamcompiler on HN:

"The CPU stores an instruction pointer which points to the location in RAM where it’s going to fetch the next instruction."

This is also called the Program Counter or PC outside the Intel universe. This is confusing as "PC" also stands for "Personal Computer" but people who learned computing in the days before Intel became popular still call it the PC register.

My response:

I know about the Program Counter terminology, and explicitly chose not to use it to be more architecture-independent... but maybe it was a mistake not mentioning it at all, considering it's such absurdly prevalent terminology.
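
(For the curious, a minimal sketch of peeking at that register from a running program; x86-64 with GCC/Clang inline assembly is assumed, and this is not from the article.)

```c
#include <stdio.h>

int main(void) {
    void *ip;
    /* An RIP-relative LEA with zero displacement yields the address of the
       next instruction, i.e. roughly where the instruction pointer (the
       "program counter") points right now. */
    __asm__ volatile ("lea (%%rip), %0" : "=r" (ip));
    printf("currently executing near %p\n", ip);
    return 0;
}
```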

Cooperative multitasking is still a thing

Hi,

Just noted that at the end of Chapter 2 it reads: “For these reasons, the tech world switched to preemptive multitasking a long time ago and never looked back.”
It looks like a bit of an overstatement, because concepts like coroutines and green threads are quite popular nowadays, and we even have quite popular languages built on top of them (Go with its goroutines).
I appreciate that language-runtime-level cooperative multitasking is not the same as OS-level, but it's still worth mentioning, I think.

Overall, a great write-up that I very much liked to read!

thanks!

Questions about Chapter 1

Programs can’t directly switch privilege levels; hardware interrupts are safe because the processor has been preconfigured by the OS with where in the OS code to jump to.

This is the first time a hardware interrupt is mentioned. Does this mean other kinds of interrupts exist too? Are there any differences in how programs use them?

When this kernel code finishes, it tells the CPU to switch back to user mode and return the instruction pointer to where it was when the interrupt was triggered. This is accomplished using an instruction like IRET.

From this, my understanding is that IRET is used by kernel code to transfer control back to user space. But that seems to contradict this paragraph:

Programs can delegate control to the OS with special machine code instructions like INT and IRET.

Can user code call IRET? But it's already in user space, and it can't access kernel space, so how does that work?
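
(For reference, a minimal sketch of the user-mode half of that transition on x86-64 Linux. Modern 64-bit code uses the dedicated `syscall` instruction rather than `INT`, but the idea is the same: user code asks to enter the kernel, and only the kernel can later drop back to user mode. Register assignments follow the x86-64 Linux syscall ABI.)

```c
#include <stddef.h>

/* write(2) performed with a raw `syscall` instruction: the syscall number
   goes in rax, arguments in rdi/rsi/rdx; the kernel clobbers rcx and r11. */
static long raw_write(int fd, const void *buf, size_t len) {
    long ret;
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "a" (1L /* __NR_write */), "D" ((long) fd), "S" (buf), "d" (len)
                      : "rcx", "r11", "memory");
    return ret;
}

int main(void) {
    raw_write(1, "hello from user mode\n", 21);
    return 0;
}
```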

Configure DNS with hackclub/dns

Right now cpu.land has its DNS managed using the [email protected] Google Domains account. We want to switch this to having DNS managed using https://github.com/hackclub/dns.

Here are the steps to do that:

  1. Share the domain with [email protected] on Google Domains (this is not required for hackclub/dns, just so I have access too)
  2. Set up DNSimple to manage cpu.land's records (credentials in 1Password)
  3. Set up hackclub/dns to use the DNSimple API to manage the domain using OctoDNS. 90% of the work should already be done, just look at how other Hack Club domains are managed in that repo.

Thank you!!

Possibility of discussion of ld.so/dyld/etc. behavior

I believe the end of chapter 4 deserves a quick discussion of how control gets from the entry point of the dynamic linker to the entry point of the executable. Including these fun tidbits (for some values of fun, anyway):

  • The dynamic linker needs to manage data structures, allocate memory, and perform an awful lot of string operations in particular. So it needs access to libc functionality. But it can't use the shared libc everyone else uses: it is going to require that functionality prior to being able to load any dynamic library itself! As a result, the dynamic linker has its own copy of (a subset of) the libc statically linked into it: its only dependency is, understandably, the kernel. This is one of the reasons why on Linux the dynamic linker is actually provided by the folks who provide the libc. And this is the reason all static linkers still need to support building fully self-contained, statically linked binaries, where even system libraries are statically linked (which is discouraged for almost all code): in order to build the dynamic linker itself.
  • While the kernel is responsible for interpreting the ELF commands for the executable and the dynamic linker (if applicable), on the other hand it is not in charge of interpreting the dynamic libraries themselves: the only visibility it has into these is the mmap() calls, performed by the dynamic linker, specifying (a subrange of) them as backing, allowing that memory to be shared cross-process. This means the dynamic linker has to have its own ELF parser, independently of the kernel's: everything else with regard to loading dynamic libraries in memory is its responsibility.
  • That a process is provided its own address space for exclusive use enables code in the main executable to be compiled in a position-dependent fashion. At least, in theory: security considerations such as ASLR mean most executables are position-independent these days. But dynamic libraries have no such choice and must consist of position-independent code because, even if there are systems for preferentially loading them at a certain address, there is no guarantee that this virtual address range will be available by the time they are loaded: another dynamic library might have been loaded there first for instance. In which case the bumped dynamic library will need to be loaded at a non-preferred virtual address and work anyway.
  • .init and .fini sections
  • for bonus points, the GOT, the PLT, and relocation entries.
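
To make the dynamic linker's runtime role concrete, here is a minimal sketch that exercises the same machinery through its public API (Linux/glibc assumed; link with -ldl on older glibc versions):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Ask the dynamic linker to map a shared library and resolve a symbol,
       much as it does at startup for the libraries an executable depends on. */
    void *libm = dlopen("libm.so.6", RTLD_NOW);
    if (!libm) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    double (*cosine)(double) = (double (*)(double)) dlsym(libm, "cos");
    if (!cosine) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return 1;
    }

    printf("cos(0) = %f\n", cosine(0.0));
    dlclose(libm);
    return 0;
}
```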

Feedback from still_grokking on HN

The only thing that I miss a little bit is a kind of "disclaimer" that what gets presented is "just" the result of a market race, and not how computers necessarily need to work. Without going into the details of possible hardware architectures and implementations, even at the "user-facing" level (the operating system and application layer) things can look very, very different. Just as an example: https://en.wikipedia.org/wiki/Genera_(operating_system)

PDF version

Thanks for this awesome source of information.

Do you think it is feasible to also "release" a PDF version of it?

Is this correct?

The chapter on multitasking says -

"The target latency should be equal to the time it takes for a process to resume execution after being preempted."

Is this correct? AFAIK the target latency only needs to be at least as long as the time it takes for a process to resume execution after being preempted.

PNG image optimization

I'm currently working on optimizing the PNG images used for the website by lossy (not the JPEG kind) means: converting each from RGB to indexed colors and reducing its total unique colors to a third of the original. Since browsers downscale the images to a lower resolution anyway, that interpolation hides the colors lost by this lossy optimization process.

As a trial, I managed to trim off a total of ~128 KB (kilobytes) from all 6 images used in chapter 1.

This is a sample of one of them:
[image: syscall-architecture-differences-indexedlossy-o]

And this is the unoptimized version currently used:
[image: syscall-architecture-differences]

Should I continue ahead and later submit a pull request for that?

Add bookmarks to the PDF edition

Thank you for the PDF version. Is it possible to add bookmarks to help jump to specific topics / a table of contents, in line with the 7 chapters in the article? Thank you.

@ekoome in #11

Faggin made the first *microprocessor*

From dreamcompiler on HN:

"The first mass-produced CPU was the Intel 4004, designed in the late 60s by an Italian physicist and engineer named Federico Faggin."

The first microprocessor (CPU on a single chip) was Faggin's Intel 4004, but mass-produced CPUs existed before that. Earlier CPUs were built from multiple chips, and before that multiple individual transistors, and before that multiple vacuum tubes, and before that multiple relays (although it's fair to say that relay computers were never mass-produced).

Time Slicing Diagram

Chapter 2 discusses how target latency works and provides a diagram for clarification. According to the explanation:

The target latency is the time it takes for a process to resume execution after being preempted...

Based on this explanation, shouldn't the target latency in the diagram start at the beginning of Process 2 and finish at the end of Process 3, as shown in the following image?

[image: linux-scheduler-target-latency]

It also presents an approach to calculate the timeslices based on a specific target latency:

Timeslices are calculated by dividing the target latency by the total number of tasks.

Suppose we have 3 processes and the target latency is 9 ms, meaning there can be a 9-millisecond gap between the bursts of each process. Using the given method, we divide 9 by 3 and find that each process can run for 3 milliseconds and must then wait 6 milliseconds for the other processes. This contradicts the definition of the target latency.
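
(For what it's worth, the arithmetic in question as a tiny sketch; equal task weights are assumed, which the real scheduler does not require.)

```c
#include <stdio.h>

int main(void) {
    double target_latency_ms = 9.0;  /* example value from this issue */
    int    runnable_tasks    = 3;

    double timeslice_ms = target_latency_ms / runnable_tasks;  /* 3 ms */
    double wait_ms      = target_latency_ms - timeslice_ms;    /* 6 ms */

    printf("timeslice: %.1f ms, wait before running again: %.1f ms\n",
           timeslice_ms, wait_ms);
    return 0;
}
```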

in english?

I have many books in English; please write this in Polish or another language (Esperanto).

Reference on the MacOS "split"

https://fahrplan.events.ccc.de/congress/2007/Fahrplan/events/2303.en.html (first attachment; the second attachment is the slides) provides a good summary of the behavior of the MacOS (then known as Mac OS X) kernel, XNU, including the memory space provided to processes. It was current as of 32-bit Mac OS X and support of 64-bit processes by a 32-bit kernel, but not current with regard to the 64-bit-address-space kernel (AKA K64 in MacOS circles).

So I suggest you include the 4/4 "split" of Mac OS X next to the 3/1 and 2/2 splits found in operating systems of that vintage as an illustration, but not necessarily dwell on it any further, since these splits are less impactful than they once were. Indeed, the main point was to avoid significant memory remapping operations when crossing the userspace/kernel border (except for pre-K64 Mac OS X), but all that went out the window anyway with Meltdown, at which point it was realized that keeping kernel memory mapped while in userspace, even with forbidden access, was not hygienic. That meant all operating systems were modified to unmap kernel pages when dropping to userspace (and to remap them upon kernel entry), except for a small set of always-mapped pages from which the kernel mappings can be rebootstrapped upon kernel entry, just like pre-K64 Mac OS X.

A sentence phrasing change

https://github.com/hackclub/putting-the-you-in-cpu/blob/366ef51c7137e824596595a2d56ebcc0c67cef71/src/content/chapters/1-the-basics.mdx#L75C1-L75C106

Is there anything wrong with the phrasing of this statement? Should there be something like "makes sure that" after the closing parenthesis? I don't know; it seemed a little out of place, so I considered raising an issue here.

Also, it's a great read; I really appreciate your writing here. As a beginner to systems programming from a non-CS background, this seems like a perfect resource to start with. Still reading it though, hope to complete it soon :)

fix typo suggestion

- Programs can't directly switch privilege levels; hardware interrupts are safe because the processor has been preconfigured *by the OS* with where in the OS code to jump to. The interrupt vector table can only be configured from kernel mode.

Please correct me if I'm wrong, but I think this should say software interrupts instead of hardware interrupts.
