
Thanks, particularly for the tip about P-code. Are you aware of any interpreters/assemblers for P-code? The Wikipedia page didn't seem to have any links.


A quick google search pulls up:

https://gist.github.com/r-lyeh/0af42b2788bb75219061

http://scara.com/~schirmer/o/pcodevm/

Here's another ref: http://homepages.cwi.nl/~steven/pascal/book/pascalimplementa...

But really, all you need is an understanding of the ISA (i.e. the instruction set and the machine it runs on) and a file format in mind for a middle layer, and you can build a runtime. (Basically, P-code is one form of ISA that's supposed to be portable.) That's kind of what insomniac does with a custom ISA:

ISA: https://github.com/arlaneenalra/insomniac/blob/master/insomn... (a little FORTH-happy...)

Assembler Core: https://github.com/arlaneenalra/insomniac/tree/master/src/li...

VM eval loop: https://github.com/arlaneenalra/insomniac/blob/master/src/li...

and Instructions: https://github.com/arlaneenalra/insomniac/tree/master/src/li...

https://github.com/arlaneenalra/insomniac/blob/master/src/li...

It helps to build the assembler and VM backend first, so you have a pipeline in place that you can use to add ops as you discover they are needed.
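
To make the "assembler and VM first" idea concrete, here is a minimal sketch in Python. The ISA (PUSH, ADD, MUL, DUP, HALT), the opcode numbers, and the byte layout are all made up for illustration; none of it is taken from insomniac or from real P-code.

```python
import struct

# Hypothetical toy ISA: mnemonic -> (opcode byte, takes a 32-bit operand?)
OPS = {
    "PUSH": (0x01, True),
    "ADD":  (0x02, False),
    "MUL":  (0x03, False),
    "DUP":  (0x04, False),
    "HALT": (0xFF, False),
}
# Reverse table used by the VM to decode opcode bytes.
DECODE = {code: (name, has_arg) for name, (code, has_arg) in OPS.items()}

def assemble(source):
    """Turn text like 'PUSH 2' into flat bytes (the 'middle layer' format)."""
    out = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()   # ';' starts a comment
        if not line:
            continue
        parts = line.split()
        code, has_arg = OPS[parts[0].upper()]
        out.append(code)
        if has_arg:
            # Operands are stored as little-endian signed 32-bit integers.
            out += struct.pack("<i", int(parts[1]))
    return bytes(out)

def run(program):
    """Fetch/decode/execute loop; returns the final stack."""
    stack, pc = [], 0
    while pc < len(program):
        name, has_arg = DECODE[program[pc]]
        pc += 1
        if has_arg:
            (arg,) = struct.unpack_from("<i", program, pc)
            pc += 4
        if name == "PUSH":
            stack.append(arg)
        elif name == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif name == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif name == "DUP":
            stack.append(stack[-1])
        elif name == "HALT":
            break
    return stack

program = assemble("""
    PUSH 2
    PUSH 3
    ADD        ; 2 + 3
    DUP
    MUL        ; 5 * 5
    HALT
""")
print(run(program))   # [25]
```

Adding an op is then one new entry in the table plus one new branch in the loop, which is the whole point of having the pipeline up early.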

(keep in mind all my code is toy stuff..)


Well, since we are talking about P-code and all...

I recommend you look at Oberon. I've avoided the Wirth family of languages my whole life, but have been messing with Oberon off and on since winter, because it's interesting from a security standpoint. (It was supposed to be my fun Advent hacking project, but work changes and living changes—i.e., moving—caused interference.) Project Oberon is interesting because it involves a language, a system, and a machine, from the ground up, all created from scratch.[1]

Often, when trying to dive deep on some concept, the available literature can get you rolling with a toy (e.g., compilers), but it helps you reach only a facile understanding, and punts on everything around it. You'll be aware of this; I'm pretty sure it's what you're referring to in your comment above. That's mostly avoided with Oberon, because it's a full-fledged toolchain for quasi-real-world use—at least it was in production use at ETH Zurich.

There are some gotchas with Oberon. The first mostly comes down to a lot of vague, hypey comments written by people who haven't dived deep, and don't have the level of understanding that their comments suggest. There are numerous examples. I could write those up, but here's one: "It was all done without resorting to assembly anywhere." Then you go look into it, and that's because there's no assembler, and it's inline snippets of hex-encoded machine code and other binary blobs instead.

The second big gotcha is that Wirth & Co have produced volumes of (what looks like high-quality) literature, but a bunch of it is either out of date, only superficially helpful, poorly written, or contains errors. For example, "Oberon" refers to so many things—including systems and languages that Wirth had nothing to do with and probably should have never been allowed to bear the name—that it makes jwz's old Java rant[2] seem quaint. (Try starting out at the Wikipedia page and making sense of Oberon's evolution or mapping out the family tree, then try referring to primary and secondary sources directly that might clear things up. Good luck.)

I began with a fresh notebook for taking notes and keeping track of errata in the published stuff. I quit keeping track of errata after two days and several chapters, because it was too much. If you're interested, I highly recommend just running a system image from Peter De Wachter's Norebo[3] and using his emulator for Wirth's RISC machine[4]. Familiarize yourself with the basics of how to use Oberon-the-system by playing with it for an hour or so, then crack open the source and just study it directly. Cross-reference Wirth's publications if you want (they're all online), but assume that they're lying about something. I can also share my notes. Stay away from the mailing list; it's populated by USENET-style cranks, and it isn't really an essential component of Oberon development. There's not really a community: Wirth pretty much does his own thing, never posts there, and just does a source dump through his personal website.

Having said all that, studying Oberon won't impart all the knowledge you're looking for. It's in this weird place where it's more than a toy, but it really doesn't directly resemble any of the real-world systems that you're actually interested in. (Which most likely means UNIX; let's just be honest.) But it's probably the kind of stepping stone you need.

So the best resources on ELF I know of are the articles written by Eric Youngdale for Linux Journal[5][6], from back in the 90s when vendors were adopting ELF for the first time and he wrote the Linux implementation. I believe this to be the highest quality treatment of the subject that exists (at least as of a few years ago when I was interested in studying this kind of thing).
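
If it helps to have something concrete next to those articles, here is a small sketch that decodes the first few ELF header fields with Python's struct module. The function name and the returned dict are my own invention; the offsets follow the standard ELF header layout (a 16-byte e_ident, then e_type, e_machine, e_version, e_entry). The sample bytes are hand-built for the demo, not taken from a real binary.

```python
import struct

def parse_elf_header(data):
    """Decode the leading ELF header fields from raw bytes (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]          # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = {1: "<", 2: ">"}[data[5]]      # EI_DATA: 1 = little, 2 = big endian
    # e_type, e_machine, e_version sit right after the 16-byte e_ident.
    e_type, e_machine, e_version = struct.unpack_from(endian + "HHI", data, 16)
    # e_entry is address-sized: 4 bytes for ELF32, 8 bytes for ELF64.
    entry_fmt = "I" if bits == 32 else "Q"
    (e_entry,) = struct.unpack_from(endian + entry_fmt, data, 24)
    return {
        "bits": bits,
        "little_endian": endian == "<",
        "type": e_type,        # e.g. 1 = relocatable, 2 = executable, 3 = shared object
        "machine": e_machine,  # e.g. 3 = x86, 62 = x86-64
        "version": e_version,
        "entry": e_entry,
    }

# Hand-built sample: 64-bit, little-endian, executable, x86-64, made-up entry point.
sample = (
    b"\x7fELF"
    + bytes([2, 1, 1, 0])      # class, data encoding, ident version, OS ABI
    + bytes(8)                 # padding up to 16 bytes of e_ident
    + struct.pack("<HHIQ", 2, 62, 1, 0x401000)
)
info = parse_elf_header(sample)
print(info["bits"], info["type"], info["machine"], hex(info["entry"]))
```

On a Linux box you could feed it the first 32 bytes of any binary instead of the sample and compare against what `readelf -h` reports.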

Hope it helps.

1. https://issuu.com/xcelljournal/docs/xcell_journal_issue_91/3...

2. https://www.jwz.org/doc/java.html

3. https://github.com/pdewacht/project-norebo

4. https://github.com/pdewacht/oberon-risc-emu

5. http://www.linuxjournal.com/article/1059

6. http://www.linuxjournal.com/article/1060

EDIT: I forgot the P-code tie-in! P-code is tangentially related to Oberon because it was developed to port/run (a dialect of) Pascal, one of Oberon's predecessors and Wirth's main claim to fame. Here's a sort-of P-code interpreter for Oberon—it actually runs a (fairly capable) subset of the RISC ISA that Oberon proper targets:

https://www.inf.ethz.ch/personal/wirth/CompilerConstruction/...

(Yes, that's the entire implementation. You'll need a compiler for it though; that can be found in the parent directory. To see what a more "fortified" implementation looks like, written in C, look at Peter De Wachter's emulator, already mentioned above.)


This may not be authoritative enough, but I found John Levine's "Linkers and Loaders" to be a very informative guide to the general concepts of linking, including the various architectures like ELF and COM.

http://linker.iecc.com/

Edit: not that they're conventionally called "architectures" or that COM has any structure to it.


Thanks a lot for that extended meditation! I love learning about the history of things, even when it isn't directly useful to my immediate problem.


Don't link to jwz.org from HN :)



