To my understanding, asm.js is a restricted subset of JS that allows optimisations that would not otherwise be possible, in a similar way to how Java bytecode can be executed more efficiently than a non-compiled language like Python.
With that in mind, sure, you could target asm.js with a high-level language that requires manual memory management, but why couldn't you also target it with a language that assumes GC? JS engines have this GC component built in already. Essentially, I don't see how it is different from Java + Java bytecode + the JVM, which does perform GC.
asm.js is a subset of JavaScript, and to be easy to optimize, it removes most of the dynamic stuff from JS and leaves a simple, low-level dialect that is basically equivalent to LLVM IR or to C.
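To make "simple, low-level dialect" a bit more concrete, here is a rough sketch of what an asm.js-style module looks like. The names `AsmModule`, `sum`, `buffer` and `view` are made up for illustration. The `| 0` coercions act as the integer type annotations, and all data lives in one flat typed-array heap, much like memory in C.

```javascript
// Hypothetical asm.js-style module: plain JavaScript, but written in the
// restricted, statically typed style that an engine can compile ahead of time.
function AsmModule(stdlib, foreign, heap) {
  "use asm";
  var HEAP32 = new stdlib.Int32Array(heap);

  // Sum the first n 32-bit integers stored at the start of the heap.
  function sum(n) {
    n = n | 0;                       // parameter is declared as an int
    var i = 0, total = 0;            // locals are ints as well
    for (; (i | 0) < (n | 0); i = (i + 1) | 0) {
      // (i << 2) is the byte offset; >> 2 turns it back into an Int32 index.
      total = (total + (HEAP32[(i << 2) >> 2] | 0)) | 0;
    }
    return total | 0;
  }

  return { sum: sum };
}

// The "memory" of the compiled program is a single ArrayBuffer.
var buffer = new ArrayBuffer(0x10000);
var view = new Int32Array(buffer);
view[0] = 1;
view[1] = 2;
view[2] = 3;

var mod = AsmModule(globalThis, {}, buffer);
var result = mod.sum(3);  // 1 + 2 + 3
```

Because it is still ordinary JavaScript, this runs correctly in any engine, whether or not that engine does anything special with the `"use asm"` directive.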
You can compile many things to C and LLVM IR, like C++, C#, and so forth. You can also compile the VMs of dynamic languages like Python, Lua and Ruby, but compiling programs written in those languages directly would be inefficient - you'd need type checks all over the place. You need a custom VM to be fast on those.
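As a rough illustration of "type checks all over the place", compare an addition where the types are known at compile time with one where every value carries a runtime tag. This is only a sketch: the tagged-value representation and the names `addInts` and `dynAdd` are invented for the example and do not correspond to any real VM.

```javascript
// Static case: the compiler knows both operands are ints,
// so the whole operation is a single integer add.
function addInts(a, b) {
  return (a + b) | 0;
}

// Dynamic case: every value is a { tag, value } pair (a made-up
// representation), and each operation must inspect the tags first.
function dynAdd(a, b) {
  if (a.tag === "int" && b.tag === "int") {
    return { tag: "int", value: (a.value + b.value) | 0 };
  }
  var aIsNum = a.tag === "int" || a.tag === "float";
  var bIsNum = b.tag === "int" || b.tag === "float";
  if (aIsNum && bIsNum) {
    return { tag: "float", value: a.value + b.value };
  }
  if (a.tag === "str" && b.tag === "str") {
    return { tag: "str", value: a.value + b.value };
  }
  throw new TypeError("unsupported operand types: " + a.tag + " + " + b.tag);
}

var staticResult = addInts(2, 3);
var dynamicResult = dynAdd({ tag: "int", value: 2 }, { tag: "int", value: 3 });
```

A naive direct compilation would have to emit something like `dynAdd` for every `+` in the program, which is why a purpose-built VM that can specialise on observed types tends to do better.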
asm.js can run Lua at close to the speed of the normal Lua VM running natively, so this approach is very feasible. For JavaScript itself, though, it probably doesn't make sense: the current VMs are already the best that can be done.
However, a subset of TypeScript - without classes, without weird prototype things, just arrays and numbers and computation on those - could be compiled to asm.js. That might be an interesting project to try out.
OK, so are you saying that compiling the language's runtime environment, including the GC, to asm.js would not be efficient enough? That makes sense.
I guess I was hoping that there may be some way to compile the runtime such that the GC wouldn't need to be compiled, and the GC of whatever is interpreting the asm.js code (e.g. SpiderMonkey) could be used instead.
That's roughly what happens when TypeScript gets compiled to JavaScript: the output can use the browser's GC. asm.js is used to avoid the browser's GC, either when the original code doesn't need a GC because it uses manual memory management, or when it uses a different type of GC from the browser's and wants to keep using that instead (which is effectively the same thing).
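To sketch what "avoiding the browser's GC" means in practice: the compiled program keeps its data in one big typed array and hands out addresses itself, so the browser's collector only ever sees a single buffer. The bump allocator below is a deliberately minimal illustration (no `free`), and all names in it (`HEAP_SIZE`, `malloc`, `makePoint`, and so on) are assumptions for the example rather than anything asm.js defines.

```javascript
// One flat block of memory, as seen by the compiled program.
const HEAP_SIZE = 0x10000;
const heapBuffer = new ArrayBuffer(HEAP_SIZE);
const HEAPF64 = new Float64Array(heapBuffer);

// Address 0 is reserved so it can act as a null pointer.
let nextFree = 8;

// Minimal bump allocator: returns a byte address, never frees.
function malloc(bytes) {
  const aligned = (bytes + 7) & ~7;        // round up to 8-byte alignment
  if (nextFree + aligned > HEAP_SIZE) {
    throw new Error("out of memory");
  }
  const ptr = nextFree;
  nextFree += aligned;
  return ptr;
}

// A "struct" with two doubles is just 16 bytes in the heap;
// the pointer to it is an ordinary number, not a JS object.
function makePoint(x, y) {
  const p = malloc(16);
  HEAPF64[p >> 3] = x;          // field at offset 0
  HEAPF64[(p + 8) >> 3] = y;    // field at offset 8
  return p;
}

function pointX(p) { return HEAPF64[p >> 3]; }
function pointY(p) { return HEAPF64[(p + 8) >> 3]; }

const p1 = makePoint(1.5, 2.5);
const p2 = makePoint(3, 4);
```

Nothing here creates a JS object per point, so there is nothing for the browser's GC to trace or collect. A language runtime with its own GC would implement that collector on top of the same flat heap.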
The default Python implementation (CPython) compiles to bytecode before interpreting (the default C implementation of Ruby also does this as of 1.9; before that it interpreted the source directly). The difference between compiling to bytecode ahead of time and at run time is more of a packaging difference and probably not the biggest cause of the speed gap between the languages, although I suppose the Python import mechanism doing strange stuff and hitting the filesystem too often could slow things down. The reason the most common Java implementation (Oracle HotSpot/OpenJDK) is fairly fast (there are other fast implementations) is that it includes a fast interpreter written in assembly (although a C++ fallback exists for non-x86 ports) and a good just-in-time compiler. It also has a much more advanced GC than Python. Another thing that makes a difference is that Python has a global interpreter lock that allows only one Python thread to run at a time, regardless of the number of CPU cores or native threads available (C extensions can run several threads of their own). The lock is there because making Python's refcounting GC safe for threads running truly in parallel would significantly hurt single-threaded performance.
Thanks for the explanation. I am fairly familiar with the Java internals, but I haven't used Python much, and was just using it as an example. I didn't realise that it was compiled to bytecode before interpreting. I suppose it doesn't make sense for any production-ready language not to have at least a JIT compilation system, now that I actually think about it.
This is just a temporary limitation of asm.js because it has no access to garbage-collected data. In the future it should have support for managed languages.
> Q. Can asm.js serve as a VM for managed languages, like the JVM or CLR?
A. Right now, asm.js has no direct access to garbage-collected data; an asm.js program can only interact indirectly with external data via numeric handles. In future versions we intend to introduce garbage collection and structured data based on the ES6 structured binary data API, which will make asm.js an even better target for managed languages. [1]
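For what "interact indirectly with external data via numeric handles" can look like, here is a small sketch. The handle table and the function names (`registerObject`, `lookupObject`, `releaseObject`, `compiledFill`) are hypothetical, not part of asm.js. The idea is that ordinary JS keeps the real, garbage-collected objects in a table, and the compiled code only ever sees integer indices into it.

```javascript
// Lives in ordinary JS, outside the compiled code. Index 0 is reserved
// so that 0 can mean "no object".
const handleTable = [null];

function registerObject(obj) {
  handleTable.push(obj);
  return handleTable.length - 1;   // the handle is just an integer
}

function lookupObject(handle) {
  const obj = handleTable[handle];
  if (obj === null || obj === undefined) {
    throw new Error("invalid handle: " + handle);
  }
  return obj;
}

// Handles must be released explicitly; while an object is in the
// table, the browser's GC cannot collect it.
function releaseObject(handle) {
  handleTable[handle] = null;
}

// Functions the compiled code is allowed to call. They take and
// return only numbers.
const foreignFunctions = {
  pushNumber: function (handle, n) {
    lookupObject(handle).push(n);
  }
};

// Stand-in for compiled code: it never touches the array itself,
// only the integer handle that refers to it.
function compiledFill(listHandle, count) {
  listHandle = listHandle | 0;
  count = count | 0;
  for (let i = 0; i < count; i = (i + 1) | 0) {
    foreignFunctions.pushNumber(listHandle, i * i);
  }
}

const squares = [];
const squaresHandle = registerObject(squares);
compiledFill(squaresHandle, 4);   // squares now holds 0, 1, 4, 9
releaseObject(squaresHandle);
```

The explicit `releaseObject` call is the awkward part: the compiled program has to manage the lifetime of every handle by hand, which is presumably what direct access to garbage-collected data would remove.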
I'm very curious about this.