
In the implementation of Ruby that I work on, TruffleRuby, we've been exploring lazy parsing, where the parser finds a method but does not fully parse it until the method is called for the first time. I wonder if there are any other modifications you could make to the VM itself to improve startup time.
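
To make the idea concrete, here is a rough Ruby-level sketch of deferring work until first call. It is only an illustration and not how TruffleRuby does it internally: the `LazyMethods` module, `lazy_def`, and `compiled_lazily` are hypothetical names, and the method source is kept as a plain string that is handed to `class_eval` the first time the stub is invoked.

```ruby
# Hypothetical helper: keeps a method's source as text and only compiles it
# (via class_eval) the first time the method is called.
module LazyMethods
  def lazy_def(name, source)
    owner = self
    # Install a cheap stub under the method's name.
    define_method(name) do |*args, &block|
      owner.compiled_lazily << name
      # Compile the real definition; this replaces the stub on the owner.
      owner.class_eval(source)
      # Re-dispatch to the freshly defined method.
      send(name, *args, &block)
    end
  end

  # Records which methods have actually been compiled so far.
  def compiled_lazily
    @compiled_lazily ||= []
  end
end

class Greeter
  extend LazyMethods

  # Quoted heredoc so the interpolation stays as source text.
  lazy_def :greet, <<~'RUBY'
    def greet(name)
      "Hello, #{name}!"
    end
  RUBY
end
```

Nothing is compiled when `Greeter` is loaded; the first `Greeter.new.greet("Ada")` compiles the body, and later calls go straight to the real method.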


The particularly frustrating thing, when I've started thinking about optimizing boot time at a VM level, is that it's near-impossible to "understand" what loading a file actually does, since it's all just evaluated in a single namespace.

It would be great if we somehow had a way to load a module-as-file without unknown side-effects, and without depending so deeply on the other contents of the global namespace.

But this is basically describing a complete overhaul of most of what makes ruby ruby, so... ¯\_(ツ)_/¯


Yes, if there were special Ruby source files that had only classes and modules at the top level, and only defined nested classes, modules, and methods in those, then it would be a lot quicker to load things.


Yep. But even then, what if:

    class A
      B = "c".freeze
    end
And elsewhere:

    class String
      def freeze
        raise "because I can, that's why"
      end

      # or even method_added, TracePoint, ...
    end
It feels like something should be possible here, but it's a really steep uphill battle.
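
The `method_added` case is easy to demonstrate: `method_added` is a real Ruby hook, so even a plain `def` in a class body can run arbitrary user code at load time. A small sketch, where the `Tracked` class and its `added` reader are made up for illustration:

```ruby
class Tracked
  # Ruby calls this hook every time an instance method is defined on the class.
  def self.method_added(name)
    (@added ||= []) << name
    super
  end

  # Hypothetical reader for the names the hook has seen.
  def self.added
    @added || []
  end

  def foo; end
  def bar; end
end
```

After loading this, `Tracked.added` is `[:foo, :bar]`, so "just defining methods" is already observable behaviour.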


Right - that's why I said nothing but methods and nested classes - expressions in method bodies would be disallowed. And tracing and method added hooks and so on, yes.

You could say it was a separate language .rb-module or something, to make it formal.


> Right - that's why I said nothing but methods and nested classes - expressions in method bodies would be disallowed.

I presume you mean "expressions that aren't method definitions in class bodies" (though that's a problem, because of things like attribute declarations) rather than "expressions in method bodies", since methods with no expressions in their bodies would be pointless.


Oh, huh, yeah, that makes sense. That would totally work. Cool idea.


I'm also planning to add lazy parsing to perl. Do you store the whole string of the method body or do you mmap your source files, and store only the mmap ptr, offset and length for the body?
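
A sketch of the second option, written in Ruby since that is what the thread is about. Ruby has no built-in mmap, so `File.read` with a length and offset stands in for it here; the `LazyBody` struct is a hypothetical name, and the example writes its own temporary source file so it can run standalone.

```ruby
require "tempfile"

# Hypothetical record: where a method body lives in its source file.
# Only the path, byte offset, and byte length are kept, not the text.
LazyBody = Struct.new(:path, :offset, :length) do
  # Fetch the body text from disk only when it is actually needed.
  def source
    File.read(path, length, offset)
  end
end

src = "def add(a, b)\n  a + b\nend\n"
file = Tempfile.new(["example", ".rb"])
file.write(src)
file.flush

body_text = "  a + b\n"
# Work on binary copies so the index is a byte offset, not a character index.
offset = src.b.index(body_text.b)
body = LazyBody.new(file.path, offset, body_text.bytesize)
```

`body.source` re-reads just those bytes, so the per-method cost before the first call is three small fields.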


How difficult is lazy parsing for Ruby? How much parsing do you need to do just to find where the method body ends?


You basically have to do all the parsing, but you can delay creating the actual AST and other data structures like byte code, which for us is the really expensive bit.


Makes sense, thanks. So you end up redoing most of this (parsing) work later when you want the AST, but a) you might not have to redo it at all if the method's never actually called, and b) it's not the expensive bit anyway?


Yes.

But if you have source code that you know you will likely be requiring, such as the standard library, you can do the initial parsing while compiling the Ruby VM, so you don't end up doing the parse twice at runtime.

Long term what we hope to do is to provide a build of the Ruby VM that includes the version of Rails you are using pre-parsed.

And then longer term we'd like to actually fully parse and initialise Rails (run the top level of the files which are loaded) during compilation, and freeze the heap and store it in the Ruby VM executable. When you run this special Ruby/Rails VM the Rails code is simply mmapped into your address space with all objects initialised and ready to go.
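
As a very loose analogy in plain Ruby (this is not the mechanism described above, just the same shape at a much smaller scale): do the expensive initialisation once at "build time", snapshot the resulting objects, and restore them at "boot time". The sketch uses `Marshal`, which handles plain data only and cannot capture procs or a whole heap; `build_image`, `boot_from_image`, and the routes hash are hypothetical stand-ins.

```ruby
require "tempfile"

# "Build time": run initialisation once and snapshot the resulting objects.
def build_image(path)
  # Stand-in for state that would normally be computed while loading a framework.
  routes = { "/" => "home#index", "/users" => "users#index" }
  File.binwrite(path, Marshal.dump(routes))
end

# "Boot time": restore the snapshot instead of re-running initialisation.
def boot_from_image(path)
  Marshal.load(File.binread(path))
end

image = Tempfile.new("image")
build_image(image.path)
state = boot_from_image(image.path)
```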

Obviously it'll require some tweaking to delay doing things like starting the web server, so that they don't get run at compile time.


We've talked about trying to implement this strategy (load everything, dump/restore the heap) with MRI, but our thought experiment was on the scale of a fully-booted application. In that context, it gets difficult to determine how to proceed when an application source file has changed, since re-loading isn't safe in ruby.

It's a really interesting idea to pre-load the heap with just a set of libraries, which wouldn't be subject to as much change.



