
Pointing at a big company and saying "they're doing it wrong" is easy enough to do, but you have to remember that every decision comes with tradeoffs. Take Google's codebase, since it's the one I know the best. A couple of the key decisions:

* Single rooted tree. Separate repositories would make it harder to share code, leading to more duplication.

* Build from head. We build everything from source, statically linked. No need to worry about multiple versions of dependencies, and no lag between a bug fix landing and it being available to any and all binaries that need it, whenever they're next rebuilt.

I don't think that an "internal Github" is going to be a magic bullet here. It's more likely it would be a matter of trading one set of hard problems for another, as we all of a sudden need to figure out how to do cross-repository dependencies sanely, deal with multiple versions of libraries, etc., at scale. You are correct that one monolithic Perforce repo is a bit of a pain point, but that doesn't necessarily mean that the right decision is to shatter our codebase into different pieces - we'd rather make our repo scale better. For reference, we've already got hundreds of millions of lines of code, 20+ changes/minute 6 months ago (so what, 30+ now?), and plans for scaling the next 10x are in motion.
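To make "multiple versions of libraries" concrete, here's a toy single-file sketch (all names invented, with the two versions simulated by namespaces so it compiles as one file) of the diamond-dependency problem that building everything from head at a single version sidesteps:

    #include <cstdio>

    // Hypothetical scenario: repo A pins logging v1, repo B pins logging v2.
    // A binary that statically links code from both repos would get two
    // layout-incompatible definitions of LogRecord - an ODR violation the
    // linker may resolve silently and wrongly.
    namespace logging_v1 {
    struct LogRecord { int level; };
    int Severity(const LogRecord& r) { return r.level; }
    }

    namespace logging_v2 {
    struct LogRecord { int level; int tag; };  // v2 added a field
    int Severity(const LogRecord& r) { return r.level; }
    }

    int main() {
      logging_v1::LogRecord a{1};
      logging_v2::LogRecord b{2, 42};
      // Build from head means there is exactly one LogRecord in the tree,
      // so this conflict can't arise in the first place.
      std::printf("v1 severity=%d, v2 severity=%d\n",
                  logging_v1::Severity(a), logging_v2::Severity(b));
      return 0;
    }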

If you're interested, I recommend http://google-engtools.blogspot.com/. It details a number of the problems we've run into, and our solutions for dealing with them at scale.



Single rooted tree. Separate repositories would make it harder to share code, leading to more duplication.

I'm not convinced that the choice between a singly rooted tree and a multiple-rooted tree is going to make that much difference. I mean, think about it... if you have 100k's or even millions of files, is anybody going to parse through all of that, looking for a reusable function, even if it is on their workstation?

And sure, a compiled language would catch naming collisions on functions or whatever, but nothing stops somebody from creating a method

doQuickSort( ... )

and somebody else creating

quickSortFoo(...)

where they are semantically equivalent (or very nearly so).

It seems to me that the problem of duplicating code, because you don't know that a method already exists to do what you're trying to do, is the same problem regardless of how your tree is laid out; and is ultimately more of a documentation / process / discipline issue. But I'd be curious to hear the counter-argument to that...


is anybody going to parse through all of that?

Yes, in fact. We have some great tools that give us full search over our entire codebase (think Google Code Search), and you can add a dependency on a piece of code without needing to have it on your workstation already. The magic filesystem our build tools use knows where to get it and can do so on demand. Combined with good code location conventions, an overall attitude that promotes reuse over rewrites* and mandatory code reviews where someone can suggest a better approach, we do a pretty good job. Not everything is eliminated, of course, but I'm pretty happy with the state of things.

To your example, we'd use the STL for most of our sorting needs, but if you were to want, say, case-insensitive string sorting, I can tell you where to find it (ASCII, UTF8, or other). If you want a random number, any RNG you could want is available. Most data structures you could name have been written and tested already. Libraries for controlling how your binaries dump core, how command line flags are parsed, how callbacks are invoked, etc. are readily available. We really do reuse code as much as possible, and it's wonderful to have ready access to all of this whenever you need it.

*At a method level, anyways...we're famous for writing ever more file systems ;).
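To make the case-insensitive sorting example concrete, here's a minimal ASCII-only sketch of what such a comparator looks like (my own illustration, not the internal library; real UTF8 handling needs proper collation, e.g. via ICU):

    #include <algorithm>
    #include <cctype>
    #include <string>
    #include <vector>

    // ASCII-only case-insensitive "less than", usable with std::sort.
    bool LessIgnoreCase(const std::string& a, const std::string& b) {
      return std::lexicographical_compare(
          a.begin(), a.end(), b.begin(), b.end(),
          [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
          });
    }

    int main() {
      std::vector<std::string> v = {"Banana", "apple", "Cherry"};
      std::sort(v.begin(), v.end(), LessIgnoreCase);
      // v is now {"apple", "Banana", "Cherry"}
      return 0;
    }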


Reply to mindcrime's sibling post:

I've seen that before myself at other companies, and it's a shame. A healthy codebase is an investment in the future - if you're not taking the time to cultivate it, you're sacrificing long-term usability for short-term gains. The larger the codebase, the more difficult the task, of course, but for us that's just an excuse to solve the next hard problem :).

One more good link on the topic: our use of Clang to find and fix bugs in our existing codebase, as we find new classes of 'gotchas'. http://google-engtools.blogspot.com/2011/05/c-at-google-here...
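To give a flavor of those bug classes, here's a toy example of my own (not necessarily one from the post) of the kind of mechanical mistake a Clang diagnostic can flag across a large codebase:

    #include <string>

    // Returning a reference to a local: the object dies when the function
    // returns, so any use of the result is undefined behavior. Clang flags
    // this pattern at compile time via -Wreturn-stack-address.
    const std::string& Greeting() {
      std::string s = "hello";
      return s;  // warning: reference to stack memory associated with
                 // local variable 's' returned
    }

    int main() {
      (void)&Greeting;  // not called; the point is the compile-time warning
      return 0;
    }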


Awesome, glad to hear you guys take reuse so seriously. I'm a little surprised, only because - in my experience - so few organizations put in the effort that you guys do.


Knowing that somebody is making the effort to get this sort of thing right is really, unspeakably awesome. Though I guess not literally unspeakable, given that I'm posting about it. This sort of thing is one of the main reasons I read HN.

Hopefully the methodology will filter out into the wider world one day... Anyway, thank you for posting it!




