It also depends very much on the geometric complexity of your scene. With hundreds of millions of polygons, it's not difficult for raytracing to outperform rasterization, especially if most of those polygons are instanced.
Indeed, with hundreds of millions of polygons, a rasterisation method will generally have to splat them all onto the screen one by one (minus some clever occlusion pre-processing). By contrast, a ray-tracer can put all the objects into an R-tree or kd-tree, efficiently search for only those objects that intersect the ray, and visit them in guaranteed order of distance from the camera.
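To make the tree-search idea concrete, here is a rough Python sketch. It uses a bounding volume hierarchy (a close cousin of the R-tree) rather than a kd-tree, and spheres instead of polygons to keep the intersection code short. All names (`Sphere`, `Node`, `build`, `closest_hit`) are made up for illustration, the ray direction is assumed to be unit length, and a real tracer would be far more optimised than this.

```python
import heapq
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]

@dataclass
class Sphere:
    center: Vec
    radius: float

@dataclass
class Node:
    lo: Vec
    hi: Vec
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    spheres: Optional[List[Sphere]] = None  # set only on leaf nodes

def build(spheres: List[Sphere], leaf_size: int = 2) -> Node:
    """Median-split BVH: bound the spheres, then split along the widest axis."""
    lo = tuple(min(s.center[i] - s.radius for s in spheres) for i in range(3))
    hi = tuple(max(s.center[i] + s.radius for s in spheres) for i in range(3))
    if len(spheres) <= leaf_size:
        return Node(lo, hi, spheres=spheres)
    axis = max(range(3), key=lambda i: hi[i] - lo[i])
    ordered = sorted(spheres, key=lambda s: s.center[axis])
    mid = len(ordered) // 2
    return Node(lo, hi, build(ordered[:mid], leaf_size), build(ordered[mid:], leaf_size))

def box_entry(node: Node, origin: Vec, direction: Vec) -> Optional[float]:
    """Slab test: distance at which the ray enters the box, or None on a miss."""
    t0, t1 = 0.0, math.inf
    for i in range(3):
        if direction[i] == 0.0:
            if not (node.lo[i] <= origin[i] <= node.hi[i]):
                return None
            continue
        a = (node.lo[i] - origin[i]) / direction[i]
        b = (node.hi[i] - origin[i]) / direction[i]
        if a > b:
            a, b = b, a
        t0, t1 = max(t0, a), min(t1, b)
        if t0 > t1:
            return None
    return t0

def sphere_hit(s: Sphere, origin: Vec, direction: Vec) -> Optional[float]:
    """Nearest non-negative hit distance (unit-length direction), or None."""
    oc = tuple(origin[i] - s.center[i] for i in range(3))
    b = sum(oc[i] * direction[i] for i in range(3))
    c = sum(x * x for x in oc) - s.radius ** 2
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t >= 0.0:
            return t
    return None

def closest_hit(root: Node, origin: Vec, direction: Vec):
    """Visit boxes nearest-first; stop once the next box lies behind the best hit."""
    best_t, best, tested = math.inf, None, 0
    entry = box_entry(root, origin, direction)
    heap = [] if entry is None else [(entry, 0, root)]
    counter = 1  # unique tie-breaker so the heap never compares Node objects
    while heap:
        t_box, _, node = heapq.heappop(heap)
        if t_box > best_t:
            break
        if node.spheres is not None:
            for s in node.spheres:
                tested += 1
                t = sphere_hit(s, origin, direction)
                if t is not None and t < best_t:
                    best_t, best = t, s
        else:
            for child in (node.left, node.right):
                t = box_entry(child, origin, direction)
                if t is not None:
                    heapq.heappush(heap, (t, counter, child))
                    counter += 1
    return best, best_t, tested

# Hypothetical scene: a 32 x 32 grid of unit spheres, one ray fired straight at it.
spheres = [Sphere((3.0 * i, 3.0 * j, 5.0), 1.0) for i in range(32) for j in range(32)]
hit, t, tested = closest_hit(build(spheres), (30.0, 30.0, -10.0), (0.0, 0.0, 1.0))
print(hit.center, t, f"{tested} of {len(spheres)} spheres tested")
```

In this toy scene the ray should only ever test a handful of the 1024 spheres, which is the whole argument: the cost per ray grows with tree depth, not with the total primitive count. Building the tree is not free, of course.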
Yes, but on the other hand rasterization is implemented in hardware on GPUs, which gives it a great performance advantage. Also, sorting before rasterizing allows most triangles to be discarded before any fragments are created. Besides, building and traversing an acceleration structure definitely does not make ray tracing free. Traversal in particular is not exactly cache-friendly. Also, rays that pass close to geometry without hitting it will descend rather deep into the tree and then backtrack, which is rather costly. Another advantage of rasterization is that it is data-parallel at the vertex level. The disadvantage is that it is far less flexible in what you can render compared to ray tracing; it's only really practical for camera rays.
I'm a bit sceptical of this claim. You can also build acceleration structures for rasterized polygons: create hierarchical level-of-detail representations of your scene and render whichever LOD is necessary. This considerably reduces the number of polygons that have to be rendered. The claim that raytracing is faster for tens of millions of polygons due to acceleration structures always seems to miss the point that acceleration structures can also be applied to rasterization.
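As a rough illustration of the LOD point, here is a minimal Python sketch of picking a detail level per object by projected screen-space error. It shows flat per-object LOD rather than a full hierarchical scheme, and the names (`Lod`, `select_lod`), the 60 degree field of view, the 1080-pixel screen height and the one-pixel error budget are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass
from typing import List

@dataclass
class Lod:
    triangles: int
    error: float  # max deviation from the full-detail mesh, in world units

def projected_error_px(error: float, distance: float,
                       fov_y: float, screen_height: int) -> float:
    """Approximate on-screen size in pixels of a world-space error at a distance."""
    return error * screen_height / (2.0 * distance * math.tan(fov_y / 2.0))

def select_lod(lods: List[Lod], distance: float,
               fov_y: float = math.radians(60.0),
               screen_height: int = 1080, max_px: float = 1.0) -> Lod:
    """Return the coarsest level whose projected error stays within max_px.

    `lods` is ordered from finest (index 0) to coarsest.
    """
    for lod in reversed(lods):
        if projected_error_px(lod.error, distance, fov_y, screen_height) <= max_px:
            return lod
    return lods[0]  # nothing is coarse enough: fall back to full detail

# Hypothetical object with four levels, viewed from three distances.
lods = [Lod(1_000_000, 0.0), Lod(100_000, 0.01), Lod(10_000, 0.1), Lod(1_000, 1.0)]
for d in (1.0, 100.0, 10_000.0):
    print(d, select_lod(lods, d).triangles)
```

The triangle count submitted to the rasterizer then depends on how much of the scene is close to the camera, not on how many polygons the full-detail scene contains.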