Sorry for the novel here. My team is nearing the end of post on our current film, and A:M's renderer has been on my mind a lot recently. I'd like to frame these notes as constructively as possible.
Since it's gotten so long, I'll throw in some Big Headings for readability--and in hopes of making myself feel really important.
All of my films and many of my illustrations have featured A:M in one form or another. It's my utility knife app. I continue to find it an extremely powerful, user-friendly full stack modeling/rendering/animation package at an admirable price.
It's beginning to show its age though. A 3D package is only as good as its output options. Moore's Law is dead. We're not going back to ever-faster single core CPUs. Absent third-party renderer support, everything we do in A:M has to go through the bottleneck of a raytracer that can only utilize one processor core. When Netrender was included with the base package (in v16, six years ago) I took it as a stopgap solution and admission that the single-thread renderer was becoming a problem. Looking at the v19 roadmap, I still see nothing about multicore rendering. According to Activity Monitor, raytracing in A:M launches one additional thread, with a negligible increase in memory usage. My current production machine, a laptop, has 8 cores.
Netrender, with respect, was never intended to be an artist's tool--which A:M always has been. It's a TD tool for idle overnight labs and rackservers: overly complex, crash-prone, difficult to use, and less convenient than the multiple applications of the Playmation era. I personally have never even gotten it to work.
Third party renderer support is attractive in some ways, but aside from introducing many of the same pain points as Netrender, it would mean writing translation code for all of A:M's output options--not only Hash patches, but procedural textures, hair, volumetrics, soft reflections, IBL, ambient occlusion, etc. etc. etc. As much as I'd love to see COLLADA import/export eventually, we're talking about a very large undertaking, better implemented piecemeal over a longer period of time.
Keeping it Realistic
It's been my observation that an A:M window at render is basically a "state" machine--anything that changes between frames or render passes is applied to the world state and then read back into the renderer. (You'll see soft lights literally shift position between antialiasing passes, for instance.) As such, simultaneous rendering of different frames/passes becomes a slow, crash-prone and inefficient process of repeatedly loading multiple, near-duplicate instances of the same scene.
More realistic would be to render a single pass at a time as now, but split it into tiles (say 128x128). A separate rendering thread is spun up for each 128x128 square and fed into a queuing/load-balancing library like OpenMP. The world state is maintained by the main thread. The queue completes (with an acceptable bottleneck at the last thread rendering), the tiles are assembled in a new thread while the main thread is allowed to advance, and a new queue is spun up for the next render pass.
I notice that some great work has already been done in v18 at handing off render pass reassembly and other post effects to the GPU. (Today even a netbook's integrated graphics will composite faster than a single CPU thread in most cases.) Pass compositing, post effects, and file I/O should be able to manage without reference to the main thread, so need not hold up work on the next pass/frame.
Digging through the forums, I see that a multithreaded renderer was attempted in v14--splitting the image into horizontal strips in that case--but was abandoned because it produced render artifacts in complex scenes. I can only go by my own war stories to guess what might have caused them; many of you will remember the memory-limited old days when illustration-quality renders required "scanning" across the image with a zoomed in camera or obstructing different parts of the scene sequentially, rendering multiple frames, and reassembling the pieces in Photoshop. Post effects like glow and lens flares were obvious snags with these methods--but again, post effects are better handled on the fully reassembled image, preferably in a separate thread. Single-pass antialiasing, especially at the frame edges, could occasionally be a troublemaker, but overlapping the render tile edges a few pixels fixed this. Single-pass motion blur and depth of field rarely produced good results; the time savings of saturating more than one core (even on a low end system) would to my mind argue for their potential retirement if need be. Phong soft shadows were always a little finicky--they saved me on "Marboxian" with a 500Mhz PowerPC, but I'm not convinced retiring them would cause much pain today. I have very little experience with the toon renderer, so I can't even attempt to comment on how well it might parallelize.
2001 Miles to Futureburg
With a multithreaded renderer, A:M's biggest bottleneck is ameliorated. The power available to users increases immediately, and begins to scale again with each hardware generation. Tile size and priority optimization can be tweaked in subsequent releases. More tasks that are suitable to be handed off to the GPU, like Perlin noise, can be experimented with in their own time. With all the work that's been done to date on control point weighting, COLLADA import/export of rigged, animatable polygon models becomes more and more realistic (bare bones and incomplete at first, but with plenty of time to improve). With effortless boolean animation, riggable resolution-independent curved surfaces, hassle-free texture mapping, particle/hair/cloth/flock/physics sim, volumetrics, image-based lighting, AO, radiosity, 32bit OpenEXR rendering and much more, A:M becomes an app indy filmakers and small shops can't afford not to plug into their pipelines.