
#384 Multithreading into modeling tasks

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2017-09-16
Created: 2017-09-13
Creator: Pete
Private: No

Hi.

I have been wondering what it would take to bring multithreading into certain cases in modeling. Generally, when you do 3D modeling, you do one thing at a time and there is really nothing for multiple threads to work on. However, there are cases where the software is doing quite a bit more than just repeating what the user just said.

One of these cases would be regenerating the rendering meshes of a boolean. Some time ago I created a boolean object that consisted of some 42 or 43 original objects, mostly primitives. At its final stages it began to take minutes to update the model after the smallest changes. That is of course because ALL the rendering meshes are regenerated from scratch and then combined recursively through the entire model. One thing that would help would of course be caching the rendering meshes. The other would be keeping the modeling hierarchy as tree-like as possible (as opposed to a chain of operations) and then running the regeneration in multiple threads, starting from the tips of the branches.
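
Just to make the branch-tip idea concrete, here is a minimal sketch (my own illustration, not existing AoI code) of how the regeneration could be spread over threads with Java's fork/join framework; BooleanNode, Mesh, regenerateMesh() and combine() are made-up names:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical stand-ins for the real boolean hierarchy and its mesh type.
interface Mesh {}
interface BooleanNode
{
    boolean isLeaf();
    BooleanNode getLeft();
    BooleanNode getRight();
    Mesh regenerateMesh();                // rebuild the mesh of a leaf object
    Mesh combine(Mesh left, Mesh right);  // apply the boolean operation to two child meshes
}

// Regenerates a whole boolean tree in parallel, starting from the tips of the branches.
class RegenerateTask extends RecursiveTask<Mesh>
{
    private final BooleanNode node;

    RegenerateTask(BooleanNode node)
    {
        this.node = node;
    }

    @Override
    protected Mesh compute()
    {
        if (node.isLeaf())
            return node.regenerateMesh();            // a branch tip: independent work

        RegenerateTask left = new RegenerateTask(node.getLeft());
        RegenerateTask right = new RegenerateTask(node.getRight());
        left.fork();                                 // one subtree on another thread
        Mesh rightMesh = right.compute();            // the other on this thread
        return node.combine(left.join(), rightMesh); // combine on the way back up
    }
}

// Usage: Mesh result = ForkJoinPool.commonPool().invoke(new RegenerateTask(root));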

Another, and much more straightforward, job would be creating, modifying and deleting large arrays/groups of objects. Making ten thousand copies of a tree leaf can take several minutes and (surprise!) deleting them takes basically just as long.

I don't know how far this idea could be taken... Could it be made to work, for instance, in subdividing triangles?

From a scripting point of view, it would be nice to have an interface to a multithreading engine that is so simple to use that you could just throw the job at it and let the engine work it out, something like

jobToDo(int howMany)
{
    for (thisMany = 0; thisMany < howMany; thisMany++)
    {
        // independent jobs
    }
}
new MultiTasker(jobToDo(10000));
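
For what it's worth, plain Java can already do something close to this with parallel streams. A minimal sketch, only as an illustration of the kind of interface I mean (nothing AoI-specific in it):

import java.util.stream.IntStream;

// A sketch of the "throw the job in and let the engine work it out" idea
// using only standard Java. Each iteration must be independent of the others.
public class MultiTaskerSketch
{
    public static void main(String[] args)
    {
        int howMany = 10000;

        // parallel() farms the iterations out to the common fork/join pool,
        // so all available cores get used.
        IntStream.range(0, howMany).parallel().forEach(i -> {
            // independent job number i goes here
        });
    }
}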

The recursive case of deep booleans would of course require a different approach...

Discussion

  • Peter Eastman

    Peter Eastman - 2017-09-14

    You mean like ThreadManager? :)

    In principle booleans could be multithreaded, though as you say it would require reformulating them as a tree. I'm curious why creating/deleting large groups of objects is slow. Are they live duplicates? If so, I can see why that might be expensive. Creating live duplicates should be very fast though. And I don't see any good reason for deleting to be slow. Do you have an example that demonstrates it?

    Subdividing meshes would be hard to parallelize. Creating the RenderingTriangles would be easy to parallelize, but I doubt that's a major bottleneck.

     
  • Peter Eastman

    Peter Eastman - 2017-09-14

    I created a simple triangle mesh, set its smoothing method to approximating, then brought up the Array tool and told it to make 1000 copies. That took about 1-2 seconds, whether or not I used live duplicates.

     
  • Pete

    Pete - 2017-09-14

    1000 copies

    How about 10 000? I think we are talking about problems of a different magnitude. I've got a starmap script somewhere that reads the data from a file and creates a "starmap" based on it. A few hundred of the biggest stars were individuals and the rest were live duplicates of about 10 or 20 templates in different colors and brightnesses. -- I don't remember how many stars there were, but it was slow both to create and to delete.

    The process somehow gets slower when you keep adding objects to the scene. Allocating all the available memory for use does not really change that.

    To verify, try running this script a few times in a row and see what happens to the time. Then delete what it created, change amount to the sum of the previous runs and run it again. Let's see if your results are anything like mine. :)

    // Create <amount> spheres and
    // measure the time used on it.
    
    sph = new Sphere(0.02,0.02,0.02);
    amount = 2000;
    
    rnd = new Random(0);
    
    y = -0.5;
    t0 = System.currentTimeMillis();
    for (i = 0; i < amount; i++)
    {
        x = rnd.nextDouble()*10.0-5.0;
        z = rnd.nextDouble()*10.0-5.0;
        window.addObject(sph, new CoordinateSystem(new Vec3(x,y,z), 0,0,0), "sphere_"+i, null);
    }
    t1 = System.currentTimeMillis();
    
    t = (t1-t0)/1000.0; // ms to s
    tpp = t/amount;
    
    println(t + " s\t" + amount + " pcs\t" + tpp + " s/pc");
    
     

    Last edit: Pete 2017-09-14
  • Peter Eastman

    Peter Eastman - 2017-09-14

    Here are my results from running it five times:

    2.733 s 2000 pcs    0.0013665 s/pc
    4.223 s 2000 pcs    0.0021115 s/pc
    5.597 s 2000 pcs    0.0027985 s/pc
    6.88 s  2000 pcs    0.00344 s/pc
    7.757 s 2000 pcs    0.0038785 s/pc
    

    It gets slower as the number of objects in the scene increases, but nothing dramatic. Deleting all of them then takes about 30 seconds.

     
  • Luke S

    Luke S - 2017-09-14

    Peter, you must have really good single-thread performance. My typical run looks more like this:

    7.36 s      2000 pcs    0.00368 s/pc
    8.625 s     2000 pcs    0.0043125 s/pc
    11.078 s    2000 pcs    0.005539 s/pc
    14.266 s    2000 pcs    0.007133 s/pc
    16.625 s    2000 pcs    0.0083125 s/pc
    19.407 s    2000 pcs    0.0097035 s/pc
    
    
    77 s total
    
    28 s to select all
    2:30 to delete
    
    44 s for 12000 items in a single script run.
    

    Right now I'm running the Array Tool. 2 000 items take ~3 seconds. 12 000 items have been running for ~20 min now and seem to have stalled? It's not a memory issue.

    The thing to keep in mind about the script results is that the objects are being added one at a time through the API. Each addObject() call adds the object to the itemTree as well, which causes a full update of the TreeList state, including a repaint. I suspect that a lot of the measured time is spent in these stages.

    Pete and I are using Windows, so by default Java tries to accelerate UI elements through DirectX, which is sometimes... icky. I'll have to do a little more profiling to see if that is a factor.

    Perhaps we should consider 'bulk add' options for the API?
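
    Something like this, roughly; a purely hypothetical sketch, where SceneLike and the method names are made up and not the real API:

    import java.util.List;
    
    // Hypothetical sketch only: batch the model changes, then do the expensive
    // UI work (tree rebuild + repaint) once at the end instead of once per object.
    class BulkAddSketch
    {
        interface SceneLike
        {
            void addObjectQuietly(Object obj);  // add to the scene, no UI update
            void rebuildItemTree();             // one TreeList update for the batch
            void repaintViews();                // one repaint for the batch
        }
    
        static void addObjects(SceneLike scene, List<Object> objects)
        {
            for (Object obj : objects)
                scene.addObjectQuietly(obj);
            scene.rebuildItemTree();
            scene.repaintViews();
        }
    }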

     
  • Peter Eastman

    Peter Eastman - 2017-09-14

    That suggests an easy solution: defer updating the UI until later, and use an ActionProcessor to discard redundant updates. Let me try that and see how it works.
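
    The idea, roughly, as a generic sketch of the coalescing pattern (not the actual ActionProcessor code):

    import java.util.concurrent.atomic.AtomicBoolean;
    import javax.swing.SwingUtilities;
    
    // A generic illustration of "discard redundant updates": many requests may
    // come in, but only one deferred update actually runs per batch.
    class CoalescingUpdater
    {
        private final AtomicBoolean updatePending = new AtomicBoolean(false);
        private final Runnable update;
    
        CoalescingUpdater(Runnable update)
        {
            this.update = update;
        }
    
        // Call this as often as you like; redundant calls are dropped until the
        // deferred update has actually run on the event dispatch thread.
        void requestUpdate()
        {
            if (updatePending.compareAndSet(false, true))
                SwingUtilities.invokeLater(() -> {
                    updatePending.set(false);
                    update.run();
                });
        }
    }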

     
  • Peter Eastman

    Peter Eastman - 2017-09-14

    Works beautifully:

    0.176 s 2000 pcs    0.000088 s/pc
    0.302 s 2000 pcs    0.000151 s/pc
    0.487 s 2000 pcs    0.0002435 s/pc
    0.652 s 2000 pcs    0.000326 s/pc
    0.838 s 2000 pcs    0.000419 s/pc
    

    Now to look into how to optimize deleting.

     
  • Peter Eastman

    Peter Eastman - 2017-09-14

    I just posted a pull request that should make things a lot faster when working with very large numbers of objects: https://github.com/ArtOfIllusion/ArtOfIllusion/pull/59.

     
  • Pete

    Pete - 2017-09-14

    Great!

    I tested it and it does just that. The trend of slowing down seems stronger in my list than in yours.

    0.109 s 2000 pcs    0.0000545 s/pc
    0.297 s 2000 pcs    0.0001485 s/pc
    0.5 s   2000 pcs    0.00025 s/pc
    0.672 s 2000 pcs    0.000336 s/pc
    0.907 s 2000 pcs    0.0004535 s/pc
    

    That's my home PC. Earlier my results at home were quite similar to Luke's. At work, with a brand new (brought to me yesterday *) PC with a 4 GHz processor, they were closer to Peter's.

    One thing I noticed is that creating 10000 spheres at once used to take about 20% longer than creating 5 x 2000. Now that difference seems to be gone. Actually, with some 10000 spheres, selecting and deleting both take about 3 times as long as creating them, selecting being a bit faster. With just 1000-2000 you hardly notice the time.

    ( * Actually, one reason why I started thinking about the multithreading thing was a piece of advice I got while drafting the spec for the mentioned computer: "Put the money into clock speed, the software can't utilize multiple cores..." )

     
  • Luke S

    Luke S - 2017-09-16

    Looking like a great improvement in that use case.

     
