Python CPU vs GPU (CUDA)


Hi, I'm a long-time Core4D user, but I've never posted anything before, so I'll introduce myself here and apologize for my bad English. My name is Gianluca (Abetred). I am present on other platforms, and I consider myself a problem solver and an experimenter in the fields of Cinema 4D, XPresso, Thinking Particles and Python. I have an unhealthy passion for particle systems and related systems. I would like to introduce some experimental projects that I have been working on for some time, in which I force the use of CUDA acceleration, through third-party libraries, for computationally intensive calculations that are usually left to the CPU.


I would like to point out that the tests use pure Python code and that only the datasets passed to the CUDA function are accelerated; the Python code itself remains single-core and unfortunately has its limits. The first project (test) is real-time 3D flocking of interactive TP entities. The attached video is not the final result (which performs much better) but an intermediate stage.

Flocking simulation parameters:
Separation: Boids avoid mutual collisions by maintaining a pre-established minimum distance.
Alignment: Boids tend to match the speed and direction of their neighbors.
Cohesion: Boids cluster toward their group's center of mass.
Boundaries: Boids remain within the simulation domain with natural behavior.
Speed Limits: Speed is limited within a preset range for realistic movement.
Prey: Boids can track moving targets relative to distance.
Obstacles: Boids interact with obstacles in a realistic way.
All parameters are adjusted by response factors to recreate a smooth reaction.
Further features are possible, for greater realism of the simulation: ray tracking for complex obstacles, which predicts the most reliable escape route while avoiding collisions; direction of travel, which keeps an axis of rotation aligned with the direction of acceleration; and a field of view, which limits interactions to the boids visible within a circular sector.
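
For readers who want to experiment with the rules listed above, here is a minimal CPU-only sketch of separation, alignment, cohesion and an upper speed limit in plain Python. It is not the CUDA code used in the test: the function name `flock_step`, the weights, the radii and the time step are all hypothetical, and boundaries, prey and obstacles are left out.

```python
import math

def flock_step(positions, velocities, dt=0.04, radius=5.0, min_dist=1.0,
               w_sep=1.5, w_ali=1.0, w_coh=1.0, max_speed=10.0):
    """Advance every boid by one step and return (new_positions, new_velocities).

    positions, velocities: lists of (x, y, z) tuples.
    All default values are illustrative assumptions, not the test's settings.
    """
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        sep = [0.0] * 3      # separation push away from too-close neighbors
        vel_sum = [0.0] * 3  # for alignment: sum of neighbor velocities
        pos_sum = [0.0] * 3  # for cohesion: sum of neighbor positions
        count = 0
        for j, (q, u) in enumerate(zip(positions, velocities)):
            d = math.dist(p, q)
            if i == j or d >= radius:
                continue
            count += 1
            for a in range(3):
                vel_sum[a] += u[a]
                pos_sum[a] += q[a]
                if 0.0 < d < min_dist:
                    # Push grows linearly as the neighbor gets closer.
                    sep[a] += (p[a] - q[a]) / d * (min_dist - d)
        acc = [w_sep * s for s in sep]
        if count:
            for a in range(3):
                acc[a] += w_ali * (vel_sum[a] / count - v[a])  # match neighbors
                acc[a] += w_coh * (pos_sum[a] / count - p[a])  # steer to center
        vel = [v[a] + acc[a] * dt for a in range(3)]
        speed = math.sqrt(sum(c * c for c in vel))
        if speed > max_speed:
            vel = [c * max_speed / speed for c in vel]  # clamp to max_speed
        new_vel.append(tuple(vel))
        new_pos.append(tuple(p[a] + vel[a] * dt for a in range(3)))
    return new_pos, new_vel
```

The inner loop compares every boid with every other boid, so the cost grows as N²; that all-pairs distance work is the kind of data-parallel calculation that is a natural candidate for a GPU kernel.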

 

The second test is developed around a surface of points simulating a moving fluid...

Some significant data on the scene:
- The "liquid surface" is a mesh composed of 40,400 points, without faces or segments.
- The spoon interacts with the surface through a selection of about 600 vertices (there could be many more without putting an excessive burden on performance).
- The VertexMap is generated for visual feedback only; the actual calculation is handled internally by the code.

The code is executed in real time by a Python Generator. It manages the interaction between the two objects and calculates both the effect of the waves in the fluid and the positions of the points.
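
The post does not say how the waves are computed. One common way to get this kind of behavior is an explicit step of the 2D wave equation on a height field, sketched below in plain Python. This is an assumption for illustration only: the function name `wave_step`, the regular grid layout and the constants are hypothetical and are not taken from the scene.

```python
def wave_step(height, prev, c2=0.25, damping=0.99):
    """One explicit step of a damped 2D wave equation on a regular grid.

    height, prev: current and previous height fields (lists of rows).
    c2 is (wave_speed * dt / cell_size) ** 2; keep it <= 0.5 for stability.
    damping < 1 slowly removes energy. Border cells are held at zero.
    Returns the next height field as a new list of rows.
    """
    rows, cols = len(height), len(height[0])
    nxt = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Discrete Laplacian from the four direct neighbors.
            lap = (height[y - 1][x] + height[y + 1][x] +
                   height[y][x - 1] + height[y][x + 1] - 4.0 * height[y][x])
            nxt[y][x] = (height[y][x]
                         + damping * (height[y][x] - prev[y][x])
                         + c2 * lap)
    return nxt

# Usage: disturb one cell (standing in for the spoon) and run a few steps.
size = 9
current = [[0.0] * size for _ in range(size)]
current[4][4] = 1.0
previous = [row[:] for row in current]
for _ in range(3):
    current, previous = wave_step(current, previous), current
```

Each cell depends only on its own history and its four neighbors, so every cell can be updated independently within a step.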

This test is also still fully experimental.

 

Note that in the following video the CUDA acceleration applies to the first part, the one labeled "raw simulation". Just applying the data to the VertexMap already slows down performance, because that step is handled by the CPU.

 

 

 

But let's go further...
The benefit is evident not only for simulation in the viewport. The real advantage lies in building the cache (for example an Alembic export, or simply writing the dataset to a file), which is also computed on the GPU and is therefore a real time saver.

 

Thanks for your attention, I will try to keep this topic updated...


Thank you Stefano, your words fill my heart with pride and my mind with ideas!

11 hours ago, Hrvoje said:

Would love to see an update for this topic

I promise I'll go into more detail, progressively presenting tests and the improvements achieved.

 


Before moving on to anything else, I wanted to share this soft repulsion test:

The "particles", which, I want to remind you, are mesh points, seek a position while avoiding collisions with each other.

 

It is based on the first test (TP Real-Time 3D flocking on GPU), in which, however, I used Thinking Particles and the dataset was not dynamic.

 

Here the code generates 30 particles per frame, for a total of 15,000 particles. The CUDA function calculates each particle's position relative to the others and generates a soft repulsion. It also calculates the reaction to the presence of a pivot.
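
As an illustration of the idea described above, here is a small CPU-only sketch of a soft repulsion pass with an optional pivot, in plain Python. It is a guess at the general technique, not the actual CUDA function: the names `soft_repulsion` and `_push`, the linear falloff and all default values are hypothetical.

```python
import math

def _push(p, q, radius, strength):
    """Displacement that moves p away from q when they are closer than radius."""
    d = math.dist(p, q)
    if d == 0.0 or d >= radius:
        return (0.0, 0.0, 0.0)
    # Length of the push is strength * (radius - d): zero at the radius,
    # largest when the two points almost touch.
    scale = strength * (radius - d) / d
    return tuple((p[a] - q[a]) * scale for a in range(3))

def soft_repulsion(points, radius=1.0, strength=0.5,
                   pivot=None, pivot_radius=2.0):
    """Return new positions after one repulsion pass (input is not modified).

    With strength=0.5 an isolated pair ends up exactly `radius` apart,
    because each point covers half of the missing distance.
    If a pivot is given, points inside pivot_radius are pushed to its surface.
    """
    moved = []
    for i, p in enumerate(points):
        total = [0.0, 0.0, 0.0]
        for j, q in enumerate(points):
            if i != j:
                dp = _push(p, q, radius, strength)
                for a in range(3):
                    total[a] += dp[a]
        if pivot is not None:
            dp = _push(p, pivot, pivot_radius, 1.0)
            for a in range(3):
                total[a] += dp[a]
        moved.append(tuple(p[a] + total[a] for a in range(3)))
    return moved
```

Lower values of `strength` spread the correction over several frames, which is what makes the repulsion look soft rather than like a hard collision.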

 

I tried to keep the code lightweight to show "clean" performance (as much as possible).

 

In conclusion, we can see that the GPU works in the same way for still export (or any other case of creating a cache).

 

 

 

I'm still working with NumPy (it will be completely replaced by CuPy in the future), and I'm experimenting with CUDA block synchronization and shared memory.

  • 4 weeks later...

The more I learn about CUDA, how to exploit its potential and what its limits are in the Python C4D context, the more the number of particles I can involve in the simulation grows.
This scenario describes a simulation of interactions between particles subject to a "doped" (modified) law of universal gravitation in a 3D (or 2D) grid, in which each particle interacts only with its neighbors. This brings the computational complexity from O(N²) down to O(N·k), where k is the number of particles contained in the 26 adjacent cells plus one (the cell containing the particle under examination) for a 3D grid, or in the 8 adjacent cells plus one for a 2D grid.
In this video you can't see real-time performance in the viewport, but you can get an idea of what can be achieved with the code I'm working on.
Let yourself be enchanted by the "physics" of particles when there are so many of them...
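
The grid-based neighbor lookup described above can be sketched in plain Python as follows. This is a CPU-only illustration under stated assumptions, not the CUDA implementation: the function names, the unit masses, the `softening` term and the plain inverse-square force are all hypothetical, and the "doped" variant of the law is not reproduced.

```python
import math
from collections import defaultdict
from itertools import product

def build_grid(points, cell_size):
    """Map each integer cell coordinate to the indices of the points inside it."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(math.floor(c / cell_size) for c in p)].append(idx)
    return grid

def neighbors(points, grid, idx, cell_size):
    """Indices of the points in the cell of points[idx] and its 26 neighbors."""
    cx, cy, cz = (math.floor(c / cell_size) for c in points[idx])
    found = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):  # 27 cells in total
        for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
            if j != idx:
                found.append(j)
    return found

def accelerations(points, cell_size, g=1.0, softening=0.01):
    """Gravity-like acceleration on each point from nearby points only.

    Assumes unit masses; `softening` avoids division by zero at tiny distances.
    """
    grid = build_grid(points, cell_size)
    result = []
    for i, p in enumerate(points):
        acc = [0.0, 0.0, 0.0]
        for j in neighbors(points, grid, i, cell_size):
            q = points[j]
            r2 = sum((q[a] - p[a]) ** 2 for a in range(3)) + softening
            inv = g / (r2 * math.sqrt(r2))  # 1 / r^3, so (q - p) * inv ~ 1 / r^2
            for a in range(3):
                acc[a] += (q[a] - p[a]) * inv
        result.append(tuple(acc))
    return result
```

Building the grid is a single pass over the points, and each particle then inspects only 27 cells, which is where the O(N·k) cost comes from. The trade-off is that any influence from particles beyond the neighboring cells is ignored.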

 

 

  • 2 weeks later...
  • 2 months later...

While studying the fascinating concept of density (gradient), I thought about applying what I learned to the previous project, and I must say that I really like the result.
It is still at a very early stage: in this test I tried to obtain a kind of fluid dynamics that is efficient from a performance point of view, feeding it "only" 20,000 particles. Currently the code produces good fluidity (with adjustable viscosity) and a hint of surface tension. In the future I will try to refine the system!
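
The post does not show how density and its gradient are evaluated. A common particle-based approach is to sum a smoothing kernel over nearby particles, sketched below in plain Python. Everything here is an assumption for illustration: the function name `density_and_gradient`, the smoothing length `h` and the simple unnormalized kernel W(r) = (1 - r/h)² are hypothetical choices, not the author's.

```python
import math

def density_and_gradient(points, h=1.0):
    """Estimate density and its spatial gradient at every particle.

    Uses the unnormalized kernel W(r) = (1 - r/h)^2 for r < h, else 0.
    Returns a list of (density, (gx, gy, gz)) tuples, one per particle.
    """
    results = []
    for i, p in enumerate(points):
        rho = 1.0                # the particle's own contribution, W(0) = 1
        grad = [0.0, 0.0, 0.0]
        for j, q in enumerate(points):
            if i == j:
                continue
            r = math.dist(p, q)
            if r >= h:
                continue
            w = 1.0 - r / h
            rho += w * w
            if r > 0.0:
                # dW/dr = -2 (1 - r/h) / h, applied along the unit vector q -> p.
                dw = -2.0 * w / h
                for a in range(3):
                    grad[a] += dw * (p[a] - q[a]) / r
        results.append((rho, tuple(grad)))
    return results
```

The gradient points toward more crowded regions, so accelerating each particle against it (scaled by a stiffness constant) spreads the particles out in a pressure-like way. In a larger simulation the all-pairs loop would normally be limited to nearby particles, for example with a cell grid.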

 

 
