Saturday, March 31, 2012

Issues with porting SteerSuite to CUDA and some possible solutions

I have been trying to port the SteerSuite code to CUDA for about a week. The largest obstacle is that the data structures are referenced across several classes.

In the class GridDatabase2D, the pointer _basePtr points to the overall database which stores every single item we need to draw and move, and the pointer _cells points to a GridCell array which stores the pointers of all the items according to those items' location.

But in the class SimulationEngine, there is a list of agent pointers and a list of obstacle pointers, which are the basic items in the whole database. The updates are done via the list of agent pointers. In the agent module, there is a global pointer to the GridDatabase2D, so each agent can update the database.

Adding items, either agents or obstacles, works differently for each kind. In TestCasePlayerModule, the test case file is read and parsed, then obstacles and agents are added into _obstacles and _agents respectively. But agents are added into the database in a different way from the obstacles: via the createAgent method in SimulationEngine, by resetting each agent.

The interplay among those classes complicates the porting work, so I am planning not to reuse the updateAI code directly but to copy the underlying data, _basePtr and _cells, out, process them on the GPU, and copy them back afterwards. That is quite a lot of work, but it seems to be the only way unless I can get a better graphics card which supports CUDA with compute capability 2.x. Then I could reuse the C++ code directly, if CUDA fully supports the C++ features used in SteerSuite.
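To make the copy-out and copy-back idea concrete, here is a minimal sketch in plain C++. Agent and AgentRecord are hypothetical stand-ins I made up for illustration, not the real SteerSuite classes, and the coordinate names are an assumption. The point is only that the state behind a list of pointers is gathered into one contiguous array, which is the kind of buffer cudaMemcpy can transfer, and scattered back afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SteerSuite agent; the real class holds more state.
struct Agent {
    float x, z;    // 2D position (coordinate naming is an assumption)
    float vx, vz;  // 2D velocity
};

// Plain record with no pointers or virtual functions, so an array of these
// is one contiguous block of memory.
struct AgentRecord {
    float x, z, vx, vz;
};

// Copy-out step: gather the state behind each agent pointer into one array.
std::vector<AgentRecord> gather(const std::vector<Agent*>& agents) {
    std::vector<AgentRecord> records;
    records.reserve(agents.size());
    for (const Agent* a : agents) {
        records.push_back(AgentRecord{a->x, a->z, a->vx, a->vz});
    }
    return records;
}

// Copy-back step: write each record to the agent it was gathered from.
// Record i belongs to agents[i], so the order of the list must not change
// between gather and scatter.
void scatter(const std::vector<AgentRecord>& records,
             const std::vector<Agent*>& agents) {
    assert(records.size() == agents.size());
    for (std::size_t i = 0; i < agents.size(); ++i) {
        agents[i]->x = records[i].x;
        agents[i]->z = records[i].z;
        agents[i]->vx = records[i].vx;
        agents[i]->vz = records[i].vz;
    }
}
```

The GPU processing would sit between gather and scatter and work on the record array only.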

Wednesday, March 28, 2012

SteerSuite underlying data structure analysis

Before porting SteerSuite to CUDA, I have to fully understand how SteerSuite works.

From http://www.magix.ucla.edu/steersuite/, you can download the code.

There are seven projects in the SteerSuite solution: pprAI, simpleAI, steerlib, steersim, steerbench, steertool, and glfw. Most of the work will rely on the first three projects, which include the underlying data structure storing agents and obstacles, and the logic by which each agent moves with the help of the AI.

First of all, all the data are stored in the GridDatabase2D class, in the form of an array of SpatialDatabaseItemPtr. Each item can be either an agent or an obstacle, depending on the initial options. Besides this array, another array of GridCell indexes the same database, but holds only pointers to the original data. The whole geometry is divided into this array of GridCell, and agents and obstacles that are close enough to each other have their pointers stored in the same GridCell.
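As a rough picture of that layout, here is a small uniform-grid sketch in C++. It is not the GridDatabase2D code: Item, Cell, and Grid are hypothetical names, and each item is stored in exactly one cell, which is a simplification I am assuming for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a spatial database item (agent or obstacle).
struct Item {
    float x, z;  // 2D position (coordinate naming is an assumption)
};

// A cell holds only pointers; the items themselves live elsewhere.
struct Cell {
    std::vector<const Item*> items;
};

class Grid {
public:
    Grid(float originX, float originZ, float cellSize, int cellsX, int cellsZ)
        : originX_(originX), originZ_(originZ), cellSize_(cellSize),
          cellsX_(cellsX), cellsZ_(cellsZ),
          cells_(static_cast<std::size_t>(cellsX) * cellsZ) {}

    // Map a position to a row-major cell index, or -1 if it is outside the grid.
    int cellIndex(float x, float z) const {
        int cx = static_cast<int>(std::floor((x - originX_) / cellSize_));
        int cz = static_cast<int>(std::floor((z - originZ_) / cellSize_));
        if (cx < 0 || cx >= cellsX_ || cz < 0 || cz >= cellsZ_) return -1;
        return cz * cellsX_ + cx;
    }

    // Store a pointer to the item in the cell that contains its position.
    bool add(const Item* item) {
        int idx = cellIndex(item->x, item->z);
        if (idx < 0) return false;
        cells_[idx].items.push_back(item);
        return true;
    }

    // idx must be a valid index returned by cellIndex.
    const std::vector<const Item*>& itemsInCell(int idx) const {
        return cells_[idx].items;
    }

private:
    float originX_, originZ_, cellSize_;
    int cellsX_, cellsZ_;
    std::vector<Cell> cells_;
};
```

Items that are close together land in the same cell, so a neighbor query only has to look at a few cells instead of the whole database.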

At every time step, each agent is updated by the corresponding AI, either simpleAI or pprAI. First, the agent reads its neighborhood information from the database. Then it does the corresponding computation. Finally, the new position is written back to the agent, and the contents of the affected GridCells are updated to reflect the agent's change of position.
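The last part of that update, keeping the cells in sync with a moved agent, could look roughly like the sketch below. The names Agent, Cell, cellOf, and moveAgent are hypothetical, and I am assuming non-negative coordinates that stay inside the grid and one cell per agent.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an agent.
struct Agent {
    float x, z;  // 2D position (coordinate naming is an assumption)
};

typedef std::vector<Agent*> Cell;

// Row-major cell index; assumes x and z are non-negative and inside the grid.
int cellOf(float x, float z, float cellSize, int cellsX) {
    return static_cast<int>(z / cellSize) * cellsX + static_cast<int>(x / cellSize);
}

// Write the new position to the agent and, if it crossed a cell boundary,
// move its pointer from the old cell to the new one.
void moveAgent(Agent& a, float newX, float newZ,
               std::vector<Cell>& cells, float cellSize, int cellsX) {
    int oldIdx = cellOf(a.x, a.z, cellSize, cellsX);
    int newIdx = cellOf(newX, newZ, cellSize, cellsX);
    a.x = newX;
    a.z = newZ;
    if (oldIdx == newIdx) return;  // same cell, nothing to re-file
    Cell& from = cells[oldIdx];
    from.erase(std::remove(from.begin(), from.end(), &a), from.end());
    cells[newIdx].push_back(&a);
}
```

Two agents moving into the same cell would both push into the same vector, so this step needs care if the updates ever run in parallel.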

Examining the simpleAI, pprAI, and steerlib projects shows that SpatialDatabaseItem has several subclasses, including SimpleAgent, PPRAgent, BoxObstacle, CircleObstacle, and OrientedBoxObstacle. For now I will only take care of SimpleAgent.


In SimpleAgent, the class members are _position, _velocity, _forward, _enabled, _radius, and _goalQueue. Because some of them use SteerSuite's own types, like Util::Point and Util::Vector, or standard template library containers, like std::queue<SteerLib::AgentGoalInfo>, I have to use float3 and a plain array instead.
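Here is a sketch of what such a flattened agent could look like. Everything in it is an assumption for illustration: Float3 stands in for CUDA's float3 so the code compiles without the CUDA headers, Goal is a reduced version of the goal info with just a target, and kMaxGoals is an upper bound I picked arbitrarily.

```cpp
#include <cassert>
#include <queue>
#include <type_traits>

// Stand-in for CUDA's float3 so this compiles without the CUDA headers.
struct Float3 { float x, y, z; };

// Hypothetical, reduced goal record; the real goal info has more fields.
struct Goal { Float3 target; };

const int kMaxGoals = 8;  // assumed upper bound on goals per agent

// Flat agent record: fixed size, no pointers, no STL containers.
struct DeviceAgent {
    Float3 position;
    Float3 velocity;
    Float3 forward;
    float radius;
    int enabled;            // int instead of bool to keep the layout simple
    Goal goals[kMaxGoals];  // fixed-size array replacing std::queue
    int goalHead;           // index of the current goal
    int goalCount;          // number of goals still pending
};

static_assert(std::is_trivially_copyable<DeviceAgent>::value,
              "DeviceAgent must be safe to copy byte by byte");

// Copy goals from a queue into the fixed array, in order.
// Returns how many goals did not fit and were dropped.
int packGoals(std::queue<Goal> goals, DeviceAgent& out) {
    out.goalHead = 0;
    out.goalCount = 0;
    while (!goals.empty() && out.goalCount < kMaxGoals) {
        out.goals[out.goalCount++] = goals.front();
        goals.pop();
    }
    return static_cast<int>(goals.size());
}

bool hasGoal(const DeviceAgent& a) { return a.goalCount > 0; }

// Only valid while hasGoal(a) is true.
Goal currentGoal(const DeviceAgent& a) { return a.goals[a.goalHead]; }

// Advance to the next goal; only valid while hasGoal(a) is true.
void popGoal(DeviceAgent& a) {
    ++a.goalHead;
    --a.goalCount;
}
```

The fixed array wastes some memory per agent, but it keeps every agent the same size, which makes the whole agent list a single copyable buffer.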

The basic routine will be: first copy the data from host to device, then update each agent in parallel on the device side, and finally copy the modified data back.
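The shape of that routine can be sketched on the CPU first. In the sketch below, AgentState, updateAgent, and step are hypothetical names, and the update itself is just a straight-line move, not any SteerSuite AI. The per-agent function is written the way a kernel body would be: one call handles one agent, reads only from the input buffer, and writes only its own output slot. In a real CUDA version it would be marked __global__, the index would come from blockIdx.x * blockDim.x + threadIdx.x, and the two vector copies would be cudaMemcpy calls.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flat agent state.
struct AgentState {
    float x, z;    // 2D position (coordinate naming is an assumption)
    float vx, vz;  // 2D velocity
};

// Kernel-shaped update for agent i. It reads only from `in` and writes only
// out[i], so calls for different i do not depend on each other.
void updateAgent(int i, const AgentState* in, AgentState* out, int n, float dt) {
    if (i >= n) return;  // same bounds guard a CUDA kernel would need
    AgentState s = in[i];
    s.x += s.vx * dt;    // placeholder motion, not a steering algorithm
    s.z += s.vz * dt;
    out[i] = s;
}

// CPU stand-in for the three steps of the routine.
std::vector<AgentState> step(const std::vector<AgentState>& host, float dt) {
    std::vector<AgentState> deviceIn(host);          // stands in for host-to-device copy
    std::vector<AgentState> deviceOut(host.size());
    int n = static_cast<int>(host.size());
    for (int i = 0; i < n; ++i) {                    // stands in for the kernel launch
        updateAgent(i, deviceIn.data(), deviceOut.data(), n, dt);
    }
    return deviceOut;                                // stands in for device-to-host copy
}
```

Keeping separate input and output buffers means every agent sees the same snapshot of the previous step, regardless of the order in which the updates run.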

If this method does not work well, it may be worth trying to rewrite the database part completely in CUDA. Let us see.

Sunday, March 18, 2012

Integrate CUDA into SteerSuite

SteerSuite is a pure C++ project, and I need to do something in CUDA within it. So the first thing to do is to make CUDA code run with this project.

By referring to this post: http://www.ademiller.com/blogs/tech/2011/05/visual-studio-2010-and-cuda-easier-with-rc2/, I successfully made CUDA run with SteerSuite, or more accurately, with steerlib, because I plan to optimize data retrieval in GridDatabase first.

Integrating CUDA code into an existing VS project is fairly easy, and it can be done in the following steps:

1) Select the project in the Solution Explorer (here I choose steerlib), and then select the Project--Build Customization menu. In the dialog, check CUDA 4.0 targets.

2) Then right-click on the .cu file and select Properties. Make sure that in Configuration Properties--General, "Item type" is CUDA C/C++.


3) Make sure that the NVCC CUDA compiler targets the same platform as your original project, either Win32 or x64.
In the project's Properties, open Configuration Properties--CUDA C/C++ and check that "Target Machine Platform" is correct. Here my target is Win32.

4) Open Configuration Properties--Linker--Input, and add cudart.lib to the list in "Additional Dependencies". Do not forget to use a semicolon (;) to separate the items.


Now you are all done. Just rebuild your project. But wait, how do you use the CUDA code from the existing source code?
You just need to decorate your functions in the .cu file with extern "C" and call them from your original project code. Do not forget to declare each function, in the same extern "C" form, at the head of the file in which you use it.
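A minimal sketch of that pattern follows. scaleArray is a hypothetical function name, and both halves are shown in one block only so that it compiles on its own with a plain C++ compiler. In the real project the definition would live in the .cu file and the declaration in the .cpp file that calls it. The loop body is a CPU placeholder for what would be a device copy and a kernel launch.

```cpp
#include <cassert>

// --- Caller side (the existing .cpp file) ---
// Declaration in the same extern "C" form as the definition, so both sides
// agree on an unmangled symbol name.
extern "C" void scaleArray(float* data, int n, float factor);

// --- Definition side (would go in the .cu file) ---
// Placeholder body: the real version would copy `data` to the device,
// launch a kernel, and copy the result back.
extern "C" void scaleArray(float* data, int n, float factor) {
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Ordinary C++ code calls the function like any other.
float scaledSum(float* data, int n, float factor) {
    scaleArray(data, n, factor);
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += data[i];
    return sum;
}
```

Putting the declaration in a small shared header, included by both the .cu and the .cpp file, would be one way to keep the two in sync.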

I followed the steps above and ran a simple function in a .cu file. Hopefully more complicated ones will work as well.

Wednesday, March 14, 2012

Crowd Simulation with SteerSuite starts


Crowd simulation is the process of simulating the movement of a large number of entities or characters, now often appearing in 3D computer graphics for film. While simulating these crowds, observed human behavior interaction is taken into account, to replicate the collective behavior. 

In the final project, I am going to use SteerSuite, which was developed at UCLA, to show my crowd simulation. SteerSuite provides a framework for developing AI for steering objects in a crowd simulation. In the project, I will convert the built-in pprAI, which is based on the paper “A modular framework for adaptive agent-based steering”, from CPU-based code to GPU-based code, and observe the performance boost brought by the GPU acceleration.

In SteerSuite, there are mainly three steps for each agent to move towards its goal. First, the agent reads environment data from the database; this stage is also called the perception phase. Then the agent analyzes the situation based on the received environment data; this stage can be subdivided into a prediction phase and a reactive phase, which calculate possible collisions and steering respectively. Finally, the steering result is written back to the database, and the movement is rendered by the graphics library.

Parallelism can be exploited in each of the three steps described above. We can read the environment data in parallel, do the analysis simultaneously, and write the data back in parallel. I will apply these parallelizations one after another and measure the acceleration brought by each change.
At the end of this project, we should expect to see a certain amount of speed-up of the GPU over the CPU.

 
References:

Crowd Simulation: http://en.wikipedia.org/wiki/Crowd_simulation
SteerSuite: http://www.magix.ucla.edu/steersuite/
pprAI algorithm: http://dl.acm.org/citation.cfm?id=1944769