Friday, April 20, 2012

Finally, SteerSuite works with the GPU

Because of the limitations of CUDA C++, I dropped the original plan of putting everything on the GPU side. Instead, I copy only the necessary information, such as position, velocity, and direction, to the GPU side.

The data structure is as follows:
typedef struct cuda_agent {
    bool   _enabled;
    float3 _position;
    float3 _velocity;
    float3 _forward;
    float  _radius;
    float3 _goalQueue[20];
    int    _goalQSize;
    int    _curGoal;
    int    _usedGoal;
    AABox  _oldBounds;
    AABox  _newBounds;
} cuda_agent;
At first, I copied the data back and forth between the CPU and GPU, but I found this very inefficient. Therefore, I
now keep the updated data only on the GPU side. After the computation has advanced through time, I copy the updated
data back to the CPU and draw it. This makes it run faster.
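
A minimal sketch of that pattern is shown here: upload once, run several update steps on data that stays on the GPU, then download once for drawing. It is not the actual SteerSuite code. SketchAgent is a hypothetical, trimmed-down stand-in for cuda_agent, and stepAgents, simulateOnGpu, and the plain Euler update are assumptions made only for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for cuda_agent, reduced to what this sketch needs.
struct SketchAgent {
    bool   enabled;
    float3 position;
    float3 velocity;
};

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Placeholder update (assumption): Euler integration, one thread per agent.
__global__ void stepAgents(SketchAgent* agents, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n || !agents[i].enabled) return;
    agents[i].position.x += agents[i].velocity.x * dt;
    agents[i].position.y += agents[i].velocity.y * dt;
    agents[i].position.z += agents[i].velocity.z * dt;
}

// Upload once, run `steps` updates on device-resident data, download once.
void simulateOnGpu(std::vector<SketchAgent>& agents, int steps, float dt) {
    if (agents.empty()) return;
    const int n = static_cast<int>(agents.size());
    const size_t bytes = agents.size() * sizeof(SketchAgent);

    SketchAgent* dAgents = nullptr;
    CUDA_CHECK(cudaMalloc(&dAgents, bytes));
    CUDA_CHECK(cudaMemcpy(dAgents, agents.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    for (int s = 0; s < steps; ++s) {
        // No host<->device copy inside the loop: the data stays on the GPU.
        stepAgents<<<blocks, threads>>>(dAgents, n, dt);
        CUDA_CHECK(cudaGetLastError());
    }

    // A single copy back, e.g. right before drawing on the CPU side.
    CUDA_CHECK(cudaMemcpy(agents.data(), dAgents, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dAgents));
}
```

In a real simulation the device buffer would probably be allocated once and reused across frames rather than allocated and freed on every call, as this sketch does for brevity.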

So far, I have compared the performance of an i7-2600 and a GTX520, and they perform about the same. I guess
this is because the computing power of the GTX520 is too weak. I will rerun SteerSuite on other computers that have
better GPUs.
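
When repeating the comparison on other cards, it may help to time the kernel alone and to record which GPU was used. The following is only a sketch using CUDA events and cudaGetDeviceProperties. dummyStep is a hypothetical placeholder kernel, not the SteerSuite update, and averageKernelMs and printDeviceSummary are names made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; the real agent update would go here.
__global__ void dummyStep(float* x, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += dt;
}

// Average time per kernel launch in milliseconds, or -1 on allocation failure.
float averageKernelMs(int n, int launches) {
    float* dX = nullptr;
    if (cudaMalloc(&dX, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(dX, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    dummyStep<<<blocks, threads>>>(dX, n, 0.0f);  // warm-up launch, not timed

    cudaEventRecord(start);
    for (int s = 0; s < launches; ++s) {
        dummyStep<<<blocks, threads>>>(dX, n, 1.0f);
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until all timed launches have finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dX);
    return ms / launches;
}

// Print a few properties of device 0 that matter when comparing GPUs.
void printDeviceSummary() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return;
    std::printf("%s: %d multiprocessors, %d registers per block\n",
                prop.name, prop.multiProcessorCount, prop.regsPerBlock);
}
```

Timing with events excludes the host-to-device and device-to-host copies, so it separates kernel cost from transfer cost when comparing against the CPU.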

I think I can get better results.

1 comment:

  1. There are machines with several different cards available in the SIG lab. Also, dig deeper into your performance analysis. Is there enough parallelism? What is the computational intensity, i.e., how much compute are you doing relative to memory access? Are too many registers being used? etc.
