Thursday, 13 March 2014

Parallel Dijkstra

Map 1

Map 2

Map 3

Map 4

Map 5

Map 6

The parallel dijkstra application has been tested on the original 5 maps used on the sequential dijkstra and the diffusion applications. For each map the parallel version found the same path as the sequential version.

Map
Parallel (ms)
Sequential (ms)
1
8.42
0.97
2
33.82
1.01
3
40.78
1.05
4
55.87
1.00
5
18.58
0.64
6
51.91
0.95
Time for each map

Looking at the times taken for each map we can see quite a large difference between the parallel and sequential times. Even between the differing maps there is a large variation for just the parallel times, whereas the sequential times remain fairly consistent at around 1ms.

This is due to main while loop of the dijkstra application, where loop until the queue is empty and all nodes have been evaluated. 

Map
Parallel
Sequential
1
32
1024
2
132
1024
3
159
1024
4
219
1024
5
72
701
6
203
1024
 Iterations to evaluate all nodes in queue

Looking at the number of iterations needed in the sequential version we can see that it remains pretty much consistent taking 1024 iterations to evaluate each node in the queue, interestingly this is the total number of nodes added to the queue as we have 32*32 nodes and we evaluate each one. (Note: map 5 we only add 701 nodes to the queue as a number of nodes are blocked.)

However looking at the parallel version the number of iterations jumps between 32 and 219 for the differing map sizes. This is due to the frontier function and the way in which the parallel application works. 


The parallel application determines which nodes in the queue can be settled at the same time and adds them to the frontier set. Any node in the queue which has a distance of less than the minimum distance + 1 will be added to the queue. 

In each of the maps a lot of nodes have varying costs, these varying costs between neighbours will effect which nodes in the queue can be added to the frontier set and settled at the same time, the more variation between nodes the more iterations it will take to find the path. In each instance the worst case should be 1024 iterations for the 32x32 grid.

This also helps to explain why we see a speed up for the larger graph sizes, while the CPU is able to process 1024 iterations faster than the GPU can handle 32; for the large graph sizes the sequential version will need to take nxn iterations so for a 2048x2048 grid this would be 4,194,304 iterations. 

Map
Parallel
Parallel Time
Sequential
Sequential Time
32x32
32
8.34
1024
1
64x64
64
16.37
4096
3.8
128x128
128
32.52
16384
16.2
256x256
256
65.31
65536
68.01
512x512
512
140.52
262144
277.07
1024x1024
1024
399.74
1048576
1146.38
2048x2048
2048
1745.79
4194304
4954.72
Iterations and time for varying grid sizes

As we can see when each node has an equal cost the parallel version only takes n iterations for an nxn grid but take n*n iterations for the sequential version. However as we can see if we were to introduce terrain costs the parallel version would likely take longer due to the frontier set where as the sequential version will always take n*n iterations where a path exists. 

No comments:

Post a Comment