Skip to content

Commit a29d07a

Browse files
angus-gmicaeljtoliveira
authored andcommitted
Drastically improve efficiency of tripole generation
In the tripole, we use the make_topo_gen routine. This was allocating two arrays with 20,000,000 elements to hold topography for the patch. The target grid is divided into blocks of 100x25 points, and for each point in these blocks, for ever block, both of the large arrays were being completely zeroed. This meant that performance was almost completely dominated by memset. Isolating to just 10 of these blocks (but including NetCDF input and output), I saw 98.2% of the CPU time spent in memset, taking over 2 minutes to process. With this patch, it takes about 5 seconds to process the blocks, where most of that time is spent in KD tree generation and NetCDF input/output. An entire invocation of gen_topo from GEBCO to a 1/10 full-globe grid went down from around 2 hours to 6 minutes.
1 parent cf776e1 commit a29d07a

File tree

1 file changed

+7
-4
lines changed

1 file changed

+7
-4
lines changed

src/gen_topo.f90

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -581,7 +581,6 @@ subroutine make_topo_gen(topo_in, x_in, y_in, topo_out, x_out, y_out, topo_all_o
581581
tree => kdtree2_create(possie, sort =.false., rearrange =.true.)
582582

583583
num_max = 20000000
584-
allocate(t_s(num_max), t_s_all(num_max))
585584

586585
allocate(results(num_max))
587586
do j = 1, jm
@@ -610,14 +609,16 @@ subroutine make_topo_gen(topo_in, x_in, y_in, topo_out, x_out, y_out, topo_all_o
610609
cycle
611610
end if
612611

612+
num_found = min(num_found, num_max)
613+
614+
allocate(t_s(num_found), t_s_all(num_found), source=0.0)
615+
613616
wet_in_poly = 0
614617
in_poly = 0
615-
t_s = 0
616-
t_s_all = 0
617618

618619
bndsx = [x_out(i, j), x_out(i+1, j), x_out(i+1, j+1), x_out(i, j+1)]
619620
bndsy = [y_out(i, j), y_out(i+1, j), y_out(i+1, j+1), y_out(i, j+1)]
620-
do n = 1, min(num_found, num_max)
621+
do n = 1, num_found
621622
ib = idx(results(n)%idx)
622623
jb = jdx(results(n)%idx)
623624
!if (topo_in(ib, jb) > 0.0) cycle
@@ -645,6 +646,8 @@ subroutine make_topo_gen(topo_in, x_in, y_in, topo_out, x_out, y_out, topo_all_o
645646
call quicksort(t_s, frst, lst)
646647
topo_med_out(i, j) = t_s(max(wet_in_poly/2, 1))
647648
end if
649+
650+
deallocate(t_s, t_s_all)
648651
end do
649652
end do
650653

0 commit comments

Comments
 (0)