Drawing 2.7 billion Points in 10s
When I was reading this blog post about visualizing 2.7 billion points in a minute on a macbook, I got curious how fast this would be in Julia.
Not only for fun, but also to integrate it into my plotting library (MakiE). Since I’ve been very happy at how quickly I was able to create a very fast solution, I decided to share my experience!
To get started, one can download the data from http://planet.osm.org/gps/
Parsing the CSV
I'm assuming that one wants to do lots of operations on the data, so the most sensible thing to do is to convert the 55gb CSV to a 22gb binary blob, which can get memory mapped and doesn't need parsing!
There wasn't any tool around that could do this out of the box, but with the help of TextParser.jl I was able to whip up a fairly fast custom solution - which also demonstrates how nicely one can extend existing Julia libraries.
The first problem is, that TextParser expects a string to parse the CSV. Since I can't just load the whole 55gb dataset into RAM, our first trick is to create a memory mapped string type in Julia:
struct MMapString{T}
# using Type parameter to not write out the result type of mmap
# but I also don't want to loose performance
x::T
end# Constructor from a path
MMapString(path::String) = MMapString(Mmap.mmap(open(path), Vector{UInt8}))
# we only need to overload the iteration protocol for TextParse to work!
Base.getindex(x::MMapString, i) = Char(x.x[i])
Base.next(x::MMapString, i) = Char(x.x[i]), i + 1
Base.done(x::MMapString, i) = i > length(x.x)
Base.start(x::MMapString, i) = 1
Base.length(x::MMapString) = length(x.x)
This is all one needs to satisfy the basic interface of most libraries expecting a string like type. Now we can write a function to read the 55gb dataset, parse it line by line and write it into a binary blob.
One of the problems that occurred was, that the function to write to an IO file stream was allocating quite a bit. Base.write creates a mutable container (Ref) for its argument to safely pass it to C. Even Base.unsafe_write that directly takes a pointer still has a check that allocates 128 bytes.
Be ready for unsafe_unsafe_write, short uuwrite. This saves us 40 seconds and allocates 3.5 gb less memory (which ends up as ~1.2x faster)!
This is actually a great example of how Julia currently fails, and how one can still claw back the performance.
We just call the C library directly and use the almost deprecated syntax (`&`) to take a pointer from the stack allocated argument.
in Julia 0.7 this is likely no issue and we should be able to just use `write`, since the compiler got a lot better at stack allocating and eliminating mutable containers.
By the way I was figuring this out with @time write(io, 1f0) which shows the allocations, and then using @which write(io, 1f0) (or alternatively @edit to jump directly into the editor) to figure out where the function is defined in Julia and where the allocations come from.
using TextParse
using TextParse: Record, Field, Numeric, tryparsenextfunction uuwrite(io, ptr::T) where T
Int(ccall(
:ios_write,
Csize_t, (Ptr{Void}, Ptr{T}, Csize_t),
io.ios, &ptr, sizeof(T))
)
endfunction save2bin(path, n = typemax(Int))
str = MMapString(path)
# Descriptor of 'num,num\n' which is the format in the csv
rec = Record((
Field(Numeric(Float32), delim = ','),
Field(Numeric(Float32), eoldelim = true)
))
# skip the header! Nice is, that Julia's findfirst works with
# any iterator
pos = findfirst(str, '\n') + 1
io = open(homedir()*"/gpspoints.bin", "w")
i = 0
while !done(str, pos) && i <= n
# tryparsenext takes roughly 35ns so it's fairly optimal
p_or_null, pos = tryparsenext(
rec, str, pos, length(str)
)
isnull(p_or_null) && continue # continue when parsing fails
p = get(p_or_null)
uuwrite(io, p)
i += 1
end
close(io)
i
end
tic()
@time save2bin(homedir() * "/Downloads/gps-points.csv");
toc()
This rewrites the file and saves it as a binary in 190 seconds, which isn't too bad for such a simple implementation and an operation that only needs to be done one time.
Drawing the Image
Now we can memory map the file as a Vector{NTuple{2, Float32}} which holds the raw gps coordinates. We can then directly loop over the points and accumulate each position in a 2D array.
"""
Transforms from longitude/latitude to pixel on screen, with `dims` refering to
the dimensions of the screen in pixel
"""
@inline function gps2pixel(point, dims)
lon, lat = point[1], point[2]
x = ((dims[1] / 180.0) * (90.0 + (lon / 10^7)))
y = ((dims[2] / 360.0) * (180.0 - (lat / 10^7)))
(x, y)
endfunction to_image_inner!(img, points, start, stop)
dims = size(img)
for i = start:stop
# don't check bounds when indexing into an array
@inbounds begin
p0 = gps2pixel(points[i], dims)
idx = Int.(round.(p0))
xidx, yidx = dims[1] - idx[1], dims[2] - idx[2]
if checkbounds(Bool, img, xidx, yidx)
# we should give each point a radius and then add
# the coverage to each pixel for a smoother image
# But this does well enough for this short example:
img[xidx, yidx] += 0.001f0 # accumulate
end
end
end
end
function to_image!(img, points, range)
N = length(range)
# Number of threads Julia has available
# Use the environment variable JULIA_NUM_THREADS=8 to start
# Julia with 8 threads
NT = Threads.nthreads()
slices = floor(Int, N / NT)
offset = minimum(range)
Threads.@threads for i = 1:NT
# @threads creates a closure, which sometimes
# confuses type inference leading to slow code.
# this is why it's a good practice to move the loop body
# behind a function barrier
# (https://docs.julialang.org/en/latest/manual/performance-tips/#kernal-functions-1)
to_image_inner!(
img, points,
offset + ((i-1) * slices + 1), offset + (i * slices)
)
end
return img
end
We can use the above function in the following way to get a simple gray scale image:
img = zeros(Float32, 600, 960)
io = open(homedir() * "/gpspoints.bin")
points = Mmap.mmap(io, Vector{NTuple{2, Float32}})
tic()
to_image!(img, points, 1:length(points))
toc()
using FileIO, Colors # save it with FileIO as a Gray image
FileIO.save("gps.png", Gray.(clamp.(1f0 .- img, 0, 1)))
Et voila!
This took 10 seconds to draw 22gb of points on a normal desktop computer.
One thing to point out is the great dot call syntax you can see in the example:
Gray.(clamp.(1f0 .- img, 0, 1)))
This is actually a wonderful mechanism to apply a function per element on an array while fusing all operations. So the above will turn into a single for loop
over img, clamping, inverting and converting it to a color in one go!
We can also use my new interactive plotting library to create a nice animation of the progress:
using MakiEresolution = (600, 960)
scene = Scene(resolution = reverse(resolution))
img = zeros(Float32, resolution)
imviz = heatmap(img, colornorm = (0, 1))
center!(scene)
# slice `points` into multiple batches which we will update
slice = 10^7
range = slice:slice:length(points)
stop = 1
vio = VideoStream(scene, homedir()*"/Desktop/", "gps")
while true
start = stop
stop = min(stop + slice, length(points))
to_image!(img, points, start:stop)
imviz[:heatmap] = img # update heatmap plot
recordframe!(vio) # add frame to stream
stop == length(points) && break
end
finish(vio, "gif") # finish streaming and export as gif!
There are of course various improvements we could apply, but I’m very happy with this as a first result. It is also not clear how comparable this solution is to the initially mentioned blog post, since I haven’t run that on my computer yet. If you want to try this on your machine, check out the full code!