datashader

datashader(points::AbstractVector{<: Point})

Warning

This feature might change outside breaking releases, since the API is not yet finalized. Please be wary of bugs in the implementation and open issues if you encounter odd behaviour.

Points can be any array type supporting iteration & getindex, including memory mapped arrays. If you have separate arrays for x and y coordinates and want to avoid conversion and copy, consider using:

using Makie.StructArrays
points = StructArray{Point2f}((x, y))
datashader(points)

Do pay attention though, that if x and y don't have a fast iteration/getindex implemented, this might be slower than just copying the data into a new array.

For best performance, use method=Makie.AggThreads() and make sure to start julia with julia -tauto or have the environment variable JULIA_NUM_THREADS set to the number of cores you have.

Plot type

The plot type alias for the datashader function is DataShader.

Attributes

agg = AggCount() — Can be AggCount(), AggAny() or AggMean(). User-extensible by overloading:

struct MyAgg{T} <: Makie.AggOp end
MyAgg() = MyAgg{Float64}()
Makie.Aggregation.null(::MyAgg{T}) where {T} = zero(T)
Makie.Aggregation.embed(::MyAgg{T}, x) where {T} = convert(T, x)
Makie.Aggregation.merge(::MyAgg{T}, x::T, y::T) where {T} = x + y
Makie.Aggregation.value(::MyAgg{T}, x::T) where {T} = x

alpha = 1.0 — The alpha value of the colormap or color attribute. Multiple alphas like in plot(alpha=0.2, color=(:red, 0.5), will get multiplied.

async = true — Will calculate get_aggregation in a task, and skip any zoom/pan updates while busy. Great for interaction, but must be disabled for saving to e.g. png or when inlining in Documenter.

binsize = 1 — Factor defining how many bins one wants per screen pixel. Set to n > 1 if you want a coarser image.

colormap = @inherit colormap :viridis — Sets the colormap that is sampled for numeric colors. PlotUtils.cgrad(...), Makie.Reverse(any_colormap) can be used as well, or any symbol from ColorBrewer or PlotUtils. To see all available color gradients, you can call Makie.available_gradients().

colorrange = automatic — The values representing the start and end points of colormap.

colorscale = identity — The color transform function. Can be any function, but only works well together with Colorbar for identity, log, log2, log10, sqrt, logit, Makie.pseudolog10 and Makie.Symlog10.

depth_shift = 0.0 — adjusts the depth value of a plot after all other transformations, i.e. in clip space, where 0 <= depth <= 1. This only applies to GLMakie and WGLMakie and can be used to adjust render order (like a tunable overdraw).

fxaa = true — adjusts whether the plot is rendered with fxaa (anti-aliasing, GLMakie only).

highclip = automatic — The color for any value above the colorrange.

inspectable = true — sets whether this plot should be seen by DataInspector.

inspector_clear = automatic — Sets a callback function (inspector, plot) -> ... for cleaning up custom indicators in DataInspector.

inspector_hover = automatic — Sets a callback function (inspector, plot, index) -> ... which replaces the default show_data methods.

inspector_label = automatic — Sets a callback function (plot, index, position) -> string which replaces the default label generated by DataInspector.

interpolate = true — If the resulting image should be displayed interpolated.

local_operation = identity — Function which gets called on each element after the aggregation (map!(x-> local_operation(x), final_aggregation_result)).

lowclip = automatic — The color for any value below the colorrange.

method = AggThreads() — Can be AggThreads() or AggSerial() for threaded vs. serial aggregation.

model = automatic — Sets a model matrix for the plot. This overrides adjustments made with translate!, rotate! and scale!.

nan_color = :transparent — The color for NaN values.

operation = automatic — Defaults to Makie.equalize_histogram function which gets called on the whole getaggregation array before display (`operation(finalaggregation_result)`).

overdraw = false — Controls if the plot will draw over other plots. This specifically means ignoring depth checks in GL backends

point_transform = identity — Function which gets applied to every point before aggregating it.

show_timings = false — Set to true to show how long it takes to aggregate each frame.

space = :data — sets the transformation space for box encompassing the plot. See Makie.spaces() for possible inputs.

ssao = false — Adjusts whether the plot is rendered with ssao (screen space ambient occlusion). Note that this only makes sense in 3D plots and is only applicable with fxaa = true.

transformation = automaticNo docs available.

transparency = false — Adjusts how the plot deals with transparency. In GLMakie transparency = true results in using Order Independent Transparency.

visible = true — Controls whether the plot will be rendered or not.

Examples

Airports

using DelimitedFiles, GLMakie
# For saving/showing/inlining into documentation we need to disable async calculation.
Makie.set_theme!(DataShader = (; async=false))
airports = Point2f.(eachrow(readdlm(assetpath("airportlocations.csv"))))
fig, ax, ds = datashader(airports,
    colormap=[:white, :black],
    # use type=Axis, so that Makie doesn't need to infer
    # the axis type, which can be expensive for a large amount of points
    axis = (; type=Axis),
    # for documentation output we shouldn't calculate the image async,
    # since it won't wait for the render to finish and inline a blank image
    async = false,
    figure = (; figurepadding=0, size=(360*2, 160*2))
)
Colorbar(fig[1, 2], ds, label="Number of airports")
hidedecorations!(ax); hidespines!(ax)
fig

Strange Attractors

# Taken from Lazaro Alonso in Beautiful Makie:
# https://beautiful.makie.org/dev/examples/generated/2d/datavis/strange_attractors/?h=cliffo#trajectory
Clifford((x, y), a, b, c, d) = Point2f(sin(a * y) + c * cos(a * x), sin(b * x) + d * cos(b * y))

function trajectory(fn, x0, y0, kargs...; n=1000) #  kargs = a, b, c, d
    xy = zeros(Point2f, n + 1)
    xy[1] = Point2f(x0, y0)
    @inbounds for i in 1:n
        xy[i+1] = fn(xy[i], kargs...)
    end
    return xy
end

cargs = [[0, 0, -1.3, -1.3, -1.8, -1.9],
    [0, 0, -1.4, 1.6, 1.0, 0.7],
    [0, 0, 1.7, 1.7, 0.6, 1.2],
    [0, 0, 1.7, 0.7, 1.4, 2.0],
    [0, 0, -1.7, 1.8, -1.9, -0.4],
    [0, 0, 1.1, -1.32, -1.03, 1.54],
    [0, 0, 0.77, 1.99, -1.31, -1.45],
    [0, 0, -1.9, -1.9, -1.9, -1.0],
    [0, 0, 0.75, 1.34, -1.93, 1.0],
    [0, 0, -1.32, -1.65, 0.74, 1.81],
    [0, 0, -1.6, 1.6, 0.7, -1.0],
    [0, 0, -1.7, 1.5, -0.5, 0.7]
]

fig = Figure(size=(1000, 1000))
fig_grid = CartesianIndices((3, 4))
cmap = to_colormap(:BuPu_9)
cmap[1] = RGBAf(1, 1, 1, 1) # make sure background is white

let
    # locally, one can go pretty high with n_points,
    # e.g. 4*(10^7), but we don't want the docbuild to become too slow.
    n_points = 10^6
    for (i, arg) in enumerate(cargs)
        points = trajectory(Clifford, arg...; n=n_points)
        r, c = Tuple(fig_grid[i])
        ax, plot = datashader(fig[r, c], points;
            colormap=cmap,
            async=false,
            axis=(; type=Axis, title=join(string.(arg), ", ")))
        hidedecorations!(ax)
        hidespines!(ax)
    end
end
rowgap!(fig.layout,5)
colgap!(fig.layout,1)
fig

Bigger examples

Timings in the comments are from running this on a 32gb Ryzen 5800H laptop. Both examples aren't fully optimized yet, and just use raw, unsorted, memory mapped Point2f arrays. In the future, we'll want to add acceleration structures to optimize access to sub-regions.

14 million point NYC taxi dataset

using GLMakie, Downloads, Parquet2
bucket = "https://ursa-labs-taxi-data.s3.us-east-2.amazonaws.com"
year = 2009
month = "01"
filename = join([year, month, "data.parquet"], "/")
out = joinpath("$year-$month-data.parquet")
url = bucket * "/" * filename
Downloads.download(url, out)
# Loading ~1.5s
@time begin
    ds = Parquet2.Dataset(out)
    dlat = Parquet2.load(ds, "dropoff_latitude")
    dlon = Parquet2.load(ds, "dropoff_longitude")
    # One could use struct array here, but dlon/dlat are
    # a custom array type from Parquet2 supporting missing and some other things, which slows the whole thing down.
    # points = StructArray{Point2f}((dlon, dlat))
    points = Point2f.(dlon, dlat)
    groups = Parquet2.load(ds, "vendor_id")
end

# ~0.06s
@time begin
    f, ax, dsplot = datashader(points;
        colormap=:fire,
        axis=(; type=Axis, autolimitaspect = 1),
        figure=(;figure_padding=0, size=(1200, 600))
    )
    # make image fill the whole screen
    hidedecorations!(ax)
    hidespines!(ax)
    display(f)
end

2.7 billion OSM GPS points

Download the data from OSM GPS points and use the updated script from drawing-2-7-billion-points-in-10s to convert the CSV to a binary blob that we can memory map.

using GLMakie, Mmap
path = "gpspoints.bin"

points = Mmap.mmap(open(path, "r"), Vector{Point2f});
# ~ 26s
@time begin
    f, ax, pl = datashader(
        points;
        # For a big dataset its interesting to see how long each aggregation takes
        show_timings = true,
        # Use a local operation which is faster to calculate and looks good!
        local_post=x-> log10(x + 1),
        #=
            in the code we used to save the binary, we had the points in the wrong order.
            A good chance to demonstrate the `point_transform` argument,
            Which gets applied to every point before aggregating it
        =#
        point_transform=reverse,
        axis=(; type=Axis, autolimitaspect = 1),
        figure=(;figure_padding=0, size=(1200, 600))
    )
    hidedecorations!(ax)
    hidespines!(ax)
    display(f)
end
aggregation took 1.395s
aggregation took 1.176s
aggregation took 0.81s
aggregation took 0.791s
aggregation took 0.729s
aggregation took 1.272s
aggregation took 1.117s
aggregation took 0.866s
aggregation took 0.724s

Categorical Data

There are two ways to plot categorical data right now:

datashader(one_category_per_point, points)
datashader(Dict(:category_a => all_points_a, :category_b => all_points_b))

The type of the category doesn't matter, but will get converted to strings internally, to be displayed nicely in the legend. Categories are currently aggregated in one Canvas per category, and then overlayed with alpha blending.

normaldist = randn(Point2f, 1_000_000)
ds1 = normaldist .+ (Point2f(-1, 0),)
ds2 = normaldist .+ (Point2f(1, 0),)
fig, ax, pl = datashader(Dict("a" => ds1, "b" => ds2))
hidedecorations!(ax)
fig

We can also re-use the previous NYC example for a categorical plot:

@time begin
    f = Figure(figure_padding=0, size=(1200, 600))
    ax = Axis(
        f[1, 1],
        autolimitaspect=1,
        limits=(-74.022, -73.827, 40.696, 40.793),
        backgroundcolor=:black
    )
    datashader!(ax, groups, points)
    hidedecorations!(ax)
    hidespines!(ax)
    # Create a styled legend
    axislegend("Vendor ID"; titlecolor=:white, framecolor=:grey, polystrokewidth=2, polystrokecolor=(:white, 0.5), rowgap=10, backgroundcolor=:black, labelcolor=:white)
    display(f)
end

Advanced API

The datashader recipe makes it very simple to get started, and is also efficiently implemented so that most changes like f, ax, pl = datashader(...); pl.colorrange=new_range; pl.operation=log10 won't redo the aggregation. But if you still need more manual control, one can also use the underlying Canvas API directly for more manual control:

Canvas(bounds::Rect2; resolution::Tuple{Int,Int}=(800, 800), op=AggCount())
Canvas(xmin::Number, xmax::Number, ymin::Number, ymax::Number; args...)

Example

using Makie
canvas = Canvas(-1, 1, -1, 1; op=AggCount(), resolution=(800, 800))
aggregate!(canvas, points; point_transform=reverse, method=AggThreads())
aggregated_values = get_aggregation(canvas; operation=equalize_histogram, local_operation=identiy)
# Recipes are defined for canvas as well and incorperate the `get_aggregation`, but `aggregate!` must be called manually.
image!(canvas; operation=equalize_histogram, local_operation=identiy, colormap=:viridis, colorrange=(0, 20))
surface!(canvas; operation=equalize_histogram, local_operation=identiy)
aggregate!(c::Canvas, points; point_transform=identity, method::AggMethod=AggSerial())

Aggregate points into a canvas. The points are transformed by point_transform before aggregation. Method can be AggSerial() or AggThreads().