datashader

datashader(points::AbstractVector{<: Point})

Warning

This feature might change outside breaking releases, since the API is not yet finalized. Please be vary of bugs in the implementation and open issues if you encounter odd behaviour.

Points can be any array type supporting iteration & getindex, including memory mapped arrays. If you have separate arrays for x and y coordinates and want to avoid conversion and copy, consider using:

using Makie.StructArrays
points = StructArray{Point2f}((x, y))
datashader(points)

Do pay attention though, that if x and y don't have a fast iteration/getindex implemented, this might be slower then just copying it into a new array.

For best performance, use method=Makie.AggThreads() and make sure to start julia with julia -tauto or have the environment variable JULIA_NUM_THREADS set to the number of cores you have.

Attributes

Specific to DataShader

  • agg = AggCount() can be AggCount(), AggAny() or AggMean(). User extendable by overloading:

```Julia
    struct MyAgg{T} <: Makie.AggOp end
    MyAgg() = MyAgg{Float64}()
    Makie.Aggregation.null(::MyAgg{T}) where {T} = zero(T)
    Makie.Aggregation.embed(::MyAgg{T}, x) where {T} = convert(T, x)
    Makie.Aggregation.merge(::MyAgg{T}, x::T, y::T) where {T} = x + y
    Makie.Aggregation.value(::MyAgg{T}, x::T) where {T} = x
```
  • method = AggThreads() can be AggThreads() or AggSerial().

  • async::Bool = true will calculate get_aggregation in a task, and skip any zoom/pan updates while busy. Great for interaction, but must be disabled for saving to e.g. png or when inlining in documenter.

  • operation::Function = automatic Defaults to Makie.equalize_histogram function which gets called on the whole getaggregation array before display (`operation(finalaggregation_result)`).

  • local_operation::Function = identity function which gets call on each element after the aggregation (map!(x-> local_operation(x), final_aggregation_result)).

  • point_transform::Function = identity function which gets applied to every point before aggregating it.

  • binsize::Number = 1 factor defining how many bins one wants per screen pixel. Set to n > 1 if you want a corser image.

  • show_timings::Bool = false show how long it takes to aggregate each frame.

  • interpolate::Bool = true If the resulting image should be displayed interpolated.

Color attributes

  • colormap::Union{Symbol, Vector{<:Colorant}} = :viridis sets the colormap that is sampled for numeric colors. PlotUtils.cgrad(...), Makie.Reverse(any_colormap) can be used as well, or any symbol from ColorBrewer or PlotUtils. To see all available color gradients, you can call Makie.available_gradients().

  • colorscale::Function = identity color transform function. Can be any function, but only works well together with Colorbar for identity, log, log2, log10, sqrt, logit, Makie.pseudolog10 and Makie.Symlog10.

  • colorrange::Tuple{<:Real, <:Real} sets the values representing the start and end points of colormap.

  • nan_color::Union{Symbol, <:Colorant} = RGBAf(0,0,0,0) sets a replacement color for color = NaN.

  • lowclip::Union{Nothing, Symbol, <:Colorant} = nothing sets a color for any value below the colorrange.

  • highclip::Union{Nothing, Symbol, <:Colorant} = nothing sets a color for any value above the colorrange.

  • alpha = 1.0 sets the alpha value of the colormap or color attribute. Multiple alphas like in plot(alpha=0.2, color=(:red, 0.5), will get multiplied.

Generic attributes

  • visible::Bool = true sets whether the plot will be rendered or not.

  • overdraw::Bool = false sets whether the plot will draw over other plots. This specifically means ignoring depth checks in GL backends.

  • transparency::Bool = false adjusts how the plot deals with transparency. In GLMakie transparency = true results in using Order Independent Transparency.

  • fxaa::Bool = true adjusts whether the plot is rendered with fxaa (anti-aliasing).

  • inspectable::Bool = true sets whether this plot should be seen by DataInspector.

  • depth_shift::Float32 = 0f0 adjusts the depth value of a plot after all other transformations, i.e. in clip space, where 0 <= depth <= 1. This only applies to GLMakie and WGLMakie and can be used to adjust render order (like a tunable overdraw).

  • model::Makie.Mat4f sets a model matrix for the plot. This replaces adjustments made with translate!, rotate! and scale!.

  • space::Symbol = :data sets the transformation space for box encompassing the volume plot. See Makie.spaces() for possible inputs.

Examples

Airports

using DelimitedFiles, GLMakie
# For saving/showing/inlining into documentation we need to disable async calculation.
Makie.set_theme!(DataShader = (; async=false))
airports = Point2f.(eachrow(readdlm(assetpath("airportlocations.csv"))))
fig, ax, ds = datashader(airports,
    colormap=[:white, :black],
    # use type=Axis, so that Makie doesn't need to infer
    # the axis type, which can be expensive for a large amount of points
    axis = (; type=Axis),
    # for documentation output we shouldn't calculate the image async,
    # since it won't wait for the render to finish and inline a blank image
    async = false,
    figure = (; figurepadding=0, size=(360*2, 160*2))
)
Colorbar(fig[1, 2], ds, label="Number of airports")
hidedecorations!(ax); hidespines!(ax)
fig

Strange Attractors

# Taken from Lazaro Alonso in Beautiful Makie:
# https://beautiful.makie.org/dev/examples/generated/2d/datavis/strange_attractors/?h=cliffo#trajectory
Clifford((x, y), a, b, c, d) = Point2f(sin(a * y) + c * cos(a * x), sin(b * x) + d * cos(b * y))

function trajectory(fn, x0, y0, kargs...; n=1000) #  kargs = a, b, c, d
    xy = zeros(Point2f, n + 1)
    xy[1] = Point2f(x0, y0)
    @inbounds for i in 1:n
        xy[i+1] = fn(xy[i], kargs...)
    end
    return xy
end

cargs = [[0, 0, -1.3, -1.3, -1.8, -1.9],
    [0, 0, -1.4, 1.6, 1.0, 0.7],
    [0, 0, 1.7, 1.7, 0.6, 1.2],
    [0, 0, 1.7, 0.7, 1.4, 2.0],
    [0, 0, -1.7, 1.8, -1.9, -0.4],
    [0, 0, 1.1, -1.32, -1.03, 1.54],
    [0, 0, 0.77, 1.99, -1.31, -1.45],
    [0, 0, -1.9, -1.9, -1.9, -1.0],
    [0, 0, 0.75, 1.34, -1.93, 1.0],
    [0, 0, -1.32, -1.65, 0.74, 1.81],
    [0, 0, -1.6, 1.6, 0.7, -1.0],
    [0, 0, -1.7, 1.5, -0.5, 0.7]
]

fig = Figure(size=(1000, 1000))
fig_grid = CartesianIndices((3, 4))
cmap = to_colormap(:BuPu_9)
cmap[1] = RGBAf(1, 1, 1, 1) # make sure background is white

let
    # locally, one can go pretty high with n_points,
    # e.g. 4*(10^7), but we don't want the docbuild to become too slow.
    n_points = 10^6
    for (i, arg) in enumerate(cargs)
        points = trajectory(Clifford, arg...; n=n_points)
        r, c = Tuple(fig_grid[i])
        ax, plot = datashader(fig[r, c], points;
            colormap=cmap,
            async=false,
            axis=(; type=Axis, title=join(string.(arg), ", ")))
        hidedecorations!(ax)
        hidespines!(ax)
    end
end
rowgap!(fig.layout,5)
colgap!(fig.layout,1)
fig

Bigger examples

Timings in the comments are from running this on a 32gb Ryzen 5800H laptop. Both examples aren't fully optimized yet, and just use raw, unsorted, memory mapped Point2f arrays. In the future, we'll want to add acceleration structures to optimize access to sub-regions.

14 million point NYC taxi dataset

using GLMakie, Downloads, Parquet2
bucket = "https://ursa-labs-taxi-data.s3.us-east-2.amazonaws.com"
year = 2009
month = "01"
filename = join([year, month, "data.parquet"], "/")
out = joinpath("$year-$month-data.parquet")
url = bucket * "/" * filename
Downloads.download(url, out)
# Loading ~1.5s
@time begin
    ds = Parquet2.Dataset(out)
    dlat = Parquet2.load(ds, "dropoff_latitude")
    dlon = Parquet2.load(ds, "dropoff_longitude")
    # One could use struct array here, but dlon/dlat are
    # a custom array type from Parquet2 supporting missing and some other things, which slows the whole thing down.
    # points = StructArray{Point2f}((dlon, dlat))
    points = Point2f.(dlon, dlat)
    groups = Parquet2.load(ds, "vendor_id")
end

# ~0.06s
@time begin
    f, ax, dsplot = datashader(points;
        colormap=:fire,
        axis=(; type=Axis, autolimitaspect = 1),
        figure=(;figure_padding=0, size=(1200, 600))
    )
    # make image fill the whole screen
    hidedecorations!(ax)
    hidespines!(ax)
    display(f)
end

2.7 billion OSM GPS points

Download the data from OSM GPS points and use the updated script from drawing-2-7-billion-points-in-10s to convert the CSV to a binary blob that we can memory map.

using GLMakie, Mmap
path = "gpspoints.bin"

points = Mmap.mmap(open(path, "r"), Vector{Point2f});
# ~ 26s
@time begin
    f, ax, pl = datashader(
        points;
        # For a big dataset its interesting to see how long each aggregation takes
        show_timings = true,
        # Use a local operation which is faster to calculate and looks good!
        local_post=x-> log10(x + 1),
        #=
            in the code we used to save the binary, we had the points in the wrong order.
            A good chance to demonstrate the `point_transform` argument,
            Which gets applied to every point before aggregating it
        =#
        point_transform=reverse,
        axis=(; type=Axis, autolimitaspect = 1),
        figure=(;figure_padding=0, size=(1200, 600))
    )
    hidedecorations!(ax)
    hidespines!(ax)
    display(f)
end
aggregation took 1.395s
aggregation took 1.176s
aggregation took 0.81s
aggregation took 0.791s
aggregation took 0.729s
aggregation took 1.272s
aggregation took 1.117s
aggregation took 0.866s
aggregation took 0.724s

Categorical Data

There are two ways to plot categorical data right now:

datashader(one_category_per_point, points)
datashader(Dict(:category_a => all_points_a, :category_b => all_points_b))

The type of the category doesn't matter, but will get converted to strings internally, to be displayed nicely in the legend. Categories are currently aggregated in one Canvas per category, and then overlayed with alpha blending.

normaldist = randn(Point2f, 1_000_000)
ds1 = normaldist .+ (Point2f(-1, 0),)
ds2 = normaldist .+ (Point2f(1, 0),)
fig, ax, pl = datashader(Dict("a" => ds1, "b" => ds2))
hidedecorations!(ax)
fig

We can also re-use the previous NYC example for a categorical plot:

@time begin
    f = Figure(figure_padding=0, size=(1200, 600))
    ax = Axis(
        f[1, 1],
        autolimitaspect=1,
        limits=(-74.022, -73.827, 40.696, 40.793),
        backgroundcolor=:black
    )
    datashader!(ax, groups, points)
    hidedecorations!(ax)
    hidespines!(ax)
    # Create a styled legend
    axislegend("Vendor ID"; titlecolor=:white, framecolor=:grey, polystrokewidth=2, polystrokecolor=(:white, 0.5), rowgap=10, backgroundcolor=:black, labelcolor=:white)
    display(f)
end

Advanced API

The datashader recipe makes it very simple to get started, and is also efficiently implemented so that most changes like f, ax, pl = datashader(...); pl.colorrange=new_range; pl.operation=log10 won't redo the aggregation. But if you still need more manual control, one can also use the underlying Canvas API directly for more manual control:

Canvas(bounds::Rect2; resolution::Tuple{Int,Int}=(800, 800), op=AggCount())
Canvas(xmin::Number, xmax::Number, ymin::Number, ymax::Number; args...)

Example

using Makie
canvas = Canvas(-1, 1, -1, 1; op=AggCount(), resolution=(800, 800))
aggregate!(canvas, points; point_transform=reverse, method=AggThreads())
aggregated_values = get_aggregation(canvas; operation=equalize_histogram, local_operation=identiy)
# Recipes are defined for canvas as well and incorperate the `get_aggregation`, but `aggregate!` must be called manually.
image!(canvas; operation=equalize_histogram, local_operation=identiy, colormap=:viridis, colorrange=(0, 20))
surface!(canvas; operation=equalize_histogram, local_operation=identiy)
aggregate!(c::Canvas, points; point_transform=identity, method::AggMethod=AggSerial())

Aggregate points into a canvas. The points are transformed by point_transform before aggregation. Method can be AggSerial() or AggThreads().