WeightedData

WeightedData.jl provides weighted numeric containers and likelihood utilities for uncertainty-aware estimation and model fitting.

Installation

using Pkg
Pkg.add("WeightedData")

Why WeightedData?

Keep values and their precisions together in a single, type-stable container.
Compute weighted statistics while handling missing or invalid measurements.
Evaluate likelihoods directly from weighted observations and model predictions.

Core concepts

WeightedValue(value, precision) stores a scalar observation and its precision.
WeightedArray(values, precisions) stores array-valued observations and per-entry precisions.
Precision is interpreted as inverse variance, $w = 1 / \sigma^2$.
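
The inverse-variance convention can be illustrated without the package. The following is a minimal sketch in plain Julia, not WeightedData.jl's implementation; the helper names precision_to_std and weighted_mean are hypothetical and exist only for this illustration.

```julia
# Illustration only: plain Julia, independent of WeightedData.jl.
# `precision_to_std` and `weighted_mean` are hypothetical helper names.

# Standard deviation implied by a precision w = 1 / σ^2.
# A precision of zero means "no information" and maps to σ = Inf.
precision_to_std(w::Real) = 1 / sqrt(w)

# Inverse-variance weighted mean of scalar observations.
# Assumes all values are finite (a NaN value would propagate even with zero precision).
# Returns the combined value and the combined precision (sum of the precisions).
function weighted_mean(values::AbstractVector{<:Real}, precisions::AbstractVector{<:Real})
    wsum = sum(precisions)
    wsum == 0 && return (value = 0.0, precision = 0.0)  # nothing to average
    value = sum(precisions .* values) / wsum
    return (value = value, precision = wsum)
end

σ = precision_to_std(5.0)                  # standard deviation for precision 5
m = weighted_mean([1.0, 0.1], [0.0, 0.1])  # the zero-precision entry has no influence
```

The second call uses the same numbers as the first column of the weighted-mean example under Common workflows: the entry with zero precision drops out, leaving a value of 0.1 with precision 0.1.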

Common workflows

julia> using WeightedData

julia> using WeightedData: filterbaddata!, get_value, get_precision, get_weights

julia> # Define arrays of values and precisions
       values = [ 1.0 missing π
                  0.1 10 NaN]
2×3 Matrix{Union{Missing, Float64}}:
 1.0    missing    3.14159
 0.1  10.0       NaN

julia> precision = [0 missing 5
                    0.1 10 3.]
2×3 Matrix{Union{Missing, Float64}}:
 0.0    missing  5.0
 0.1  10.0       3.0

julia> # Create a WeightedArray
       data = WeightedArray(values, precision)
2×3 WeightedArray{Float64, 2} (alias of ZippedArrays.ZippedMatrix{WeightedValue{Float64}, 2, true, Tuple{Matrix{Float64}, Matrix{Float64}}})::
 1.0 ± Inf    0.0 ± Inf  3.14159 ± 0.45
 0.1 ± 3.2  10.0 ± 0.32       0.0 ± Inf

julia> # Missing and non-numeric values are ignored by setting their precision to zero
       get_precision(data)
2×3 Matrix{Float64}:
 0.0   0.0  5.0
 0.1  10.0  0.0

julia> # Elements are `WeightedValue`s, displayed with their standard deviation
       data[2]
WeightedValue{Float64}
0.1 ± 3.2

julia> # Weighted mean along a given dimension
       mean(data, dims=1)
1×3 WeightedArray{Float64, 2} (alias of ZippedArrays.ZippedMatrix{WeightedValue{Float64}, 2, true, Tuple{Matrix{Float64}, Matrix{Float64}}})::
 0.1 ± 3.2  10.0 ± 0.32  3.14159 ± 0.45

julia> model = [1.0 2.0 3.
                3   2.  1. ]
2×3 Matrix{Float64}:
 1.0  2.0  3.0
 3.0  2.0  1.0

julia> # Compute the negative log-likelihood for a given model
       l = loglikelihood(data, model)  # Default Gaussian negative log-likelihood
320.47062119887653

julia> # Compute the derivative with automatic differentiation
       f(x) = loglikelihood(data, x)
f (generic function with 1 method)

julia> using Zygote

julia> lkl, grad = Zygote.withgradient(f, model)
(val = 320.47062119887653, grad = ([0.0 0.0 -0.7079632679489656; 0.29 -80.0 0.0],))

julia> # Use a robust loss from the RobustModels package
       using RobustModels

julia> l_robust = loglikelihood(data, model, loss=HuberLoss())
18.56923830366538

julia> # Robust weights computed by the model (outlier weights are lower than weights of valid data points)
       get_weights(data, model; loss=HuberLoss())
2×3 Matrix{Float64}:
 1.0  1.0        1.0
 1.0  0.0531658  1.0
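
The Gaussian value and gradient printed above are consistent with a negative log-likelihood of the form $\frac{1}{2}\sum_i w_i (x_i - m_i)^2$, with additive constants dropped. The sketch below reproduces those numbers in plain Julia as a cross-check. It is inferred from the printed output rather than taken from the package source, and gaussian_nll, gaussian_nll_grad and huber_weights are hypothetical names. The Huber threshold c = 1.345 is an assumption that happens to agree with the printed robust weights.

```julia
# Illustration only: plain Julia, independent of WeightedData.jl.
# All function names here are hypothetical.

# Values and precisions as displayed after bad entries were neutralised
# (missing/NaN values shown as 0.0 with precision 0.0).
x = [1.0 0.0 Float64(π); 0.1 10.0 0.0]
w = [0.0 0.0 5.0; 0.1 10.0 0.0]
model = [1.0 2.0 3.0; 3.0 2.0 1.0]

# Gaussian negative log-likelihood up to an additive constant.
gaussian_nll(x, w, m) = sum(w .* (x .- m) .^ 2) / 2

# Analytic gradient of the expression above with respect to the model.
gaussian_nll_grad(x, w, m) = w .* (m .- x)

# Huber weights on standardized residuals r = sqrt(w) * (x - m):
# 1 inside the threshold c, c / |r| outside. c = 1.345 is an assumption.
function huber_weights(x, w, m; c = 1.345)
    r = abs.(sqrt.(w) .* (x .- m))
    return [ri <= c ? 1.0 : c / ri for ri in r]
end

nll = gaussian_nll(x, w, model)
grad = gaussian_nll_grad(x, w, model)
hw = huber_weights(x, w, model)
```

Only the entry at row 2, column 2 (observed 10 against a model value of 2, with precision 10) exceeds the threshold, which is why it alone receives a reduced weight.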

Extensions

WeightedData.jl provides optional extensions that are activated automatically when the corresponding package is loaded:

RobustModels → robust loglikelihood that accepts any loss function defined in RobustModels
OnlineSampleStatistics → extract the sample mean and sample variance of a series of observations to build a weighted container (WeightedValue or WeightedArray)
ChainRulesCore → custom rrule methods for loglikelihood
Measurements → conversion between Measurement and WeightedValue
Uncertain → conversion between Uncertain.Value and WeightedValue

See the API Reference for full method docstrings, including those of the extension methods.

GPU support

WeightedData.jl supports GPU-backed weighted arrays through the WeightedDataGPUArraysExt extension, activated automatically when GPUArrays.jl is loaded.

Example (CUDA)

using WeightedData
using CUDA
using RobustModels

values = CUDA.ones(Float32, 1024)
precisions = CUDA.fill(Float32(2), 1024)
data = WeightedArray(values, precisions)
model = CUDA.fill(Float32(0.9), 1024)

# Gaussian (default) negative log-likelihood
ℓ2 = loglikelihood(data, model)

# Robust negative log-likelihood (requires RobustModels extension)
ℓh = loglikelihood(data, model, loss=HuberLoss())

Notes:

Backend choice is delegated to your GPU array package (for example CUDA.jl, AMDGPU.jl, oneAPI.jl).