Sparse Array Utilities
Sparse Constructors
In addition to the Tensor constructor, Finch provides a number of convenience constructors for common tensor types. For example, the spzeros and sprand functions have fspzeros and fsprand counterparts that return Finch tensors. We can also construct a sparse COO Tensor from a list of indices and values using the fsparse function.
Finch.fsparse — Function
fsparse(I..., V, [M::Tuple, combine]; fill_value=zero(eltype(V)))
Create a sparse COO tensor S such that size(S) == M and S[(i[q] for i in I)...] = V[q]. The combine function is used to combine duplicates. If M is not specified, it is set to map(maximum, I). If the combine function is not supplied, combine defaults to + unless the elements of V are Booleans, in which case combine defaults to |. All elements of I must satisfy 1 <= I[n][q] <= M[n]. Numerical zeros are retained as structural nonzeros; to drop numerical zeros, use dropfills.
See also: sparse
Examples
julia> I = ([1, 2, 3], [1, 2, 3], [1, 2, 3]);
julia> V = [1.0, 2.0, 3.0];
julia> fsparse(I..., V)
3×3×3 Tensor{SparseCOOLevel{3, Tuple{Int64, Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
[:, :, 1] =
1.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
[:, :, 2] =
0.0 0.0 0.0
0.0 2.0 0.0
0.0 0.0 0.0
[:, :, 3] =
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 3.0

Finch.fsparse! — Function
fsparse!(I..., V, [M::Tuple]; fill_value=zero(eltype(V)))
Like fsparse, but the coordinates must be sorted and unique, and memory is reused.
Finch.fsprand — Function
fsprand([rng], [type], M..., p, [rfn])
Create a random sparse tensor of size M in COO format. There are two cases:
- If p is a floating point number, the probability of any element being nonzero is independently given by p (and hence the expected density of nonzeros is also p).
- If p is an integer, exactly p nonzeros are distributed uniformly at random throughout the tensor (and hence the density of nonzeros is exactly p / prod(M)).
Nonzero values are sampled from the distribution specified by rfn and have the type type. The uniform distribution is used if rfn is not specified. The optional rng argument specifies a random number generator.
See also: sprand
Examples
julia> fsprand(Bool, 3, 3, 0.5)
3×3 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{false, Bool, Int64, Vector{Bool}}}}:
0 0 0
1 0 1
0 0 1
julia> fsprand(Float64, 2, 2, 2, 0.5)
2×2×2 Tensor{SparseCOOLevel{3, Tuple{Int64, Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
[:, :, 1] =
0.0 0.598969
0.963969 0.0
[:, :, 2] =
0.337409 0.0
0.0 0.0

Finch.fspzeros — Function
fspzeros([type], M...)
Create a zero tensor of size M, with elements of type type. The tensor is in COO format.
See also: spzeros
Examples
julia> A = fspzeros(Bool, 3, 3)
3×3 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{false, Bool, Int64, Vector{Bool}}}}:
0 0 0
0 0 0
0 0 0
julia> countstored(A)
0
julia> B = fspzeros(Float64, 2, 2, 2)
2×2×2 Tensor{SparseCOOLevel{3, Tuple{Int64, Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
[:, :, 1] =
0.0 0.0
0.0 0.0
[:, :, 2] =
0.0 0.0
0.0 0.0
julia> countstored(B)
0
Finch.ffindnz — Function
ffindnz(arr)
Return the nonzero elements of arr, as Finch understands arr. Returns (I..., V), where I are the coordinate vectors, one for each mode of arr, and V is a vector of the corresponding nonzero values, which can be passed to fsparse.
See also: findnz
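Since the docstring above has no example, here is a minimal sketch of the documented round trip (assuming Finch is loaded): the coordinate vectors and values returned by ffindnz can be passed straight back to fsparse.

```julia
using Finch

# Build a small 2-D sparse COO tensor from coordinates and values.
A = fsparse([1, 2, 3], [2, 3, 1], [1.1, 2.2, 3.3])

# For a 2-D tensor, ffindnz returns one coordinate vector per mode,
# followed by the vector of stored values: (I1, I2, V).
I1, I2, V = ffindnz(A)

# Round trip: rebuilding from the returned coordinates reproduces A.
B = fsparse(I1, I2, V)
```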
Fill Values
Finch tensors support an arbitrary "background" value for sparse arrays. While most arrays use 0 as the background value, this is not always the case. For example, a sparse array of Int might use typemin(Int) as the background value. The fill_value function returns the background value of a tensor. If you ever want to change the background value of an existing array, you can use the set_fill_value! function. The countstored function returns the number of stored elements in a tensor, and calling pattern! on a tensor returns a tensor that is true wherever the original tensor stores a value. Note that countstored doesn't always return the number of nonzero elements in a tensor: it counts stored elements, and stored elements may include the background value. You can call dropfills! to remove explicitly stored background values from a tensor.
julia> A = fsparse([1, 1, 2], [2, 4, 5], [1.0, 2.0, 3.0], (3, 6))
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
0.0 1.0 0.0 2.0 0.0 0.0
0.0 0.0 0.0 0.0 3.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
julia> min.(A, -1)
3×6 Tensor{DenseLevel{Int64, DenseLevel{Int64, ElementLevel{-1.0, Float64, Int64, Vector{Float64}}}}}:
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
julia> fill_value(A)
0.0
julia> B = set_fill_value!(A, -Inf)
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{-Inf, Float64, Int64, Vector{Float64}}}}:
-Inf 1.0 -Inf 2.0 -Inf -Inf
-Inf -Inf -Inf -Inf 3.0 -Inf
-Inf -Inf -Inf -Inf -Inf -Inf
julia> min.(B, -1)
3×6 Tensor{SparseDictLevel{Int64, Vector{Int64}, Vector{Int64}, Vector{Int64}, Dict{Tuple{Int64, Int64}, Int64}, Vector{Int64}, SparseDictLevel{Int64, Vector{Int64}, Vector{Int64}, Vector{Int64}, Dict{Tuple{Int64, Int64}, Int64}, Vector{Int64}, ElementLevel{-Inf, Float64, Int64, Vector{Float64}}}}}:
-Inf -1.0 -Inf -1.0 -Inf -Inf
-Inf -Inf -Inf -Inf -1.0 -Inf
-Inf -Inf -Inf -Inf -Inf -Inf
julia> countstored(A)
3
julia> pattern!(A)
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, PatternLevel{Int64}}}:
0 1 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 0

Finch.fill_value — Function
fill_value(arr)
Return the initializer for arr. For SparseArrays, this is 0. Often, the "fill" value becomes the "background" value of a tensor.
Finch.set_fill_value! — Function
set_fill_value!(fbr, init)
Return a tensor which is equal to fbr, but with the fill (implicit) value set to init. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
2.0
0.0
3.0
0.0
4.0
0.0
5.0
0.0
6.0
0.0
julia> set_fill_value!(A, Inf)
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{Inf, Float64, Int64, Vector{Float64}}}}:
2.0
Inf
3.0
Inf
4.0
Inf
5.0
Inf
6.0
Inf

Finch.pattern! — Function
pattern!(fbr)
Return the pattern of fbr. That is, return a tensor which is true wherever fbr is structurally unequal to its fill value. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
2.0
0.0
3.0
0.0
4.0
0.0
5.0
0.0
6.0
0.0
julia> pattern!(A)
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, PatternLevel{Int64}}}:
1
0
1
0
1
0
1
0
1
0

Finch.countstored — Function
countstored(arr)
Return the number of stored elements in arr. If there are explicitly stored fill elements, they are counted too.
See also: SparseArrays.nnz and Base.summarysize
Finch.dropfills — Function
dropfills(src)
Drop the fill values from src and return a new tensor with the same shape and format.
Finch.dropfills! — Function
dropfills!(dst, src)
Copy only the non-fill values from src into dst.
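Neither docstring above has an example, so here is a minimal sketch (assuming Finch is loaded). Because fsparse retains numerical zeros as structural nonzeros, dropfills is the way to prune them afterward.

```julia
using Finch

# fsparse keeps the numerical zero below as a stored entry, so it
# counts toward countstored even though it equals the fill value.
A = fsparse([1, 2], [1, 2], [1.0, 0.0])

# dropfills returns a new tensor of the same shape and format with
# the explicitly stored fill values removed.
B = dropfills(A)

countstored(A)  # includes the explicitly stored 0.0
countstored(B)  # only the true nonzero remains
```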
How to tell whether an entry is "fill"
In the sparse world, a semantic distinction is sometimes made between "explicitly stored" values and "implicit" or "fill" values (usually zero). However, the formats in the Finch compiler represent a diverse set of structures beyond sparsity, and it is often unclear whether any of the values in the tensor are "explicit" (consider a mask matrix, which can be represented with a constant number of bits). Thus, Finch makes no semantic distinction between values which are stored explicitly or not. If users wish to make this distinction, they should instead store a tensor of tuples of the form (value, is_fill). For example,
julia> A = fsparse(
           [1, 1, 2],
           [2, 4, 5],
           [(1.0, false), (0.0, true), (3.0, false)],
           (3, 6);
           fill_value=(0.0, true),
       )
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{(0.0, true), Tuple{Float64, Bool}, Int64, Vector{Tuple{Float64, Bool}}}}}:
(0.0, 1) (1.0, 0) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (3.0, 0) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
julia> B = Tensor(Dense(SparseList(Element((0.0, true)))), A)
3×6 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{(0.0, true), Tuple{Float64, Bool}, Int64, Vector{Tuple{Float64, Bool}}}}}}:
(0.0, 1) (1.0, 0) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (3.0, 0) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
julia> sum(map(last, B))
16
julia> sum(map(first, B))
4.0

Format Conversion and Storage Order
Converting Between Formats
You can convert between tensor formats with the Tensor constructor. Simply construct a new Tensor in the desired format and pass the old tensor as an argument; the data is copied into the new format.
julia> A = Tensor(CSCFormat(), [0 0 2 1; 0 0 1 0; 1 0 0 0])
3×4 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0
julia> B = Tensor(DCSCFormat(), A)
3×4 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0

Storage Order
By default, tensors in Finch are stored column-major. However, you can use the swizzle function to transpose them lazily. To materialize a tensor in a transposed format, copy it into a swizzled output with the dropfills! function. Note that the permutedims function transposes eagerly.
Finch.swizzle — Function
swizzle(tns, dims...)
Create a SwizzleArray to transpose any tensor tns such that
swizzle(tns, dims...)[i...] == tns[(i[d] for d in dims)...]

julia> A = Tensor(CSCFormat(), [0 0 2 1; 0 0 1 0; 1 0 0 0])
3×4 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0
julia> tensor_tree(swizzle(A, 2, 1))
SwizzleArray (2, 1)
└─ 3×4-Tensor
└─ Dense [:,1:4]
├─ [:, 1]: SparseList (0.0) [1:3]
│ └─ [3]: 1.0
├─ [:, 2]: SparseList (0.0) [1:3]
├─ [:, 3]: SparseList (0.0) [1:3]
│ ├─ [1]: 2.0
│ └─ [2]: 1.0
└─ [:, 4]: SparseList (0.0) [1:3]
└─ [1]: 1.0
julia> tensor_tree(permutedims(A, (2, 1)))
4×3-Tensor
└─ SparseDict (0.0) [:,1:3]
├─ [:, 1]: SparseDict (0.0) [1:4]
│ ├─ [3]: 2.0
│ └─ [4]: 1.0
├─ [:, 2]: SparseDict (0.0) [1:4]
│ └─ [3]: 1.0
└─ [:, 3]: SparseDict (0.0) [1:4]
└─ [1]: 1.0
julia> dropfills!(swizzle(Tensor(CSCFormat()), 2, 1), A)
3×4 Finch.SwizzleArray{(2, 1), Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0