Sparse Array Utilities
Sparse Constructors
In addition to the `Tensor` constructor, Finch provides a number of convenience constructors for common tensor types. For example, the `spzeros` and `sprand` functions have `fspzeros` and `fsprand` counterparts that return Finch tensors. We can also construct a sparse COO `Tensor` from a list of indices and values using the `fsparse` function.
Finch.fsparse — Function

fsparse(I::Tuple, V, [M::Tuple, combine]; fill_value=zero(eltype(V)))

Create a sparse COO tensor `S` such that `size(S) == M` and `S[(i[q] for i = I)...] = V[q]`. The `combine` function is used to combine duplicates. If `M` is not specified, it is set to `map(maximum, I)`. If `combine` is not supplied, it defaults to `+` unless the elements of `V` are Booleans, in which case it defaults to `|`. All elements of `I` must satisfy `1 <= I[n][q] <= M[n]`. Numerical zeros are retained as structural nonzeros; to drop numerical zeros, use `dropzeros!`.

See also: `sparse`
Examples
julia> I = ( [1, 2, 3], [1, 2, 3], [1, 2, 3]);
julia> V = [1.0; 2.0; 3.0];
julia> fsparse(I, V)
SparseCOO (0.0) [1:3×1:3×1:3]
├─├─├─[1, 1, 1]: 1.0
├─├─├─[2, 2, 2]: 2.0
├─├─├─[3, 3, 3]: 3.0
Finch.fsparse! — Function

fsparse!(I..., V, [M::Tuple])

Like `fsparse`, but the coordinates must be sorted and unique, and memory is reused.
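As a quick sketch of the signature above (hypothetical usage, not output-verified; assumes the Finch.jl package is installed):

```julia
using Finch  # assumes the Finch.jl package is available

# Coordinates are passed as separate vectors and must already be
# sorted and unique, since fsparse! skips the sort/combine step.
I1 = [1, 2, 3]
I2 = [1, 2, 3]
V  = [1.0, 2.0, 3.0]

# Build a 3×3 COO tensor; the input vectors may be reused as storage,
# so they should not be mutated afterwards.
S = fsparse!(I1, I2, V, (3, 3))
```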
Finch.fsprand — Function

fsprand([rng], [type], M..., p, [rfn])

Create a random sparse tensor of size `M` in COO format. There are two cases:

- If `p` is floating point, the probability of any element being nonzero is independently given by `p` (and hence the expected density of nonzeros is also `p`).
- If `p` is an integer, exactly `p` nonzeros are distributed uniformly at random throughout the tensor (and hence the density of nonzeros is exactly `p / prod(M)`).

Nonzero values are sampled from the distribution specified by `rfn` and have the type `type`. The uniform distribution is used if `rfn` is not specified. The optional `rng` argument specifies a random number generator.

See also: [`sprand`](https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.sprand)
Examples
julia> fsprand(Bool, 3, 3, 0.5)
SparseCOO (false) [1:3,1:3]
├─├─[1, 1]: true
├─├─[3, 1]: true
├─├─[2, 2]: true
├─├─[3, 2]: true
├─├─[3, 3]: true
julia> fsprand(Float64, 2, 2, 2, 0.5)
SparseCOO (0.0) [1:2,1:2,1:2]
├─├─├─[2, 2, 1]: 0.6478553157718558
├─├─├─[1, 1, 2]: 0.996665291437684
├─├─├─[2, 1, 2]: 0.7491940599574348
Finch.fspzeros — Function

fspzeros([type], M...)

Create a zero tensor of size `M`, with elements of type `type`. The tensor is in COO format.

See also: [`spzeros`](https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.spzeros)
Examples
julia> A = fspzeros(Bool, 3, 3)
3×3 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{false, Bool, Int64, Vector{Bool}}}}:
0 0 0
0 0 0
0 0 0
julia> countstored(A)
0
julia> B = fspzeros(Float64, 2, 2, 2)
2×2×2 Tensor{SparseCOOLevel{3, Tuple{Int64, Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
[:, :, 1] =
0.0 0.0
0.0 0.0
[:, :, 2] =
0.0 0.0
0.0 0.0
julia> countstored(B)
0
Finch.ffindnz — Function

ffindnz(arr)

Return the nonzero elements of `arr`, as Finch understands `arr`. Returns `(I..., V)`, where `I` are the coordinate vectors, one for each mode of `arr`, and `V` is a vector of corresponding nonzero values, which can be passed to `fsparse`.

See also: [`findnz`](https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.findnz)
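For illustration, a hedged sketch of a round trip through `ffindnz` and `fsparse` (assumes the Finch.jl package is installed; outputs not shown since display may vary by version):

```julia
using Finch  # assumes the Finch.jl package is available

A = fsparse(([1, 2], [2, 3]), [1.0, 2.0])

# One coordinate vector per mode of A, then the value vector.
I1, I2, V = ffindnz(A)

# The pieces can be fed straight back into fsparse to rebuild the tensor.
B = fsparse((I1, I2), V)
```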
Fill Values
Finch tensors support an arbitrary "background" value for sparse arrays. While most arrays use `0` as the background value, this is not always the case. For example, a sparse array of `Int` might use `typemin(Int)` as the background value. The `fill_value` function returns the background value of a tensor. If you ever want to change the background value of an existing array, you can use the `set_fill_value!` function. The `countstored` function returns the number of stored elements in a tensor, and calling `pattern!` on a tensor returns a tensor that is true wherever the original tensor stores a value. Note that `countstored` doesn't always return the number of nonzero elements, as stored elements may include the background value. You can call `dropfills!` to remove explicitly stored background values from a tensor.
julia> A = fsparse([1, 1, 2, 3], [2, 4, 5, 6], [1.0, 2.0, 3.0])
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
0.0 1.0 0.0 2.0 0.0 0.0
0.0 0.0 0.0 0.0 3.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0
julia> min.(A, -1)
3×6 Tensor{DenseLevel{Int64, DenseLevel{Int64, ElementLevel{-1.0, Float64, Int64, Vector{Float64}}}}}:
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
-1.0 -1.0 -1.0 -1.0 -1.0 -1.0
julia> fill_value(A)
0.0
julia> B = set_fill_value!(A, -Inf)
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{-Inf, Float64, Int64, Vector{Float64}}}}:
-Inf 1.0 -Inf 2.0 -Inf -Inf
-Inf -Inf -Inf -Inf 3.0 -Inf
-Inf -Inf -Inf -Inf -Inf -Inf
julia> min.(B, -1)
3×6 Tensor{SparseDictLevel{Int64, Vector{Int64}, Vector{Int64}, Vector{Int64}, Dict{Tuple{Int64, Int64}, Int64}, Vector{Int64}, SparseDictLevel{Int64, Vector{Int64}, Vector{Int64}, Vector{Int64}, Dict{Tuple{Int64, Int64}, Int64}, Vector{Int64}, ElementLevel{-Inf, Float64, Int64, Vector{Float64}}}}}:
-Inf -1.0 -Inf -1.0 -Inf -Inf
-Inf -Inf -Inf -Inf -1.0 -Inf
-Inf -Inf -Inf -Inf -Inf -Inf
julia> countstored(A)
3
julia> pattern!(A)
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, PatternLevel{Int64}}}:
0 1 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 0
Finch.fill_value — Function

fill_value(arr)

Return the initializer for `arr`. For SparseArrays, this is `0`. Often, the "fill" value becomes the "background" value of a tensor.
Finch.set_fill_value! — Function

set_fill_value!(fbr, init)

Return a tensor which is equal to `fbr`, but with the fill (implicit) value set to `init`. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
2.0
0.0
3.0
0.0
4.0
0.0
5.0
0.0
6.0
0.0
julia> set_fill_value!(A, Inf)
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{Inf, Float64, Int64, Vector{Float64}}}}:
2.0
Inf
3.0
Inf
4.0
Inf
5.0
Inf
6.0
Inf
Finch.pattern! — Function

pattern!(fbr)

Return the pattern of `fbr`. That is, return a tensor which is true wherever `fbr` is structurally unequal to its `fill_value`. May reuse memory and render the original tensor unusable when modified.
julia> A = Tensor(SparseList(Element(0.0), 10), [2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0, 0.0, 6.0, 0.0])
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}:
2.0
0.0
3.0
0.0
4.0
0.0
5.0
0.0
6.0
0.0
julia> pattern!(A)
10 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, PatternLevel{Int64}}}:
1
0
1
0
1
0
1
0
1
0
Finch.countstored — Function

countstored(arr)

Return the number of stored elements in `arr`. If there are explicitly stored fill elements, they are counted too.

See also: [`SparseArrays.nnz`](https://docs.julialang.org/en/v1/stdlib/SparseArrays/#SparseArrays.nnz) and [`Base.summarysize`](https://docs.julialang.org/en/v1/base/base/#Base.summarysize)
Finch.dropfills — Function

dropfills(src)

Drop the fill values from `src` and return a new tensor with the same shape and format.
Finch.dropfills! — Function

dropfills!(dst, src)

Copy only the non-fill values from `src` into `dst`.
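A hedged sketch contrasting the two (assumes the Finch.jl package is installed; the specific inputs here are illustrative, and outputs are omitted since display may vary by version):

```julia
using Finch  # assumes the Finch.jl package is available

# fsparse retains the numerical zero at (2, 2) as a stored entry.
A = fsparse(([1, 2], [1, 2]), [1.0, 0.0])

# dropfills allocates a fresh tensor without the stored fill values.
B = dropfills(A)

# dropfills! instead copies the non-fill values into a destination
# tensor of whatever format you choose.
C = dropfills!(Tensor(CSCFormat()), A)
```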
How to tell whether an entry is "fill"
In the sparse world, a semantic distinction is sometimes made between "explicitly stored" values and "implicit" or "fill" values (usually zero). However, the formats in the Finch compiler represent a diverse set of structures beyond sparsity, and it is often unclear whether any of the values in the tensor are "explicit" (consider a mask matrix, which can be represented with a constant number of bits). Thus, Finch makes no semantic distinction between values which are stored explicitly or not. If users wish to make this distinction, they should instead store a tensor of tuples of the form `(value, is_fill)`. For example,
julia> A = fsparse(
[1, 1, 2, 3],
[2, 4, 5, 6],
[(1.0, false), (0.0, true), (3.0, false)];
fill_value=(0.0, true),
)
3×6 Tensor{SparseCOOLevel{2, Tuple{Int64, Int64}, Vector{Int64}, Tuple{Vector{Int64}, Vector{Int64}}, ElementLevel{(0.0, true), Tuple{Float64, Bool}, Int64, Vector{Tuple{Float64, Bool}}}}}:
(0.0, 1) (1.0, 0) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (3.0, 0) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
julia> B = Tensor(Dense(SparseList(Element((0.0, true)))), A)
3×6 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{(0.0, true), Tuple{Float64, Bool}, Int64, Vector{Tuple{Float64, Bool}}}}}}:
(0.0, 1) (1.0, 0) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (3.0, 0) (0.0, 1)
(0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1) (0.0, 1)
julia> sum(map(last, B))
16
julia> sum(map(first, B))
4.0
Format Conversion and Storage Order
Converting Between Formats
You can convert between tensor formats with the `Tensor` constructor. Simply construct a new `Tensor` in the desired format, passing the existing tensor as an argument, and the data will be copied into the new format.
julia> A = Tensor(CSCFormat(), [0 0 2 1; 0 0 1 0; 1 0 0 0])
3×4 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0
julia> B = Tensor(DCSCFormat(), A)
3×4 Tensor{SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0
Storage Order
By default, tensors in Finch are column-major. However, you can use the `swizzle` function to transpose them lazily. To convert to a transposed format, use the `dropfills!` function. Note that the `permutedims` function transposes eagerly.
Finch.swizzle — Function

swizzle(tns, dims)

Create a `SwizzleArray` to transpose any tensor `tns` such that

swizzle(tns, dims)[i...] == tns[i[dims]]
julia> A = Tensor(CSCFormat(), [0 0 2 1; 0 0 1 0; 1 0 0 0])
3×4 Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0
julia> tensor_tree(swizzle(A, 2, 1))
SwizzleArray (2, 1)
└─ 3×4-Tensor
└─ Dense [:,1:4]
├─ [:, 1]: SparseList (0.0) [1:3]
│ └─ [3]: 1.0
├─ [:, 2]: SparseList (0.0) [1:3]
├─ [:, 3]: SparseList (0.0) [1:3]
│ ├─ [1]: 2.0
│ └─ [2]: 1.0
└─ [:, 4]: SparseList (0.0) [1:3]
└─ [1]: 1.0
julia> tensor_tree(permutedims(A, (2, 1)))
4×3-Tensor
└─ SparseDict (0.0) [:,1:3]
├─ [:, 1]: SparseDict (0.0) [1:4]
│ ├─ [3]: 2.0
│ └─ [4]: 1.0
├─ [:, 2]: SparseDict (0.0) [1:4]
│ └─ [3]: 1.0
└─ [:, 3]: SparseDict (0.0) [1:4]
└─ [1]: 1.0
julia> dropfills!(swizzle(Tensor(CSCFormat()), 2, 1), A)
3×4 Finch.SwizzleArray{(2, 1), Tensor{DenseLevel{Int64, SparseListLevel{Int64, Vector{Int64}, Vector{Int64}, ElementLevel{0.0, Float64, Int64, Vector{Float64}}}}}}:
0.0 0.0 2.0 1.0
0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0