(Experimental) Parallelization in Finch
Formats
Finch levels usually cannot be updated concurrently from multiple threads. Sparse and structured formats typically store their data buffers contiguously across different columns, making parallel updates difficult to implement correctly and efficiently. However, Finch provides a few specialized concurrent level types, and a few levels which can reinterpret other level formats in a concurrent way.
Finch.AtomicElementLevel — Type
AtomicElementLevel{Vf, [Tv=typeof(Vf)], [Tp=Int], [Val]}()
Like an ElementLevel, but updates to the level are performed atomically.
julia> tensor_tree(Tensor(Dense(AtomicElement(0.0)), [1, 2, 3]))
3-Tensor
└─ Dense [1:3]
   ├─ [1]: 1.0
   ├─ [2]: 2.0
   └─ [3]: 3.0
Finch.MutexLevel — Type
MutexLevel{Val, Lvl}()
Protects the level directly below it: each position in the level below the Mutex level is guarded by a lock.
julia> tensor_tree(Tensor(Dense(Mutex(Element(0.0))), [1, 2, 3]))
3-Tensor
└─ Dense [1:3]
   ├─ [1]: Mutex ->
   │  └─ 1.0
   ├─ [2]: Mutex ->
   │  └─ 2.0
   └─ [3]: Mutex ->
      └─ 3.0
Finch.SeparateLevel — Type
SeparateLevel{Lvl, [Val]}()
A subfiber of a Separate level is a separate tensor of type Lvl, in its own memory space.
Each sublevel is stored in a vector of type Val with eltype(Val) = Lvl.
julia> tensor_tree(Tensor(Dense(Separate(Element(0.0))), [1, 2, 3]))
3-Tensor
└─ Dense [1:3]
   ├─ [1]: Pointer ->
   │  └─ 1.0
   ├─ [2]: Pointer ->
   │  └─ 2.0
   └─ [3]: Pointer ->
      └─ 3.0
Parallel Loops
A loop runs in parallel when it iterates over a parallel dimension. Wrap the loop's dimension in the parallel() modifier to indicate that its iterations should be distributed across threads.
Finch.parallel — Function
parallel(ext, device=CPU(nthreads()), schedule=static_schedule())
A dimension ext that is parallelized over device using the schedule. The ext field is usually _, or dimensionless, but can be any standard dimension argument.
Finch.CPU — Type
CPU(n)
A device that represents a CPU with n threads.
Finch.Serial — Type
Serial()
A device that represents a serial CPU execution.
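As an illustration of the parallel() modifier, the following is a minimal sketch of a parallel transposed matrix-vector product. The tensors A, x, and y and their values are made up for this example. The sketch relies on the default device and schedule from the signature above, and assumes that each iteration of the parallel loop over j writes only to its own y[j], so an ordinary Element level should suffice for the output.

```julia
using Finch

# Illustrative inputs (not from the Finch documentation): a column-major
# sparse matrix and a dense vector.
A = Tensor(Dense(SparseList(Element(0.0))), [0.0 2.0 0.0; 1.0 0.0 3.0; 0.0 4.0 0.0])
x = Tensor(Dense(Element(0.0)), [1.0, 2.0, 3.0])
y = Tensor(Dense(Element(0.0)), zeros(3))

@finch begin
    y .= 0
    # The outer loop over columns j is marked parallel. Each iteration
    # updates a distinct y[j], so iterations do not write to the same entry.
    for j = parallel(_)
        for i = _
            y[j] += A[i, j] * x[i]
        end
    end
end
```

Julia must be started with more than one thread (for example with the --threads flag) for the loop to use multiple threads. If the loops were arranged so that different parallel iterations updated the same output entry, the output would likely need one of the concurrent formats described above, such as AtomicElement or Mutex.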