jjjkkkjjj / matft
Numpy-like library in swift. (Multi-dimensional Array, ndarray, matrix and vector library)
License: BSD 3-Clause "New" or "Revised" License
Thank you for all your hard work on this framework! Will there be support for dot in the future?
https://numpy.org/doc/stable/reference/generated/numpy.dot.html?highlight=dot#numpy-dot
Numpy version
a = np.zeros([88, 2, 2])
b = np.zeros([32, 88, 2])
print(a[np.arange(88), b[0, :, 0].astype(int), 0].shape) # Outputs to (88,)
Matft version
let a = Matft.nums(0, shape: [88, 2, 2])
let b = Matft.nums(0, shape: [32, 88, 2]).astype(.Int)
print(a[Matft.arange(start: 0, to: 88, by: 1), b[0, Matft.all, 0], 0].shape) // Outputs to [88, 88]
According to Numpy, the shape of the MfArray should be [88], not [88, 88].
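For reference, NumPy pairs integer index arrays element by element instead of forming their outer product, which is why the expected shape is (88,). Below is a minimal pure-Swift sketch of that pairing rule on nested arrays. `pairedIndex` is a hypothetical helper written for illustration and is not part of Matft.

```swift
// Hypothetical helper (not a Matft API): emulates NumPy's a[rows, cols, k]
// when rows and cols are integer index arrays of equal length.
func pairedIndex(_ a: [[[Double]]], rows: [Int], cols: [Int], k: Int) -> [Double] {
    precondition(rows.count == cols.count, "index arrays must have the same length")
    // Element n of the result is a[rows[n]][cols[n]][k]; no outer product is formed.
    return zip(rows, cols).map { pair in a[pair.0][pair.1][k] }
}

// Small stand-in for the [88, 2, 2] array: shape [4, 2, 2] with distinct values.
let small: [[[Double]]] = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
    [[8, 9], [10, 11]],
    [[12, 13], [14, 15]]
]
let picked = pairedIndex(small, rows: [0, 1, 2, 3], cols: [0, 1, 0, 1], k: 0)
print(picked.count) // one element per index pair, so the result is 1-D
```

The result has one entry per (row, col) pair, so four index pairs give four values, matching the (88,) shape NumPy reports for 88 pairs.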
Reason
DSPComplex is an ordinary struct. Therefore, when I want to advance the pointer (ptr: UnsafePointer<DSPComplex>), I must not use the + operator.
let newptr = ptr + 2 // invalid!!
Hi, I'm trying to get the values inside the MfArray by indexing. The usage is similar to the item method of numpy. I'm wondering what the best way to do it is.
var a = MfArray([[1,2,3], [4, 5, 6]], mftype: .Int32)
a[0, 0] as! Int //works
a = MfArray([1,2,3], mftype: .Int32)
a[0] as! Int //doesn't work. I can do a[0][0][0][0][0][0][0] for infinite number of times
For now I'm converting the MfArray to an Array and taking the first element as a workaround.
Thanks in advance!
Hi jjjkkkjjj,
Great work on bringing Numpy to Swift!!!
I am learning how to do inference within CoreML using Swift.
So far I have gotten the UIImage from an image picker and I need to do preprocessing
e.g. resize, transpose, normalize(mean=(0,0,0), std=(1,1,1))
And after hours and hours of searching, Swift has proved that it is not a friendly language for image processing.
Then I found your repo, which has all the amazing features I need.
So I think it would be very helpful if you could add a demo for this.
Cheers
Thanks for the great library!
Is there any way to get atan2 from combining 2 MfArrays? Like this:
let R = MfArray([a, b ,c])
let x = atan2(R[2, 1], R[2, 2])
The issue for us is that we're porting code from numpy logic, and functionality like that is supported there. E.g. we can retrieve a single Double from an expression like R[2, 1], so we're struggling now with how to get similar behaviour.
Thanks in advance!
Hello,
Firstly, thank you for working on this! This is really helpful.
Matft.math.pow does not work like np.power().
np.power(a,3) raises each element of a to the third power. That is,
a = np.eye(3) * 3
np.power(a,3)
array([[27., 0., 0.],
[ 0., 27., 0.],
[ 0., 0., 27.]])
The above is returned as a result.
However, Matft.math.pow takes in a float for the first argument, and a matrix for the second, and behaves as attached following:
Hi @jjjkkkjjj. After I did some math transformations on an MfArray, I want to convert it back to a Swift Array. What's the most efficient way to do it using your library?
For example, I want to convert the MfArray back to [Int]. Right now I'm doing it this way:
let swiftArrayAny = Array(someMfArray.data)
guard let swiftArrayInt32 = swiftArrayAny as? [Int32] else {
fatalError()
}
let swiftArrayInt = swiftArrayInt32.map { Int($0) }
So far I have implemented only the natural cubic spline. Other boundary conditions (clamped, not-a-knot, periodic) are not supported.
Ref: https://github.com/scipy/scipy/blob/v1.5.4/scipy/interpolate/_cubic.py#L464-L847
Hey, I'm just curious if there are any plans to implement "fancy indexing", where you can pass a list of indices to an MfArray and get back the items at those indices, like in numpy. Thanks, it's a great library so far.
Thank you for providing such a nice library. I've found some cases which might potentially be buggy. Could you help take a look?
Here is the test case
import XCTest
import Matft
final class MatftTest: XCTestCase {
func testMatftCase1() throws {
let a = MfArray([], mftype: .Float)
}
func testMatftCase2() throws {
let b = MfArray([], mftype: .Float, shape: [0])
}
func testMatftCase3() throws {
let a = MfArray([
[1,2,3],
[4,5,6]
], mftype: .Float)
XCTAssertEqual(a[1], MfArray([4,5,6], mftype: .Float))
XCTAssertEqual(a[1].toArray() as! [Float], [4.0,5,6]) //XCTAssertEqual failed: ("[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]") is not equal to ("[4.0, 5.0, 6.0]")
}
}
Thanks!
I didn’t understand a pointer… lol
I must refactor this!!!
Use vDSP_vrampmul and vDSP_vgen instead of Array and Strides?
A fatal error occurred...
I think this was caused by extracting the view's base directly.
let a = Matft.mfarray.arange(start: 0, to: 27*2, by: 2, shape: [3,3,3], mftype: .Double, mforder: .Column)
XCTAssertEqual(a[~-1], MfArray([[[ 0, 18, 36],
[ 6, 24, 42],
[12, 30, 48]],
[[ 2, 20, 38],
[ 8, 26, 44],
[14, 32, 50]]], mftype: .Double))
let b = a[~-1]
XCTAssertEqual(b[~1, ~2], MfArray([[[18, 19, 20],
[21, 22, 23]]], mftype: .Double)) >>>>>>>>Not equal!!
XCTAssertEqual(b[0], MfArray([[18, 19, 20],
[21, 22, 23],
[24, 25, 26]], mftype: .Double)) >>>>>>>>Not equal!!
Hi @jjjkkkjjj
I'm trying to do the following broadcasting, but Matft treats it as an error:
let a = Matft.arange(start: 1, to: 7, by: 1, shape: [3, 2])
let b = Matft.arange(start: 1, to: 5, by: 1, shape: [2, 1, 2])
print(a - b)
Error message:
Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290
2020-07-21 19:05:02.261713+0800 testnpy[21523:5245927] Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290
In Python this is legal. For example,
x = np.random.randint(10, size=(3, 2))
y = np.random.randint(10, size=(2, 1, 2))
# Returns a shape (2, 3, 2) nd-array
print(x - y)
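The NumPy rule that makes this legal can be sketched as follows: shapes are right-aligned, the shorter one is padded with 1s on the left, and each axis pair must be equal or contain a 1. This is a minimal pure-Swift sketch, and `broadcastShape` is a hypothetical helper, not a Matft function.

```swift
// Hypothetical helper (not a Matft API): computes the NumPy broadcast shape
// of two shapes, or nil when they are incompatible.
func broadcastShape(_ lhs: [Int], _ rhs: [Int]) -> [Int]? {
    let ndim = max(lhs.count, rhs.count)
    // Left-pad the shorter shape with 1s so both have ndim axes.
    let l = Array(repeating: 1, count: ndim - lhs.count) + lhs
    let r = Array(repeating: 1, count: ndim - rhs.count) + rhs
    var result = [Int]()
    for (x, y) in zip(l, r) {
        if x == y || y == 1 {
            result.append(x)
        } else if x == 1 {
            result.append(y)
        } else {
            return nil // neither equal nor 1: cannot broadcast
        }
    }
    return result
}

print(broadcastShape([3, 2], [2, 1, 2]) ?? []) // [2, 3, 2]
```

Here [3, 2] is padded to [1, 3, 2] and combined with [2, 1, 2] axis by axis, so both operands broadcast to the common shape rather than one being forced into the other's shape.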
Could you kindly inform me if there is a way to convert MfArray to MLMultiArray? I have found a method to convert the latter to the former, but I am uncertain if there is a similar method for the reverse operation. If such a method doesn't exist, could you please recommend the most effective/performant approach for accomplishing this?
My current approach is
extension MfArray {
func toMLMultiArray() throws -> MLMultiArray {
guard let array = self.astype(.Float).flatten().toArray() as? [Float] else {
//throw some errors here
}
let arrShape = self.shape
let mlShapedArray: MLShapedArray<Float> = MLShapedArray(scalars: array, shape: arrShape)
return MLMultiArray(mlShapedArray)
}
}
Thanks in advance!
The current implementation copies the MfArray when reshaping.
But copying may not be needed.
Create OpenCV Mat by https://stackoverflow.com/questions/39579398/opencv-how-to-create-mat-from-uint8-t-pointer
and pass it by “with” statements.
Simple image processing functions are in the vImage module.
FFFF means float types (8888 means UInt8).
https://developer.apple.com/documentation/accelerate/1515929-vimageconvolve_argbffff
Adding @inline may be efficient and improve performance.
Hello,
I think there is a bug in ufuncReduce. This code
Matft.ufuncReduce(mfarray: MfArray([1,2,3,4,5,6,7,8,9,10] as [Double]), ufunc: Matft.add)
returns
MfArray([55, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
I would expect that I would get scalar or array of shape [1]. What am I doing wrong please?
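For comparison, the expected behaviour for 1-D input is a plain left fold that yields a single value, which is what NumPy's `np.add.reduce` returns here. A minimal pure-Swift sketch follows. `reduce1D` is a hypothetical helper and does not reflect how Matft implements `ufuncReduce`.

```swift
// Hypothetical helper (not a Matft API): folds a 1-D array with a binary
// operation and returns one scalar.
func reduce1D(_ values: [Double], _ op: (Double, Double) -> Double) -> Double? {
    // This sketch returns nil for empty input instead of an identity value.
    guard let first = values.first else { return nil }
    return values.dropFirst().reduce(first, op)
}

let total = reduce1D([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], +)
print(total ?? 0) // 55.0
```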
If I don’t use generics in MfArray, subscript’s getter and setter must be handled as Any.
However, using Any type causes unexpected error or performance loss.
MfArray<MfType: MfTypable>{
Hoge
}
//Initialization
let a = MfArray<Int>([1,2,3])
//Getter and setter
//Note that only scalars will be handled
subscript(indices: Int...) -> MfType{
hoge
}
//Note that only MfArray will be handled
subscript(mfslices: MfSlice...) -> MfArray{
fuga
}
Hi there
I've got a few minor implementations of some numpy / scipy functions on vectors in vDSP, along with associated XCTests, that I think would make good contributions to Matft.
I'm curious how to best / properly implement these into Matft, as it seems like having a single home for them makes sense, and your code seems very well organized.
I'm not entirely sure of the code structure / best place to implement the logic in a generic way that leverages Matft's existing code base.
Do you have any suggestions?
numpy.allclose() as an extension to Array (without NaN equality)
func allCloseTo(array: [Float], rtol: Float = 1e-5, atol: Float = 1e-8) -> Bool
{
precondition(self.count == array.count, "Arrays must have same size")
let absDiff = vDSP.absolute( vDSP.subtract(self, array) )
let maxAbsDiff = vDSP.maximum(absDiff)
let scaledTol = Swift.max(atol, rtol * vDSP.maximum( vDSP.absolute(self) + vDSP.absolute(array) ) )
return maxAbsDiff <= scaledTol
}
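Note that numpy.allclose is documented as an elementwise test, |a - b| <= atol + rtol * |b|, whereas the version above compares the largest difference against one global tolerance. A minimal pure-Swift sketch of the elementwise criterion is shown below for comparison. `allClose` is a hypothetical free function, it skips NaN handling, and it does not use vDSP.

```swift
// Hypothetical helper: elementwise |a - b| <= atol + rtol * |b|,
// the criterion documented for numpy.allclose (NaN handling omitted).
func allClose(_ a: [Float], _ b: [Float], rtol: Float = 1e-5, atol: Float = 1e-8) -> Bool {
    precondition(a.count == b.count, "Arrays must have same size")
    return zip(a, b).allSatisfy { pair in
        abs(pair.0 - pair.1) <= atol + rtol * abs(pair.1)
    }
}

print(allClose([1.0, 2.0], [1.0, 2.00001])) // true
print(allClose([1.0], [1.1]))               // false
```

Because the tolerance scales with each element of the second array, the test is not symmetric in its arguments, which matches NumPy's documented behaviour.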
scipy.spatial.distance.cosine
func CosineDistance(_ v1: [Float], _ v2: [Float]) -> Float
{
precondition(v1.count == v2.count, "Arrays must have same size")
var dotProduct: Float = 0.0
var v1Norm: Float = 0.0
var v2Norm: Float = 0.0
let n = vDSP_Length(v1.count)
// Calculate dot product of v1 and v2
vDSP_dotpr(v1, 1, v2, 1, &dotProduct, n)
// Calculate the Euclidean norm of v1
vDSP_svesq(v1, 1, &v1Norm, n)
v1Norm = sqrt(v1Norm)
// Calculate the Euclidean norm of v2
vDSP_svesq(v2, 1, &v2Norm, n)
v2Norm = sqrt(v2Norm)
// Calculate cosine distance
let distance = 1.0 - (dotProduct / (v1Norm * v2Norm))
return distance
}
and scipy.ndimage.gaussian_filter1d as array extensions, allowing one to cache the computed Gaussian kernel.
Note that I only really implement the default padding mode, reflect, so far.
static func generateGaussianKernel(sigma:Float, truncate:Float = 4.0) -> [Float]
{
let radius:Int = Int( ceil(truncate * sigma) )
let sigma2 = sigma * sigma
let x:[Float] = Array<Int>( ( -radius ... radius ) ).map { Float( $0 ) }
let x2 = vForce.pow(bases: x, exponents: [Float](repeating: 2.0, count: x.count) )
let y = vDSP.multiply(-0.5 / sigma2, x2)
let phi_x = vForce.exp(y)
return vDSP.divide(phi_x, vDSP.sum(phi_x))
}
enum PaddingMode {
case reflect
case edge
}
private func padInputArray(_ input: [Float], sigma: Float, truncate: Float, paddingMode: PaddingMode) -> [Float] {
var paddedInput = [Float]()
let windowSize = Int(2.0 * sigma * truncate + 1.0)
let padSize = Swift.max(windowSize - input.count, 0)
if padSize > 0
{
switch (paddingMode)
{
case .reflect:
var paddingStart:[Float]
var paddingEnd:[Float]
// If we pad less than our input array's count, we select what we need from the input array
// This won't be a 'full' pad, as we won't have all items in the array
if padSize <= input.count
{
paddingStart = Array<Float>( input[ 0 ..< Int(padSize)].reversed() )
paddingEnd = Array<Float>( input[ input.count - Int(padSize) ..< input.count].reversed() )
}
// Otherwise, we repeat reflection until we accrue pad size
else
{
paddingStart = input.reversed()
paddingEnd = paddingStart
while paddingStart.count <= padSize
{
paddingStart.insert(contentsOf: paddingStart.reversed(), at: 0)
paddingEnd.append(contentsOf: paddingEnd.reversed())
paddingStart = paddingStart.reversed()
paddingEnd = paddingEnd.reversed()
}
paddingStart = Array<Float>( paddingStart.suffix( Int(sigma * truncate) ) )
paddingEnd = Array<Float>( paddingEnd.prefix( Int(sigma * truncate) ) )
}
paddedInput.append(contentsOf: paddingStart)
paddedInput.append(contentsOf: input)
paddedInput.append(contentsOf: paddingEnd)
break
case .edge:
let edge = input.first ?? 0.0
paddedInput = Array(repeating: edge, count: padSize) + input + Array(repeating: edge, count: padSize)
}
return paddedInput
}
return input
}
// Make sure your Sigma and Truncate values match above:
func gaussianFilter1D(kernel:[Float], sigma:Float, truncate:Float = 4.0, paddingMode:PaddingMode = .reflect) -> [Float]
{
let paddedInput = self.padInputArray(self, sigma:sigma, truncate:truncate, paddingMode:paddingMode)
var output = [Float](repeating: 0.0, count: self.count)
vDSP.convolve(paddedInput, withKernel: kernel, result: &output)
// Technically is this needed, our sum is always 1 ?
// vDSP.divide(output, sigma, result: &output)
// let sum = vDSP.sum(kernel)
// vDSP.multiply(sum, output, result: &output)
return output
}
Hello, I need to port a lot of Python (numpy) code to Swift, and that is how I found your library. It would be a great help. Is it possible to create MfArrays that have different lengths along axis=1? Here is an example of what I mean:
[
[1, 2, 3, 4],
[1, 2, 3],
[1, 2, 3, 4, 5]
]
If so, how would I need to instantiate the array?
Hi @jjjkkkjjj, whenever the MfArray is too large, the print description will be incorrect. For example,
let a = Matft.arange(start: 1, to: 40001, by: 1, shape: [40000])
print(a)
/*
mfarray =
[ 1, 2, 3, ..., 39997, 39998, 39999], type=Int, shape=[40000]
*/
print(a[-1])
/*
40000
*/
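The expected summary keeps the first and last few elements, so the tail should end with 40000. A minimal pure-Swift sketch of that truncation is given below. `summarize` is a hypothetical helper and is not how Matft builds its description.

```swift
// Hypothetical helper: NumPy-style summary that keeps `edge` items at each end.
func summarize(_ values: [Int], edge: Int = 3) -> String {
    guard values.count > 2 * edge else {
        return "[" + values.map { String($0) }.joined(separator: ", ") + "]"
    }
    let head = values.prefix(edge).map { String($0) }
    // suffix(edge) takes the LAST `edge` items, including the final element.
    let tail = values.suffix(edge).map { String($0) }
    return "[" + (head + ["..."] + tail).joined(separator: ", ") + "]"
}

print(summarize(Array(1...40000))) // [1, 2, 3, ..., 39998, 39999, 40000]
```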
There's a difference in how vDSP and cblas handle negative strides.
In vDSP,
var a = [1,2,3,4.0]
var b = [5,6,7,2.0]
var c = [0,0,0,0.0]
vDSP_vaddD(&a, vDSP_Stride(1), &b, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))
//c -> [6.0, 2.0, 3.0, 4.0]
//cannot add properly!!!
correct one is
vDSP_vaddD(&a, vDSP_Stride(1), &b + 3, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))
On the other hand, in cblas
cblas_dcopy(Int32(4), &b, Int32(-1), &c, Int32(1))
//c -> [2.0, 7.0, 6.0, 5.0]
//can copy properly!!!
This line (vDSP) must be
let bptr = bptr.baseAddress! + vDSPPrams.b_offset
let sptr = sptr.baseAddress! + vDSPPrams.s_offset
dstptrT = dstptrT + vDSPPrams.b_offset
instead of
let bptr = vDSPPrams.b_offset >= 0 ? bptr.baseAddress! + vDSPPrams.b_offset : bptr.baseAddress! + bigger_mfarray.offsetIndex + vDSPPrams.b_offset
let sptr = vDSPPrams.s_offset >= 0 ? sptr.baseAddress! + vDSPPrams.s_offset : sptr.baseAddress! + smaller_mfarray.offsetIndex + vDSPPrams.s_offset
dstptrT = vDSPPrams.b_offset >= 0 ? dstptrT + vDSPPrams.b_offset : dstptrT + bigger_mfarray.offsetIndex + vDSPPrams.b_offset
cblas must be
let srcptr = cblasPrams.s_stride >= 0 ? srcptr.baseAddress! + cblasPrams.s_offset : srcptr.baseAddress! - mfarray.offsetIndex + cblasPrams.s_offset
let dstptr = cblasPrams.b_stride >= 0 ? dstptr.baseAddress! + cblasPrams.b_offset : dstptr.baseAddress! - dsttmpMfarray.offsetIndex + cblasPrams.b_offset
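The two conventions described above can be sketched without Accelerate. With vDSP, the pointer passed in is the first element read, so for a negative stride the caller must point at the last element. With BLAS, a negative increment makes the routine itself start from the far end, so the caller passes the base address. The function names below are hypothetical and only emulate the indexing.

```swift
// Reads `count` elements beginning at index `start` and stepping by `stride`.
// This mirrors the vDSP convention: the given start is the first element read,
// even when the stride is negative.
func readFromStart(_ x: [Double], start: Int, stride: Int, count: Int) -> [Double] {
    (0..<count).map { x[start + $0 * stride] }
}

// Mirrors the BLAS convention: with a negative increment the routine starts
// at element (count - 1) * |stride| on its own, so no caller offset is needed.
func readBLASStyle(_ x: [Double], stride: Int, count: Int) -> [Double] {
    let start = stride >= 0 ? 0 : (count - 1) * -stride
    return readFromStart(x, start: start, stride: stride, count: count)
}

let b: [Double] = [5, 6, 7, 2]
let viaVDSPConvention = readFromStart(b, start: 3, stride: -1, count: 4)
let viaBLASConvention = readBLASStyle(b, stride: -1, count: 4)
print(viaVDSPConvention, viaBLASConvention) // both [2.0, 7.0, 6.0, 5.0]
```

The `start: 3` argument plays the role of `&b + 3` in the vDSP call, while the BLAS-style reader produces the same order from the unshifted base.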
Hi there
Firstly, this library looks amazing. You've done a ton of work and it looks really promising. Thank you!
I was curious if there were any plans to implement Numpy's FFTs in vDSP? If not, do you do work for hire?
I did some work looking into this myself for work on porting OpenAI's Whisper to CoreML / Accelerate https://github.com/vade/OpenAI-Whisper-CoreML
And I documented some of my findings in this issue here: vade/OpenAI-Whisper-CoreML#1
It seems like Numpy, PyTorch, Rosa / RosaKit all use PocketFFT to do non power of 2 DFTs, which is why the output matches more or less exactly numerically.
PocketFFT doesn't use vDSP, but rather scalar - no simd acceleration.
Given the rest of Matft's current implementation, doing the STFT and Log Mel work in Matft would just work. The only missing piece is a numerically equivalent implementation of the Numpy / Torch 'real to complex' (rfft) logic.
Thank you again for all the work on Matft!
Use vDSP_veqvi(_:_:_:_:_:_:_:) for UnsafeRawPointer.
XNOR=
a|b|ret
0|0|1
0|1|0
1|0|0
1|1|1
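The truth table above is logical equivalence, which can be written as the complement of XOR. A minimal pure-Swift sketch follows. The function names are hypothetical, and the Int32 variant is only an illustration of the bitwise form.

```swift
// XNOR on booleans: true exactly when both inputs agree.
func xnor(_ a: Bool, _ b: Bool) -> Bool { a == b }

// Bitwise form on 32-bit integers: complement of XOR, bit by bit.
func xnorBits(_ a: Int32, _ b: Int32) -> Int32 { ~(a ^ b) }

print(xnor(false, false), xnor(false, true), xnor(true, false), xnor(true, true))
// true false false true
```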
I think implementing copy-on-write (COW) in Matft is easier than I expected.
Because MfArray has a data class, MfData, there are only two things to do. First, add the "mutating" keyword to the conversion methods and the subscript function. Second, check the _isView property in those "mutating" functions and replace the referenced MfData with a new one if _isView is true.
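As a generic illustration of the idea, here is a minimal copy-on-write sketch in pure Swift. It uses the standard `isKnownUniquelyReferenced` check instead of the `_isView` flag described above, and `Storage` and `CowArray` are hypothetical types, not Matft's MfData or MfArray.

```swift
// Reference-type backing store, standing in for a data class.
final class Storage {
    var values: [Double]
    init(_ values: [Double]) { self.values = values }
}

// Value-type wrapper that copies its storage lazily on the first write.
struct CowArray {
    private var storage: Storage
    init(_ values: [Double]) { storage = Storage(values) }

    subscript(index: Int) -> Double {
        get { storage.values[index] }
        set {
            // Copy the storage only when another value still references it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(storage.values)
            }
            storage.values[index] = newValue
        }
    }
}

let first = CowArray([1, 2, 3])
var second = first   // shares storage, no copy yet
second[0] = 99       // the write triggers the copy
print(first[0], second[0]) // 1.0 99.0
```

The design choice is the same as in the proposal: reads stay cheap and shared, and only a mutating access pays for the copy.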
Regarding #17
Official boolean indexing code is
https://github.com/numpy/numpy/blob/cf1306a842d7b1064270bd06951a485121e60816/numpy/core/src/multiarray/mapping.c#L1010
SIMD function is
https://github.com/numpy/numpy/blob/45bc13e6d922690eea43b9d807d476e0f243f836/numpy/core/src/umath/loops_comparison.dispatch.c.src#L36
let a = MfArray([[1, 3, 5],
[2, -4, -1]], mforder: .Column)
print(a.reshape([3, 1, 2]))
/*
mfarray =
[[[ 1, 2]],
[[ 3, -4]],
[[ 5, -1]]], type=Int, shape=[3, 1, 2]
*/
but must be
array([[[ 1, 3]],
[[ 5, 2]],
[[-4, -1]]])
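The expected result follows from reading the elements in logical (row-major) order, regardless of how the buffer is laid out in memory. A minimal pure-Swift sketch of that read order for a column-major 2 x 3 buffer is shown below. `logicalOrder` is a hypothetical helper, not a Matft function.

```swift
// Hypothetical helper: returns the elements of a rows x cols matrix in logical
// (row-major) order when the backing buffer is stored column-major.
func logicalOrder(columnMajor buffer: [Int], rows: Int, cols: Int) -> [Int] {
    var out = [Int]()
    out.reserveCapacity(buffer.count)
    for i in 0..<rows {
        for j in 0..<cols {
            out.append(buffer[i + j * rows]) // column-major offset of element (i, j)
        }
    }
    return out
}

// [[1, 3, 5], [2, -4, -1]] stored column by column.
let columnMajorBuffer = [1, 2, 3, -4, 5, -1]
let flat = logicalOrder(columnMajor: columnMajorBuffer, rows: 2, cols: 3)
print(flat) // [1, 3, 5, 2, -4, -1]
```

Splitting `flat` into consecutive pairs gives [1, 3], [5, 2], [-4, -1], the [3, 1, 2] result NumPy returns, whereas pairing the raw buffer gives the incorrect output printed above.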
let a = try! Matft.mfarray.broadcast_to(MfArray([[2, 5, -1],
[3, 1, 0]]), shape: [2,2,2,3])
let b = a[0~, ~1, ~~2]
b[0, ~1] = MfArray([222]) >>>>>>> Precondition failed: -2 is out of bounds for axis 1 with 1: file
The subscript arguments [0, ~1] were passed as the Int array [0, -2]...
It's strange.
To use vDSP, DSPSplitComplex seems to be needed according to the documentation.
To use the BLAS package, DSPComplex will be needed according to this discussion.
So, to support complex types, is using DSPComplex ideal?
First I must check the difference between DSPComplex and DSPSplitComplex.
DSPComplex stores consecutive float values. From the documentation:
"Complex data are stored as ordered pairs of floating-point numbers. Because they are stored as ordered pairs, complex vectors require address strides that are multiples of two."
On the other hand, DSPSplitComplex stores the real and imaginary parts in separate memory. From the documentation:
"A structure that represents a single-precision complex vector with the real and imaginary parts stored in separate arrays."
So I need to implement a function that bridges this difference in memory layout.
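The layout difference can be sketched in pure Swift on plain Float arrays. The helper names below are hypothetical. Accelerate also provides vDSP_ctoz and vDSP_ztoc for converting between the two layouts, which this sketch does not use.

```swift
// Interleaved layout: [re0, im0, re1, im1, ...] (the DSPComplex-style layout).
// Split layout: separate real and imaginary arrays (the DSPSplitComplex-style layout).
func toSplit(_ interleaved: [Float]) -> (real: [Float], imag: [Float]) {
    precondition(interleaved.count % 2 == 0, "expected re/im pairs")
    var real = [Float]()
    var imag = [Float]()
    for n in stride(from: 0, to: interleaved.count, by: 2) {
        real.append(interleaved[n])     // even positions hold real parts
        imag.append(interleaved[n + 1]) // odd positions hold imaginary parts
    }
    return (real, imag)
}

func toInterleaved(real: [Float], imag: [Float]) -> [Float] {
    precondition(real.count == imag.count, "real and imaginary parts must match")
    return zip(real, imag).flatMap { [$0.0, $0.1] }
}

let split = toSplit([1, 2, 3, 4]) // represents 1+2i and 3+4i
print(split.real, split.imag)     // [1.0, 3.0] [2.0, 4.0]
```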
Hi! Thanks for open-sourcing your code.
Would you mind suggesting the best way to do boolean indexing like numpy? For example, I can do this in numpy easily:
import numpy as np
img = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
], dtype=np.uint8)
img[img > 3] = 10
print(img)
# [[ 1 2 3]
# [10 10 10]
# [10 10 10]]
I tried to do it element-wise but I found the performance is significantly slower than using Swift Array
var testing = Array(repeating: 1, count: 160000)
var bar = MfArray(testing, shape: [400,400])
var start = Date()
for i in 0..<400 {
for j in 0..<400 {
bar[i,j] = bar[i,j] as! Int + 1
}
}
print("\(start.timeIntervalSinceNow * -1) seconds elapsed")
//3.706043004989624 seconds elapsed
start = Date()
for i in 0..<400 {
for j in 0..<400 {
let index1D = i*400+j
testing[index1D] = testing[index1D] + 1
}
}
print("\(start.timeIntervalSinceNow * -1) seconds elapsed")
//0.05165994167327881 seconds elapsed
Thanks!
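The semantics of `img[img > 3] = 10` can be sketched on a flat Swift buffer: build a boolean mask in one pass, then assign wherever the mask is true. This is a pure-Swift illustration of the technique only. It does not use Matft and says nothing about its performance.

```swift
// Flat row-major storage of the 3 x 3 image from the NumPy example.
var img: [UInt8] = [1, 2, 3, 4, 5, 6, 7, 8, 9]

// Step 1: the boolean mask, one flag per element (img > 3).
let mask = img.map { $0 > 3 }

// Step 2: masked assignment, writing 10 wherever the mask is true.
for (i, selected) in mask.enumerated() where selected {
    img[i] = 10
}
print(img) // [1, 2, 3, 10, 10, 10, 10, 10, 10]
```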
let img = MfArray([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], mftype: .UInt8)
img[img == 3]
this throws
Referencing operator function '==' on 'BinaryInteger' requires that 'MfArray' conform to 'BinaryInteger'
Thank you for the wonderful Swift library. I have a relatively minor issue where something like MfArray([]) throws a divide-by-zero error. I fixed it in a fork by changing one line in the shape2strides func in mfstructure.swift:
ret[index] = prevAxisNum / max(shape[index],1)
(I simply made the divisor a minimum of 1)
Let me know if I should submit a pull request or if you would like to push the change. Thank you!
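To show the effect of that guard in isolation, here is a minimal pure-Swift sketch of row-major stride computation with the divisor clamped to at least 1. `rowMajorStrides` is a hypothetical re-implementation written for illustration and is not Matft's actual shape2strides code.

```swift
// Hypothetical re-implementation (not Matft's code): row-major strides, in
// elements, for a given shape. Zero-length axes are treated as length 1 in
// the divisor so that a shape such as [0] does not divide by zero.
func rowMajorStrides(_ shape: [Int]) -> [Int] {
    var strides = [Int](repeating: 0, count: shape.count)
    var remaining = shape.reduce(1) { $0 * max($1, 1) }
    for index in shape.indices {
        remaining /= max(shape[index], 1) // clamped divisor, never zero
        strides[index] = remaining
    }
    return strides
}

print(rowMajorStrides([2, 3, 4])) // [12, 4, 1]
print(rowMajorStrides([0]))       // [1]
```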