

CoreMLHelpers

This is a collection of types and functions that make it a little easier to work with Core ML in Swift.

Some of the things CoreMLHelpers has to offer:

Experimental features:

Let me know if there's anything else you'd like to see added to this library!

💡 Tip: Get the Core ML Survival Guide

If Core ML is giving you trouble --- or if you want to learn more about using the Core ML and Vision APIs --- then check out my book Core ML Survival Guide. It has 400+ pages of Core ML tips and tricks.

I wrote the Core ML Survival Guide because the same questions kept coming up on Stack Overflow, on the Apple Developer Forums, and on this GitHub repo. Core ML may appear easy-to-use at first --- but if you want to go beyond the basics, the learning curve suddenly becomes very steep. My goal with this book is to make the advanced features of Core ML accessible to everyone too.

The Core ML Survival Guide currently has over 80 chapters and includes pretty much everything I know about Core ML. As I learn new things I'll keep updating the book, so you'll always have access to the most up-to-date knowledge about Core ML. Cheers!

How to use CoreMLHelpers

Copy the source files from the CoreMLHelpers folder into your project. You probably don't need all of them, so just pick the files you require and ignore the rest.

Note: A lot of the code in CoreMLHelpers is only intended as a demonstration of how to approach a certain problem. There's often more than one way to do it. It's quite likely you will need to customize the code for your particular situation, so consider these routines a starting point.

Why can't I use this with Swift Package Manager / Carthage / CocoaPods?

I believe a proper framework should have a well-thought-out API but CoreMLHelpers is a hodgepodge of helper functions that isn't particularly well-organized. Putting this into a package makes things more complicated than necessary. Just copy the one or two source files you need into your project, and adapt them to your needs.

Read more about Core ML

TODO

  • proper unit tests
  • add more numpy-like functionality to MultiArray (and fix the bugs!)

License

CoreMLHelpers is copyright 2017-2021 Matthijs Hollemans and is licensed under the terms of the MIT license.

Contributors

axmav, hollance, howardjin, ozgurshn, shu223, valeriyvan


Issues

Crash, but not sure if it's an issue

When I receive the camera callback "func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)" from AVCaptureVideoDataOutputSampleBufferDelegate, I get the pixel buffer from the CMSampleBuffer, rotate it with "rotate90PixelBuffer", and then convert that pixel buffer to an MTLTexture. At that point it crashes.
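For reference, here is a minimal sketch (not from the original issue) of the CVPixelBuffer-to-MTLTexture step being described, assuming a BGRA pixel buffer and a CVMetalTextureCache created once with CVMetalTextureCacheCreate. Note that this call generally needs a Metal-compatible, IOSurface-backed pixel buffer; a buffer allocated without the right attributes can make this step fail or crash.

import CoreVideo
import Metal

// Minimal sketch: wrap a BGRA CVPixelBuffer in an MTLTexture via a texture cache.
func makeTexture(from pixelBuffer: CVPixelBuffer,
                 textureCache: CVMetalTextureCache) -> MTLTexture? {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    var cvTexture: CVMetalTexture?
    let status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                           textureCache,
                                                           pixelBuffer,
                                                           nil,
                                                           .bgra8Unorm,
                                                           width,
                                                           height,
                                                           0,
                                                           &cvTexture)
    guard status == kCVReturnSuccess, let cvTexture = cvTexture else { return nil }
    return CVMetalTextureGetTexture(cvTexture)
}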

Prediction discrepancies between Python's coremltools and Swift's CoreML API

Hi @hollance, is it possible to get the exact numbers from Python's coremltools results in Swift?

That is, when I'm using coremltools (Python), I have the following code:

spec = coremltools.utils.load_spec('./alwc_model.mlmodel')
mlmodel = coremltools.models.MLModel(spec)
input_img = Image.open("input_image.jpg")  # width 512, height 512, mode RGB
prediction = mlmodel.predict({"input_image": input_img}, usesCPUOnly=True)
output = prediction["output"]

In the Xcode project, I have the following code:

let inputImage = #imageLiteral(resourceName: "input_image.jpg")

guard let imageBuffer = inputImage.pixelBuffer(width: 512, height: 512) else {
    fatalError("Could not create the pixel buffer")
}

let options = MLPredictionOptions()
options.usesCPUOnly = true
let model = alwc_model()
let input = alwc_modelInput(input_image: imageBuffer)
let prediction = try! model.prediction(input: input, options: options)
let output = prediction.output

Note that I'm using https://github.com/hollance/CoreMLHelpers/blob/0bd7f983c3e73abcffb692b4428e889ed4cf01b6/CoreMLHelpers/UIImage%2BCVPixelBuffer.swift to convert the UIImage into CVPixelBuffer and I'm running on a simulator.

My concern is that while the output is fairly close (off by ~0.02), I'm surprised the results are not exactly the same, since I believe both of them should be using macOS's Core ML framework, right?

I suspect it could be because of the image conversion step. For example, I found that in coremltools (ref: https://github.com/apple/coremltools/blob/master/coremlpython/CoreMLPythonUtils.mm#L204) its pixelFormatType is kCVPixelFormatType_32BGRA while yours is kCVPixelFormatType_32ARGB. I tried to use kCVPixelFormatType_32BGRA for the pixelBuffer method, but the output in the Xcode project is still not the same as the one from Python.

Do you have experience trying to match the results between Python's coremltools and Swift's CoreML API? Thanks!

Nonsensical conversion error?

This popped up today: Cannot convert value of type 'MultiArray<Double>' to specified type 'MultiArray<Double>'
Unfathomably bizarre. I keep looking at it, expecting to see a difference between those two, but it never changes! Ha! Any thoughts?

MLMultiArray+Image.swift file gives an error

Hi Matthijs, I copied this file into my Xcode project, but line 220 gives an error. Do you have any idea where it comes from?

Xcode error: Referencing operator function '-' on 'FloatingPoint' requires that 'T' conform to 'FloatingPoint'

Line 220:
let scaled = (value - min) * T(255) / (max - min)
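For what it's worth, this error usually means the surrounding generic context does not constrain T to FloatingPoint. A minimal sketch of the kind of constraint the compiler is asking for (a hypothetical helper, not the library's actual code):

// With T constrained to FloatingPoint, the subtraction, multiplication,
// division, and the T(255) conversion on line 220 all compile.
func scaledPixelValue<T: FloatingPoint>(_ value: T, min: T, max: T) -> T {
    return (value - min) * T(255) / (max - min)
}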

Need help to adjust bbox for cropped images

Hello,

I have a feature wherein I allow the users to crop the presets (top, bottom, left, and right). This works fine for getting the desired detections. The only issue I encounter is adjusting the bbox coordinates returned by Vision.

I am not able to figure out the right approach to achieve that. It is more inaccurate when either of the combinations (top+bottom) or (left+right) is applied.

Below is the code snippet I am currently using

//imageAreaRect : Rendering frame on device
//topCropRatio : range from 0.0 to 0.30
//bottomCropRatio : range from 0.0 to 0.30
let width = imageAreaRect.width
let height = imageAreaRect.height - ((imageAreaRect.height * CGFloat(topCropRatio)) + (imageAreaRect.height * CGFloat(bottomCropRatio)))

let objectBounds = VNImageRectForNormalizedRect(objectBoundingBox.boundingBox, Int(width), Int(height))

//BBOX adjustment
if topCropRatio > 0.0 && bottomCropRatio > 0.0 {
    adjustY = -(height + offsetY) - (height * CGFloat(topCropRatio))
    adjustY += (height * CGFloat(bottomCropRatio))
    adjustY = adjustY - (objectBounds.size.height * 4)
} else {
    if topCropRatio > 0.0 {
        adjustY = -(height + offsetY) - (height * CGFloat(topCropRatio))
        adjustY = adjustY + objectBounds.size.height + 10 // Extra padding
    }

    if info!.pixelBufferConfig.bottomCropRatio > 0.0 {
        adjustY = -(height + offsetY) + (height * CGFloat(bottomCropRatio))
        adjustY = adjustY - objectBounds.size.height
        if UIDevice.current.orientation.isPortrait {
            adjustY = adjustY - objectBounds.size.height
        } else {
            adjustY = adjustY + objectBounds.size.height
        }
    }
}

I had a look at this thread as well apple/turicreate#1016

I am currently using the technique below:

if orientation == .left || orientation == .right || orientation == .rightMirrored || orientation == .leftMirrored {
    req.imageCropAndScaleOption = .scaleFit
} else {
    req.imageCropAndScaleOption = .scaleFill
}

Any input will be appreciated. Thanks in advance
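As a general note (a sketch under assumptions, not a fix for the exact code above): if Vision was run on a cropped region of the frame, one way to reason about the adjustment is to convert the normalized box into the crop's pixel space first and then offset it by the crop's origin to get full-frame coordinates. Keep in mind that Vision's normalized coordinates have their origin in the lower-left corner, so a vertical flip may still be needed depending on how the box is drawn.

import Vision
import CoreGraphics

// Hypothetical helper: map a normalized Vision bounding box back into full-frame
// pixel coordinates, where cropRect is the region (in full-frame pixels) that was
// actually fed to Vision.
func fullFrameRect(for normalizedBox: CGRect, cropRect: CGRect) -> CGRect {
    let rectInCrop = VNImageRectForNormalizedRect(normalizedBox,
                                                  Int(cropRect.width),
                                                  Int(cropRect.height))
    return rectInCrop.offsetBy(dx: cropRect.minX, dy: cropRect.minY)
}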

createPixelBuffer -- RGB or BGR?

In the function createPixelBuffer of CVPixelBuffer+Helpers, the description says the following:

Creates a RGB pixel buffer of the specified width and height.

but in the function there is the following:
let status = CVPixelBufferCreate(nil, width, height, kCVPixelFormatType_32BGRA, nil, &pixelBuffer)

Shouldn't that be kCVPixelFormatType_32RGBA or am I missing something?

How to encrypt coreml models

Sorry, this question is not related to your CoreMLHelpers repo. I have a question about how to encrypt Core ML models inside an iOS app: it's really easy to pick the mlmodelc files out of the ipa file, so others can use the valuable model, which can harm the business.
Thanks.

CoreGraphics resizing function from iOS (UIImage) to MacOS (NSImage)

Hi Matthijs,

Were you able to run your helper functions on macOS? I replaced iOS (UIImage) with macOS (NSImage) and got almost everything to work... only for the cgImage.bitsPerPixel part I couldn't find a macOS equivalent.

Have you tried running it on macOS?

Cheers from Sweden!
Alex

https://stackoverflow.com/questions/67202976/coregraphics-resizing-function-from-ios-uiimage-to-macos-nsimage

What I did:
Code below

Problem:
Getting error on this line of code: let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent

Error:
Value of type '(UnsafeMutablePointer?, NSGraphicsContext?, [NSImageRep.HintKey : Any]?) -> CGImage?' (aka '(Optional, Optional, Optional<Dictionary<NSImageRep.HintKey, Any>>) -> Optional') has no member 'bitsPerPixel'

iOS: UIImage

extension UIImage {
    
    // Resizing using Core Graphics
    func resize(to size:CGSize) -> UIImage? {
        
        let cgImage = self.cgImage!

        let destWidth = Int(size.width)
        let destHeight = Int(size.height)
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        let destBytesPerRow = destWidth * bytesPerPixel
        
        let context = CGContext(data: nil,
                                width: destWidth,
                                height: destHeight,
                                bitsPerComponent: bitsPerComponent,
                                bytesPerRow: destBytesPerRow,
                                space: cgImage.colorSpace!,
                                bitmapInfo: cgImage.bitmapInfo.rawValue)!
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: CGPoint.zero, size: size))
        return context.makeImage().flatMap { UIImage(cgImage: $0) }
    }
}

MacOS: NSImage

extension NSImage {
    
    // Resizing using Core Graphics
    func resize(to size:CGSize) -> NSImage? {
        
        let cgImage = self.cgImage

        let destWidth = Int(size.width)
        let destHeight = Int(size.height)
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        let destBytesPerRow = destWidth * bytesPerPixel
        
        let context = CGContext(data: nil,
                                width: destWidth,
                                height: destHeight,
                                bitsPerComponent: bitsPerComponent,
                                bytesPerRow: destBytesPerRow,
                                space: cgImage.colorSpace!,
                                bitmapInfo: cgImage.bitmapInfo.rawValue)!
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: CGPoint.zero, size: size))
        return context.makeImage().flatMap { NSImage(cgImage: $0) }
    }
}
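For reference, a minimal sketch of the macOS-specific pieces (an illustration under assumptions, not a drop-in fix): NSImage has no cgImage property, so the error above comes from referring to the method cgImage(forProposedRect:context:hints:) without calling it, and NSImage's initializer from a CGImage also needs an explicit size.

import AppKit

extension NSImage {
    // Hypothetical macOS variant of the resize helper.
    func resized(to size: CGSize) -> NSImage? {
        // Ask NSImage for a CGImage explicitly; cgImage is a method, not a property.
        guard let cgImage = self.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
            return nil
        }
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        guard let context = CGContext(data: nil,
                                      width: Int(size.width),
                                      height: Int(size.height),
                                      bitsPerComponent: bitsPerComponent,
                                      bytesPerRow: Int(size.width) * bytesPerPixel,
                                      space: cgImage.colorSpace ?? CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: cgImage.bitmapInfo.rawValue) else {
            return nil
        }
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: .zero, size: size))
        // Unlike UIImage(cgImage:), NSImage needs a size here.
        return context.makeImage().flatMap { NSImage(cgImage: $0, size: size) }
    }
}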

pixelBuffer normalization function?

Matt,

Thanks much for the ML library. You always write such elegant code.

Let's assume I have the following, though (as the comment shows) I want to normalize all samples of the pixelBuffer. What is the best way to accomplish this pixel normalization?

let image = UIImage(...)

// Convert the image
if let pixelBuffer = image.pixelBuffer(width: 224, height: 224) {

  // normalize all samples in image.pixelBuffer

  // Make the prediction with Core ML
  if let prediction = try? model.prediction(input: pixelBuffer) {
    print(prediction)
  }
}
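One option (a sketch assuming a 32BGRA pixel buffer, not part of CoreMLHelpers) is to read the pixel bytes out of the buffer and scale them to 0...1 floats yourself. In many cases, though, it is simpler to bake the scaling into the model itself via the image preprocessing parameters (scale and bias) when converting it, so the pixel buffer can be passed in unchanged.

import CoreVideo

// Minimal sketch: read a 32BGRA pixel buffer and produce RGB floats normalized
// to 0...1. Adapt the byte offsets for other pixel formats.
func normalizedPixels(from pixelBuffer: CVPixelBuffer) -> [Float]? {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let bytes = base.assumingMemoryBound(to: UInt8.self)

    var result = [Float]()
    result.reserveCapacity(width * height * 3)
    for y in 0..<height {
        let row = bytes + y * bytesPerRow
        for x in 0..<width {
            let pixel = row + x * 4               // B, G, R, A
            result.append(Float(pixel[2]) / 255)  // R
            result.append(Float(pixel[1]) / 255)  // G
            result.append(Float(pixel[0]) / 255)  // B
        }
    }
    return result
}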

Create 'createCGImage' from Double

I have a CoreML model I've created which does image segmentation of a single class. It outputs a 2-channel image of size 1024x1024. The first channel is the foreground, the second is background, all of type Double.

I've been trying to use the createCGImage(fromFloatArray: ...) function to convert the model's output to a UIImage since it's so much faster than the previous way I've been trying, but I haven't been able to find a way for it to properly handle the Double data I'm giving it.

If I change the max value to anything less than 255 it just makes a quilt/checkerboard black/white image that doesn't react to any camera input. When the max is 255 it outputs all black.

Any ideas?

MLMultiArray to Swift Array

Hi @hollance, is there a way to convert/extract an MLMultiArray to a Swift Array?

This will enable the use of operations such as sort and argsort on the data inside the MLMultiArray. I could not find anything for sorting an MLMultiArray though.

Best
Aman
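A minimal sketch (assuming the MLMultiArray holds contiguous Double values; check dataType and the strides before relying on this):

import CoreML

// Copy an MLMultiArray's contents into a plain Swift array.
// For Float32 data, bind the pointer to Float instead.
func doubleArray(from multiArray: MLMultiArray) -> [Double] {
    let pointer = multiArray.dataPointer.bindMemory(to: Double.self, capacity: multiArray.count)
    return Array(UnsafeBufferPointer(start: pointer, count: multiArray.count))
}

// Once the values are in a Swift array, sort and argsort are one-liners:
// let values = doubleArray(from: output)
// let sorted = values.sorted(by: >)
// let argsorted = values.enumerated().sorted { $0.element > $1.element }.map { $0.offset }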

Converting multiarray to grayscale image

I have a coreml model which returns a multi array containing grayscale data.

I printed the coreml output and it is in this format:

(Double 1 x 1 x 1 x 240 x 320 array)

I tried the following code to convert it to an image, but I always end up with a black/white output:

output.reshaped(to: [1, 240, 320]).image()
output.reshaped(to: [1, 240, 320]).image(min: 0, max: 255, channel: 0, axes: (0,1,2))
output.image(min: 0, max: 255, axes: (0, 1, 2))

This one crashes with index out of range at let width = self.shape[widthAxis].intValue

output.reshaped(to: [240, 320]).image()

Can anyone tell me how to fix this?

slicing MultiArray and apply sigmoid OR softmax ?

I have a large MLMultiArray as an output of the coreml model of dimension (1,40,30,90). I convert it to a MultiArray and want to proceed further.

I want to slice it into three subarrays of dimensions (1,40,30,45), (1,40,30,9), and (1,40,30,36).

Then I would need to apply a softmax / sigmoid function to the values in these three arrays.

Is there a direct way (e.g. MultiArray.reshape) to do the slicing and then apply the nonlinearities?

Your help is much appreciated.

Thanks @hollance .
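As far as I know there is no single built-in slice-then-activate call, but once each slice has been copied out into a plain Swift array, the nonlinearities are short functions. A minimal sketch (assuming the values are already in [Float] arrays):

import Foundation

// Element-wise sigmoid over a plain array.
func sigmoid(_ x: [Float]) -> [Float] {
    return x.map { 1 / (1 + exp(-$0)) }
}

// Numerically stable softmax over a plain array.
func softmax(_ x: [Float]) -> [Float] {
    let maxValue = x.max() ?? 0            // subtract the max to avoid overflow in exp
    let exps = x.map { exp($0 - maxValue) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}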

resizePixelBuffer does not keep aspect ratio

Hi,
I'm trying to resize a 1280x720 image's pixel buffer to 300x300, but the resulting pixel buffer comes out squashed.
What I want is an image with a height of 300, a proportionally scaled width, and black borders on the sides.

How can I achieve this?
Thanks
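resizePixelBuffer scales to exactly the width and height you pass in, so letterboxing has to be handled separately: scale the image to fit, then draw it centered on a black canvas of the target size. A minimal sketch (an illustration, not part of the library) for computing the aspect-fit rectangle to draw into:

import CoreGraphics

// Compute an aspect-fit rectangle for drawing a source image into a destination
// size, leaving letterbox borders; the actual drawing into a context or pixel
// buffer is up to the caller.
func aspectFitRect(for sourceSize: CGSize, in destinationSize: CGSize) -> CGRect {
    let scale = min(destinationSize.width / sourceSize.width,
                    destinationSize.height / sourceSize.height)
    let fittedSize = CGSize(width: sourceSize.width * scale,
                            height: sourceSize.height * scale)
    return CGRect(x: (destinationSize.width - fittedSize.width) / 2,
                  y: (destinationSize.height - fittedSize.height) / 2,
                  width: fittedSize.width,
                  height: fittedSize.height)
}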

Resize a CVPixelBuffer from an ARFrame (ARKit frame)

ARKit produces frames (CVPixelBuffer) of size 1280x720. I need to resize the frame to size 640x480.
I try to use the following function:

/**
 Resizes a CVPixelBuffer to a new width and height.
*/
public func resizePixelBuffer(_ pixelBuffer: CVPixelBuffer,
                              width: Int, height: Int) -> CVPixelBuffer?

The result image comes out looking wrong.

And this is how I use the function:

let newPixelBuffer = resizePixelBuffer(pixelBuffer, width: 640, height: 480)

Am I doing something wrong?
I printed the runtime object of the original CVPixelBuffer:

<CVPixelBuffer 0x1c012a280 width=1280 height=720 pixelFormat=420f iosurface=0x1c4013600 planes=2>
<Plane 0 width=1280 height=720 bytesPerRow=1280>
<Plane 1 width=640 height=360 bytesPerRow=1280>
<attributes=<CFBasicHash 0x1c4271c00 [0x1b522b310]>{type = immutable dict, count = 1,
entries =>
	0 : <CFString 0x1ae055708 [0x1b522b310]>{contents = "PixelFormatDescription"} = <CFBasicHash 0x1c426d700 [0x1b522b310]>{type = immutable dict, count = 12,
entries =>
	1 : <CFString 0x1ae055d08 [0x1b522b310]>{contents = "FillExtendedPixelsCallback"} = <CFData 0x1c42a40e0 [0x1b522b310]>{length = 24, capacity = 24, bytes = 0x000000000000000048c45687010000000000000000000000}
	5 : <CFString 0x1ae0559c8 [0x1b522b310]>{contents = "ContainsAlpha"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	6 : <CFString 0x1ae055ba8 [0x1b522b310]>{contents = "ContainsYCbCr"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	7 : <CFString 0x1ae055688 [0x1b522b310]>{contents = "OpenGLESCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	9 : <CFString 0x1ae055648 [0x1b522b310]>{contents = "IOSurfaceOpenGLESTextureCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	10 : <CFString 0x1ae055c08 [0x1b522b310]>{contents = "ComponentRange"} = <CFString 0x1ae055c48 [0x1b522b310]>{contents = "FullRange"}
	12 : <CFString 0x1ae055628 [0x1b522b310]>{contents = "IOSurfaceCoreAnimationCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	13 : <CFString 0x1ae055668 [0x1b522b310]>{contents = "IOSurfaceOpenGLESFBOCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	15 : <CFString 0x1ae055968 [0x1b522b310]>{contents = "PixelFormat"} = <CFNumber 0xb000000343230662 [0x1b522b310]>{value = +875704422, type = kCFNumberSInt32Type}
	18 : <CFString 0x1ae055b88 [0x1b522b310]>{contents = "BitsPerComponent"} = <CFNumber 0xb000000000000082 [0x1b522b310]>{value = +8, type = kCFNumberSInt32Type}
	21 : <CFString 0x1ae055bc8 [0x1b522b310]>{contents = "ContainsRGB"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	22 : <CFString 0x1ae0559e8 [0x1b522b310]>{contents = "Planes"} = (
        {
        BitsPerBlock = 8;
        BlackBlock = <00>;
    },
        {
        BitsPerBlock = 16;
        BlackBlock = <8080>;
        HorizontalSubsampling = 2;
        VerticalSubsampling = 2;
    }
)
}

}
 propagatedAttachments=<CFBasicHash 0x1c0667300 [0x1b522b310]>{type = mutable dict, count = 5,
entries =>
	0 : <CFString 0x1ae054fa8 [0x1b522b310]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x1ae054fe8 [0x1b522b310]>{contents = "ITU_R_601_4"}
	1 : <CFString 0x1ae055108 [0x1b522b310]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x1ae054fc8 [0x1b522b310]>{contents = "ITU_R_709_2"}
	2 : <CFString 0x1ae081360 [0x1b522b310]>{contents = "MetadataDictionary"} = <CFBasicHash 0x1c426ba80 [0x1b522b310]>{type = mutable dict, count = 4,
entries =>
	0 : <CFString 0x1ae0893a0 [0x1b522b310]>{contents = "ExposureTime"} = <CFNumber 0x1c402db60 [0x1b522b310]>{value = +0.01000000000000000021, type = kCFNumberFloat64Type}
	1 : <CFString 0x1ae08b340 [0x1b522b310]>{contents = "SNR"} = <CFNumber 0x1c402d9e0 [0x1b522b310]>{value = +35.27033362520548820385, type = kCFNumberFloat64Type}
	5 : <CFString 0x1ae08b360 [0x1b522b310]>{contents = "NormalizedSNR"} = <CFNumber 0x1c402d9c0 [0x1b522b310]>{value = +33.12120558044274787335, type = kCFNumberFloat64Type}
	6 : <CFString 0x1ae08b320 [0x1b522b310]>{contents = "SensorID"} = <CFNumber 0xb000000000002292 [0x1b522b310]>{value = +553, type = kCFNumberSInt32Type}
}

	5 : <CFString 0x1ae055088 [0x1b522b310]>{contents = "CVImageBufferColorPrimaries"} = <CFString 0x1ae054fc8 [0x1b522b310]>{contents = "ITU_R_709_2"}
	6 : <CFString 0x1ae055268 [0x1b522b310]>{contents = "CVImageBufferChromaLocationTopField"} = <CFString 0x1ae0552c8 [0x1b522b310]>{contents = "Center"}
}
 nonPropagatedAttachments=<CFBasicHash 0x1c0667240 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
>

And the resizedPixelBuffer:

<CVPixelBuffer 0x1c012a6e0 width=640 height=480 bytesPerRow=2560 pixelFormat=420f iosurface=0x0 attributes=<CFBasicHash 0x1c066f180 [0x1b522b310]>{type = immutable dict, count = 1,
entries =>
	0 : <CFString 0x1ae055708 [0x1b522b310]>{contents = "PixelFormatDescription"} = <CFBasicHash 0x1c426d700 [0x1b522b310]>{type = immutable dict, count = 12,
entries =>
	1 : <CFString 0x1ae055d08 [0x1b522b310]>{contents = "FillExtendedPixelsCallback"} = <CFData 0x1c42a40e0 [0x1b522b310]>{length = 24, capacity = 24, bytes = 0x000000000000000048c45687010000000000000000000000}
	5 : <CFString 0x1ae0559c8 [0x1b522b310]>{contents = "ContainsAlpha"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	6 : <CFString 0x1ae055ba8 [0x1b522b310]>{contents = "ContainsYCbCr"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	7 : <CFString 0x1ae055688 [0x1b522b310]>{contents = "OpenGLESCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	9 : <CFString 0x1ae055648 [0x1b522b310]>{contents = "IOSurfaceOpenGLESTextureCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	10 : <CFString 0x1ae055c08 [0x1b522b310]>{contents = "ComponentRange"} = <CFString 0x1ae055c48 [0x1b522b310]>{contents = "FullRange"}
	12 : <CFString 0x1ae055628 [0x1b522b310]>{contents = "IOSurfaceCoreAnimationCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	13 : <CFString 0x1ae055668 [0x1b522b310]>{contents = "IOSurfaceOpenGLESFBOCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	15 : <CFString 0x1ae055968 [0x1b522b310]>{contents = "PixelFormat"} = <CFNumber 0xb000000343230662 [0x1b522b310]>{value = +875704422, type = kCFNumberSInt32Type}
	18 : <CFString 0x1ae055b88 [0x1b522b310]>{contents = "BitsPerComponent"} = <CFNumber 0xb000000000000082 [0x1b522b310]>{value = +8, type = kCFNumberSInt32Type}
	21 : <CFString 0x1ae055bc8 [0x1b522b310]>{contents = "ContainsRGB"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	22 : <CFString 0x1ae0559e8 [0x1b522b310]>{contents = "Planes"} = (
        {
        BitsPerBlock = 8;
        BlackBlock = <00>;
    },
        {
        BitsPerBlock = 16;
        BlackBlock = <8080>;
        HorizontalSubsampling = 2;
        VerticalSubsampling = 2;
    }
)
}

}
 propagatedAttachments=<CFBasicHash 0x1c066f200 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
 nonPropagatedAttachments=<CFBasicHash 0x1c066f1c0 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
>

Thanks
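Note that the printed buffer has pixelFormat=420f, i.e. a biplanar YCbCr buffer rather than a single-plane BGRA one, so a resize routine that assumes a single BGRA plane will produce distorted output for it. One workaround (a sketch, not the library's approach) is to let Core Image do the format conversion and the scaling in one render into a BGRA destination buffer:

import CoreImage
import CoreVideo

// Minimal sketch: convert and resize a YCbCr ('420f') pixel buffer into a new
// 32BGRA buffer using Core Image. Reuse the CIContext across frames.
func resizeYCbCrPixelBuffer(_ pixelBuffer: CVPixelBuffer,
                            width: Int, height: Int,
                            context: CIContext) -> CVPixelBuffer? {
    var output: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_32BGRA, nil, &output) == kCVReturnSuccess,
          let outputBuffer = output else { return nil }

    let image = CIImage(cvPixelBuffer: pixelBuffer)
    let scaleX = CGFloat(width) / image.extent.width
    let scaleY = CGFloat(height) / image.extent.height
    let scaled = image.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))
    context.render(scaled, to: outputBuffer)
    return outputBuffer
}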

Change image to multiArray in case of converted mlmodel from TFlite

Hello Matthijs,

Your article has helped me a lot, thanks! But now I'm stuck and need some clues.

I am trying to use an '.mlmodel' converted from a Google MediaPipe '.tflite'.

I'm stuck on changing the input type: I was able to change the type, but when I load the model into my Xcode project, I get a Core ML compile error.

# convert_inputType.py: convert the multiArray input to image type
import coremltools as ct
from coremltools.proto import FeatureTypes_pb2 as ft

spec = ct.utils.load_spec('model_coreml_float32.mlmodel') # multiArray type
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)

# check input/output features
print('--- Before change:')
builder.inspect_input_features()
builder.inspect_output_features()

# change the input so the model can accept 256x256 RGB images
input = spec.description.input[0]
input.type.imageType.colorSpace = ft.ImageFeatureType.RGB
input.type.imageType.width = 256
input.type.imageType.height = 256

# converted input/output features
print('--- After change:')
builder.inspect_input_features()
builder.inspect_output_features()

# save inputType-converted model
ct.utils.save_spec(spec, 'selfie_segmentation.mlmodel') # changed type
--- Before change:

[Id: 0] Name: input_1
          Type: multiArrayType {
  shape: 1
  shape: 256
  shape: 256
  shape: 3
  dataType: FLOAT32
}

[Id: 0] Name: activation_10
          Type: multiArrayType {
  dataType: FLOAT32
}

--- After change:

[Id: 0] Name: input_1
          Type: imageType {
  width: 256
  height: 256
  colorSpace: RGB
}

[Id: 0] Name: activation_10
          Type: multiArrayType {
  dataType: FLOAT32
}
  • 'model_coreml_float32.mlmodel': converted from the MediaPipe TFLite model at PINTO_model_zoo
  • 'selfie_segmentation.mlmodel': the changed-type mlmodel

The following error occurs when loading the changed-type mlmodel into an Xcode project.

Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:
----------------------------------------
SchemeBuildError: Failed to build the scheme "testSelfieSegmentation"

compiler error:  Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:

Compile CoreML model selfie_segmentation.mlmodel:
coremlc: error: compiler error:  Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:
 (1, 16, 8, 128)
 (1, 16, 2, 128)

I have checked the model configuration with Netron, and cannot find any nodes with shapes like (1,16,8,128) or (1,16,2,128).

Is there something wrong with the conversion code? Or does it have something to do with the fact that the original is a TFLite model?

Any hints on how to solve this problem?

Carthage - no shared framework schemes

Carthage update fails with this message:

*** Skipped building CoreMLHelpers due to the error:
Dependency "CoreMLHelpers" has no shared framework schemes

My Cartfile:

github "hollance/CoreMLHelpers"

I could not find any .xcworkspace file in the repo; maybe that's the problem. Anyway, I will try copying the files directly into my project!

use with macOS/NSImage

Could you provide a version or suggest edits for use with macOS/NSImage? Would this be a huge project or could this be done by slight modifications?

Accessing elements from Dim 5 multi array not working

I have a Double 1 x 1 x 80 x 1 x 75 array.

When I try to access multiarray[0], the expected output would be an array of shape 1 x 80 x 1 x 75, but I'm getting a single element, which appears to be the value of multiarray[0, 0, 0, 0, 0].

I used this in another project with a 3-dimensional array and it worked perfectly!

Thanks for creating this wonderful helpers and I really enjoyed & learned a lot by reading your book.

Not exactly an issue, but curious about concatenating MultiArrays?

I've noticed that trying to use a structure like [MultiArray<Double>] can be pretty dangerous, leading to EXC_BAD_ACCESS crashes. So, instead of doing this (which isn't ideal anyway), I'd like to concatenate MultiArrays into a larger n+1 dimensional MultiArray. That is, given a series of 4 x [3, 64, 64] MultiArrays, I'd like to create a [4, 3, 64, 64] array, but I'm not sure how to do this.

Currently, one of my [3, 64, 64] MultiArrays is created from an MLMultiArray (prediction) using multiArray = MultiArray<Double>(mlMultiArray).reshaped([3, 64, 64]), and 4 of these (predictions) are added to a [MultiArray<Double>] (but as I said, this seems pretty fragile—and as I recall, has been fragile for me in the past). I tried using something like concatenatedMultiArray[n] = multiArray, where n is an index from 0 to 3, and concatenatedMultiArray is [4, 3, 64, 64], but clearly I'm not able to index MultiArray this way. Any tips?
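One approach (a sketch under assumptions: all inputs are contiguous Double MLMultiArrays with identical shapes) is to allocate one big MLMultiArray up front and copy each prediction's raw values into its slot, instead of keeping a Swift array of MultiArray wrappers:

import CoreML

// Hypothetical helper: stack N equally-shaped MLMultiArrays into one array with an
// extra leading dimension (e.g. four [3, 64, 64] arrays into one [4, 3, 64, 64]).
func stacked(_ arrays: [MLMultiArray]) throws -> MLMultiArray {
    guard let first = arrays.first else {
        throw NSError(domain: "stacked", code: 1, userInfo: nil)
    }
    let innerShape = first.shape.map { $0.intValue }
    let innerCount = innerShape.reduce(1, *)
    let outputShape = ([arrays.count] + innerShape).map { NSNumber(value: $0) }
    let output = try MLMultiArray(shape: outputShape, dataType: .double)

    let dst = output.dataPointer.bindMemory(to: Double.self, capacity: arrays.count * innerCount)
    for (i, array) in arrays.enumerated() {
        let src = array.dataPointer.bindMemory(to: Double.self, capacity: innerCount)
        for j in 0..<innerCount {
            dst[i * innerCount + j] = src[j]
        }
    }
    return output
}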

UIImage to MLMultiArray ?

Dear Matthijs,
Can you provide a function which can convert a UIImage to an MLMultiArray of shape [W, H, C] or [C, W, H], where W = width, H = height, and C = channels?
It will be a big help.

Regards,
Pankaj
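A minimal sketch of one possible approach (a hypothetical helper, not part of CoreMLHelpers): draw the UIImage into an RGBA byte buffer and copy the R, G, B values into a Float32 MLMultiArray in channel-first [C, H, W] order, scaled to 0...1. Adjust the scaling and channel order to whatever the model expects.

import UIKit
import CoreML

extension UIImage {
    // Convert a UIImage into an MLMultiArray of shape [3, height, width].
    func multiArray(width: Int, height: Int) -> MLMultiArray? {
        guard let cgImage = self.cgImage,
              let array = try? MLMultiArray(shape: [3, height, width].map { NSNumber(value: $0) },
                                            dataType: .float32) else { return nil }

        // Draw the image into an RGBA byte buffer at the requested size.
        var pixels = [UInt8](repeating: 0, count: width * height * 4)
        let drawn = pixels.withUnsafeMutableBytes { buffer -> Bool in
            guard let context = CGContext(data: buffer.baseAddress,
                                          width: width, height: height,
                                          bitsPerComponent: 8, bytesPerRow: width * 4,
                                          space: CGColorSpaceCreateDeviceRGB(),
                                          bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
            else { return false }
            context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
            return true
        }
        guard drawn else { return nil }

        // Copy R, G, B into the multi-array in channel-first (C, H, W) order.
        let channelStride = width * height
        let dst = array.dataPointer.bindMemory(to: Float.self, capacity: 3 * channelStride)
        for y in 0..<height {
            for x in 0..<width {
                let offset = (y * width + x) * 4
                for c in 0..<3 {
                    dst[c * channelStride + y * width + x] = Float(pixels[offset + c]) / 255
                }
            }
        }
        return array
    }
}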

PixelBuffer function from iOS (UIImage) to MacOS (NSImage)

The saga of Core ML on macOS continues!

I translated this pixelBuffer function for macOS (NSImage), but I'm lost on what to use in place of UIGraphicsPopContext().

If anyone has any input, I'd highly appreciate it! :)

iOS/UIImage

   func pixelBuffer(width: Int, height: Int, pixelFormatType: OSType,
                     colorSpace: CGColorSpace, alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
        var maybePixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         width,
                                         height,
                                         pixelFormatType,
                                         attrs as CFDictionary,
                                         &maybePixelBuffer)
        
        guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
            return nil
        }
        
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
        
        guard let context = CGContext(data: pixelData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                      space: colorSpace,
                                      bitmapInfo: alphaInfo.rawValue)
            else {
                return nil
        }
        
        UIGraphicsPushContext(context)
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()
        
        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }

MacOS/NSImage

    func pixelBuffer(width: Int, height: Int, pixelFormatType: OSType,
                     colorSpace: CGColorSpace, alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
        var maybePixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         width,
                                         height,
                                         pixelFormatType,
                                         attrs as CFDictionary,
                                         &maybePixelBuffer)
        
        guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
            return nil
        }
        
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
        
        guard let context = CGContext(data: pixelData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                      space: colorSpace,
                                      bitmapInfo: alphaInfo.rawValue)
            else {
                return nil
        }
        
        context.saveGState()
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
         NSGraphicsContext()

      // to do: find an equivalent for NSImage
      //  UIGraphicsPopContext()
      

        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }
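For what it's worth, AppKit's counterpart to UIGraphicsPushContext / UIGraphicsPopContext is to save the graphics state, install an NSGraphicsContext that wraps the CGContext, draw, and then restore. A minimal sketch of just the drawing section inside the NSImage method (assuming context is the CGContext created above):

// Replace the drawing section of the NSImage version with something like this:
NSGraphicsContext.saveGraphicsState()
NSGraphicsContext.current = NSGraphicsContext(cgContext: context, flipped: false)
context.translateBy(x: 0, y: CGFloat(height))
context.scaleBy(x: 1, y: -1)
self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
NSGraphicsContext.restoreGraphicsState()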

MultiArray transposed + reshaped

Hello!
a: MultiArray
Shape: 1 x 1 x 48 x 17 x 27, Strides: 22032 x 22032 x 459 x 27 x 1 - INPUT OK
b = a.transposed([0, 1, 3, 4, 2])
Shape: 1 x 1 x 17 x 27 x 48, Strides: 22032 x 22032 x 27 x 1 x 459 - OK
c = b.reshaped([5508, 4])
Shape: 5508 x 4, Strides: 4 x 1 - NOT OK

Question about backprop

Hi Matt,

Is it possible to get the gradient from backpropagation using Core ML or Metal? I am trying to implement an adversarial attack on a Core ML ResNet-50 and don't know how to go about it.

By the way, the tutorials you post are excellent.

Q about output Multiarray shape

I ran a GAN as a Core ML model.

My model's output shape is 1 x 1 x 3 x 256 x 256.

So I tried to reshape it to 3 x 256 x 256:

       let reshapedOutputArray = MultiArray<Double>(output).reshaped([3, 256, 256])

but print(reshapedOutputArray) still shows a Float32 1 x 1 x 3 x 256 x 256 array.

So it couldn't be converted into a UIImage.

Can I get some help with this?

[Feature] MLMultiArray -> UIImage without premultiplied alpha

Hi Matthijs:
Thanks for the great tool, it's really helpful!
I'm working on an image-to-image model, and the alpha channel is part of the output.
The model outputs a [4, height, width] MLMultiArray; the 4 channels are RGBA, and the RGBA values are not premultiplied.
I used the tool to convert the output to a UIImage and it works, but because the fromByteArrayRGBA function uses premultipliedLast, the RGB values in the UIImage are already multiplied with the alpha channel.
I tried to add:

@nonobjc public class func fromByteArrayRGBA2(_ bytes: [UInt8],
                                              width: Int,
                                              height: Int) -> CGImage? {
    return fromByteArray(bytes, width: width, height: height,
                         bytesPerRow: width * 4,
                         colorSpace: CGColorSpaceCreateDeviceRGB(),
                         alphaInfo: .last)
}

But it returns nil when I call the .image method.
Can you help me with this?
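One likely reason the RGBA2 variant returns nil is that CGBitmapContext only supports premultiplied alpha for RGB data, so an alphaInfo of .last is rejected if fromByteArray draws through a bitmap context. A possible workaround (a sketch, not the library's code) is to build the CGImage directly from the bytes with a CGDataProvider, which does accept CGImageAlphaInfo.last:

import CoreGraphics
import Foundation

// Minimal sketch: create a CGImage straight from non-premultiplied RGBA bytes.
func cgImageFromRGBA(_ bytes: [UInt8], width: Int, height: Int) -> CGImage? {
    guard let provider = CGDataProvider(data: Data(bytes) as CFData) else { return nil }
    return CGImage(width: width,
                   height: height,
                   bitsPerComponent: 8,
                   bitsPerPixel: 32,
                   bytesPerRow: width * 4,
                   space: CGColorSpaceCreateDeviceRGB(),
                   bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
                   provider: provider,
                   decode: nil,
                   shouldInterpolate: false,
                   intent: .defaultIntent)
}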

CVPixelBufferRef exc_bad_access

Hello. As we know, Core ML can output a color image (CVPixelBufferRef), but when the Core ML model is used in an Objective-C project, it always crashes on the output pixel buffer with "exc_bad_access". When used in a Swift project, it works well.
I set up a project at Repo; could you help me find out how to fix this problem? I have to use Objective-C for certain reasons.

How are images being resized (what interpolation methods)?

I'm going through your README and you show different options of resizing:

One uses Vision (which does the resizing).

Or use:
if let pixelBuffer = image.pixelBuffer(width: 224, height: 224)

I'm wondering how these images are being downsized, and what interpolation methods are used. I realized while playing with Core ML in Python that different interpolation methods affect the accuracy of the model.

Thank you so much for this repository.
