

CoreMLHelpers

This is a collection of types and functions that make it a little easier to work with Core ML in Swift.

Some of the things CoreMLHelpers has to offer:

Experimental features:

Let me know if there's anything else you'd like to see added to this library!

💡 Tip: Get the Core ML Survival Guide

If Core ML is giving you trouble --- or if you want to learn more about using the Core ML and Vision APIs --- then check out my book Core ML Survival Guide. It has 400+ pages of Core ML tips and tricks.

I wrote the Core ML Survival Guide because the same questions kept coming up on Stack Overflow, on the Apple Developer Forums, and on this GitHub repo. Core ML may appear easy-to-use at first --- but if you want to go beyond the basics, the learning curve suddenly becomes very steep. My goal with this book is to make the advanced features of Core ML accessible to everyone too.

The Core ML Survival Guide currently has over 80 chapters and includes pretty much everything I know about Core ML. As I learn new things I'll keep updating the book, so you'll always have access to the most up-to-date knowledge about Core ML. Cheers!

How to use CoreMLHelpers

Copy the source files from the CoreMLHelpers folder into your project. You probably don't need all of them, so just pick the files you require and ignore the rest.

Note: A lot of the code in CoreMLHelpers is only intended as a demonstration of how to approach a certain problem. There's often more than one way to do it. It's quite likely you will need to customize the code for your particular situation, so consider these routines a starting point.

Why can't I use this with Swift Package Manager / Carthage / CocoaPods?

I believe a proper framework should have a well-thought-out API but CoreMLHelpers is a hodgepodge of helper functions that isn't particularly well-organized. Putting this into a package makes things more complicated than necessary. Just copy the one or two source files you need into your project, and adapt them to your needs.

Read more about Core ML

TODO

  • proper unit tests
  • add more numpy-like functionality to MultiArray (and fix the bugs!)

License

CoreMLHelpers is copyright 2017-2021 Matthijs Hollemans and is licensed under the terms of the MIT license.

Contributors

axmav, hollance, howardjin, ozgurshn, shu223, valeriyvan


Issues

Crash, but not sure if it's an issue

When I receive the camera callback "func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)" from AVCaptureVideoDataOutputSampleBufferDelegate, I get the pixel buffer from the CMSampleBuffer, rotate it with "rotate90PixelBuffer", and then convert that pixel buffer to an MTLTexture. At that point it crashes.
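For reference, here is a minimal sketch (not from the original issue) of the CVPixelBuffer-to-MTLTexture step being described, assuming a BGRA pixel buffer and a CVMetalTextureCache created once with CVMetalTextureCacheCreate. Note that this call generally needs a Metal-compatible, IOSurface-backed pixel buffer; a buffer allocated without the right attributes can make this step fail or crash.

import CoreVideo
import Metal

// Minimal sketch: wrap a BGRA CVPixelBuffer in an MTLTexture via a texture cache.
func makeTexture(from pixelBuffer: CVPixelBuffer,
                 textureCache: CVMetalTextureCache) -> MTLTexture? {
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    var cvTexture: CVMetalTexture?
    let status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                           textureCache,
                                                           pixelBuffer,
                                                           nil,
                                                           .bgra8Unorm,
                                                           width,
                                                           height,
                                                           0,
                                                           &cvTexture)
    guard status == kCVReturnSuccess, let cvTexture = cvTexture else { return nil }
    return CVMetalTextureGetTexture(cvTexture)
}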

Prediction discrepancies between Python's coremltools and Swift's CoreML API

Hi @hollance, is it possible to get the exact numbers from Python's coremltools results in Swift?

That is, when I'm using coremltools (Python), I have the following code:

spec = coremltools.utils.load_spec('./alwc_model.mlmodel')
mlmodel = coremltools.models.MLModel(spec)
input_img = Image.open("input_image.jpg")  # width 512, height 512, mode RGB
prediction = mlmodel.predict({"input_image": input_img}, usesCPUOnly=True)
output = prediction["output"]

In the Xcode project, I have the following code:

let inputImage = #imageLiteral(resourceName: "input_image.jpg")

guard let imageBuffer = inputImage.pixelBuffer(width: 512, height: 512) else {
    fatalError("Could not create the pixel buffer")
}

let options = MLPredictionOptions()
options.usesCPUOnly = true
let model = alwc_model()
let input = alwc_modelInput(input_image: imageBuffer)
let prediction = try! model.prediction(input: input, options: options)
let output = prediction.output

Note that I'm using https://github.com/hollance/CoreMLHelpers/blob/0bd7f983c3e73abcffb692b4428e889ed4cf01b6/CoreMLHelpers/UIImage%2BCVPixelBuffer.swift to convert the UIImage into CVPixelBuffer and I'm running on a simulator.

My concern is that while the output is fairly close (off by ~0.02), I'm surprised the results are not exactly the same, since I believe both of them should be using macOS's Core ML framework, right?

I suspect it could be because of the image conversion step. For example, I found that in coremltools (ref: https://github.com/apple/coremltools/blob/master/coremlpython/CoreMLPythonUtils.mm#L204) its pixelFormatType is kCVPixelFormatType_32BGRA while yours is kCVPixelFormatType_32ARGB. I tried to use kCVPixelFormatType_32BGRA for the pixelBuffer method, but the output in the Xcode project is still not the same as the one from Python.

Do you have experience trying to match the results between Python's coremltools and Swift's CoreML API? Thanks!

Nonsensical conversion error?

This popped up today: Cannot convert value of type 'MultiArray<Double>' to specified type 'MultiArray<Double>'
Unfathomably bizarre. I keep looking at it, expecting to see a difference between those two, but it never changes! Ha! Any thoughts?

MLMultiArray+Image.swift file gives an error

Hi Matthijs, I copied this file into my Xcode project, but line 220 gives an error. Do you have any idea where it comes from?

Xcode error: Referencing operator function '-' on 'FloatingPoint' requires that 'T' conform to 'FloatingPoint'

Line 220:
let scaled = (value - min) * T(255) / (max - min)
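For what it's worth, this error usually means the surrounding generic context does not constrain T to FloatingPoint. A minimal sketch of the kind of constraint the compiler is asking for (a hypothetical helper, not the library's actual code):

// With T constrained to FloatingPoint, the subtraction, multiplication,
// division, and the T(255) conversion on line 220 all compile.
func scaledPixelValue<T: FloatingPoint>(_ value: T, min: T, max: T) -> T {
    return (value - min) * T(255) / (max - min)
}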

Need help to adjust bbox for cropped images

Hello,

I have a feature wherein I allow the users to crop the presets (top, bottom, left, and right). This works fine for getting the desired detections. The only issue I encounter is adjusting the bbox coordinates returned by Vision.

I am not able to figure out the right approach to achieve that. It is more inaccurate when either of the combinations (top+bottom) or (left+right) is applied.

Below is the code snippet I am currently using

//imageAreaRect : Rendering frame on device
//topCropRatio : range from 0.0 to 0.30
//bottomCropRatio : range from 0.0 to 0.30
let width = imageAreaRect.width
let height = imageAreaRect.height - ((imageAreaRect.height * CGFloat(topCropRatio)) + (imageAreaRect.height * CGFloat(bottomCropRatio)))

let objectBounds = VNImageRectForNormalizedRect(objectBoundingBox.boundingBox, Int(width), Int(height))

//BBOX adjustment
if topCropRatio > 0.0 && bottomCropRatio > 0.0 {
    adjustY = -(height + offsetY) - (height * CGFloat(topCropRatio))
    adjustY += (height * CGFloat(bottomCropRatio))
    adjustY = adjustY - (objectBounds.size.height * 4)
} else {
    if topCropRatio > 0.0 {
        adjustY = -(height + offsetY) - (height * CGFloat(topCropRatio))
        adjustY = adjustY + objectBounds.size.height + 10 // Extra padding
    }

    if info!.pixelBufferConfig.bottomCropRatio > 0.0 {
        adjustY = -(height + offsetY) + (height * CGFloat(bottomCropRatio))
        adjustY = adjustY - objectBounds.size.height
        if UIDevice.current.orientation.isPortrait {
            adjustY = adjustY - objectBounds.size.height
        } else {
            adjustY = adjustY + objectBounds.size.height
        }
    }
}

I had a look at this thread as well apple/turicreate#1016

I am currently using the technique below:

if orientation == .left || orientation == .right || orientation == .rightMirrored || orientation == .leftMirrored {
    req.imageCropAndScaleOption = .scaleFit
} else {
    req.imageCropAndScaleOption = .scaleFill
}

Any input will be appreciated. Thanks in advance
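As a general note (a sketch under assumptions, not a fix for the exact code above): if Vision was run on a cropped region of the frame, one way to reason about the adjustment is to convert the normalized box into the crop's pixel space first and then offset it by the crop's origin to get full-frame coordinates. Keep in mind that Vision's normalized coordinates have their origin in the lower-left corner, so a vertical flip may still be needed depending on how the box is drawn.

import Vision
import CoreGraphics

// Hypothetical helper: map a normalized Vision bounding box back into full-frame
// pixel coordinates, where cropRect is the region (in full-frame pixels) that was
// actually fed to Vision.
func fullFrameRect(for normalizedBox: CGRect, cropRect: CGRect) -> CGRect {
    let rectInCrop = VNImageRectForNormalizedRect(normalizedBox,
                                                  Int(cropRect.width),
                                                  Int(cropRect.height))
    return rectInCrop.offsetBy(dx: cropRect.minX, dy: cropRect.minY)
}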

createPixelBuffer -- RGB or BGR?

In the function createPixelBuffer of CVPixelBuffer+Helpers, the description says the following:

Creates a RGB pixel buffer of the specified width and height.

but in the function there is the following:
let status = CVPixelBufferCreate(nil, width, height, kCVPixelFormatType_32BGRA, nil, &pixelBuffer)

Shouldn't that be kCVPixelFormatType_32RGBA or am I missing something?

How to encrypt coreml models

Sorry, this question is not related to your CoreMLHelpers repo. I have a question about how to encrypt Core ML models inside an iOS app: it's really easy to pick the mlmodelc files out of the ipa file, so others can use the valuable model, which can harm the business.
Thanks.

CoreGraphics resizing function from iOS (UIImage) to MacOS (NSImage)

Hi Matthijs,

Were you able to run your helper functions on macOS? I replaced iOS (UIImage) with macOS (NSImage) and got almost everything to work... only for the cgImage.bitsPerPixel part I couldn't find a macOS equivalent.

Have you tried running it on macOS?

Cheers from Sweden!
Alex

https://stackoverflow.com/questions/67202976/coregraphics-resizing-function-from-ios-uiimage-to-macos-nsimage

What I did:
Code below

Problem:
Getting error on this line of code: let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent

Error:
Value of type '(UnsafeMutablePointer?, NSGraphicsContext?, [NSImageRep.HintKey : Any]?) -> CGImage?' (aka '(Optional, Optional, Optional<Dictionary<NSImageRep.HintKey, Any>>) -> Optional') has no member 'bitsPerPixel'

iOS: UIImage

extension UIImage {
    
    // Resizing using Core Graphics
    func resize(to size:CGSize) -> UIImage? {
        
        let cgImage = self.cgImage!

        let destWidth = Int(size.width)
        let destHeight = Int(size.height)
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        let destBytesPerRow = destWidth * bytesPerPixel
        
        let context = CGContext(data: nil,
                                width: destWidth,
                                height: destHeight,
                                bitsPerComponent: bitsPerComponent,
                                bytesPerRow: destBytesPerRow,
                                space: cgImage.colorSpace!,
                                bitmapInfo: cgImage.bitmapInfo.rawValue)!
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: CGPoint.zero, size: size))
        return context.makeImage().flatMap { UIImage(cgImage: $0) }
    }
}

MacOS: NSImage

extension NSImage {
    
    // Resizing using Core Graphics
    func resize(to size:CGSize) -> NSImage? {
        
        let cgImage = self.cgImage

        let destWidth = Int(size.width)
        let destHeight = Int(size.height)
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        let destBytesPerRow = destWidth * bytesPerPixel
        
        let context = CGContext(data: nil,
                                width: destWidth,
                                height: destHeight,
                                bitsPerComponent: bitsPerComponent,
                                bytesPerRow: destBytesPerRow,
                                space: cgImage.colorSpace!,
                                bitmapInfo: cgImage.bitmapInfo.rawValue)!
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: CGPoint.zero, size: size))
        return context.makeImage().flatMap { NSImage(cgImage: $0) }
    }
}
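For reference, a minimal sketch of the macOS-specific pieces (an illustration under assumptions, not a drop-in fix): NSImage has no cgImage property, so the error above comes from referring to the method cgImage(forProposedRect:context:hints:) without calling it, and NSImage's initializer from a CGImage also needs an explicit size.

import AppKit

extension NSImage {
    // Hypothetical macOS variant of the resize helper.
    func resized(to size: CGSize) -> NSImage? {
        // Ask NSImage for a CGImage explicitly; cgImage is a method, not a property.
        guard let cgImage = self.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
            return nil
        }
        let bitsPerComponent = 8
        let bytesPerPixel = cgImage.bitsPerPixel / bitsPerComponent
        guard let context = CGContext(data: nil,
                                      width: Int(size.width),
                                      height: Int(size.height),
                                      bitsPerComponent: bitsPerComponent,
                                      bytesPerRow: Int(size.width) * bytesPerPixel,
                                      space: cgImage.colorSpace ?? CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: cgImage.bitmapInfo.rawValue) else {
            return nil
        }
        context.interpolationQuality = .high
        context.draw(cgImage, in: CGRect(origin: .zero, size: size))
        // Unlike UIImage(cgImage:), NSImage needs a size here.
        return context.makeImage().flatMap { NSImage(cgImage: $0, size: size) }
    }
}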

pixelBuffer normalization function?

Matt,

Thanks much for the ML library. You always write such elegant code.

Let's assume I have the following, though (as the comment shows) I want to normalize all samples of the pixelBuffer. What is the best way to accomplish this pixel normalization?

let image = UIImage(...)

// Convert the image
if let pixelBuffer = image.pixelBuffer(width: 224, height: 224) {

  // normalize all samples in image.pixelBuffer

  // Make the prediction with Core ML
  if let prediction = try? model.prediction(input: pixelBuffer) {
    print(prediction)
  }
}
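One option (a sketch assuming a 32BGRA pixel buffer, not part of CoreMLHelpers) is to read the pixel bytes out of the buffer and scale them to 0...1 floats yourself. In many cases, though, it is simpler to bake the scaling into the model itself via the image preprocessing parameters (scale and bias) when converting it, so the pixel buffer can be passed in unchanged.

import CoreVideo

// Minimal sketch: read a 32BGRA pixel buffer and produce RGB floats normalized
// to 0...1. Adapt the byte offsets for other pixel formats.
func normalizedPixels(from pixelBuffer: CVPixelBuffer) -> [Float]? {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let bytes = base.assumingMemoryBound(to: UInt8.self)

    var result = [Float]()
    result.reserveCapacity(width * height * 3)
    for y in 0..<height {
        let row = bytes + y * bytesPerRow
        for x in 0..<width {
            let pixel = row + x * 4               // B, G, R, A
            result.append(Float(pixel[2]) / 255)  // R
            result.append(Float(pixel[1]) / 255)  // G
            result.append(Float(pixel[0]) / 255)  // B
        }
    }
    return result
}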

Create 'createCGImage' from Double

I have a CoreML model I've created which does image segmentation of a single class. It outputs a 2-channel image of size 1024x1024. The first channel is the foreground, the second is background, all of type Double.

I've been trying to use the createCGImage(fromFloatArray: ...) function to convert the model's output to a UIImage since it's so much faster than the previous way I've been trying, but I haven't been able to find a way for it to properly handle the Double data I'm giving it.

If I change the max value to anything less than 255 it just makes a quilt/checkerboard black/white image that doesn't react to any camera input. When the max is 255 it outputs all black.

Any ideas?

MLMultiArray to Swift Array

Hi @hollance, is there a way to convert/extract an MLMultiArray to a Swift Array?

This will enable the use of operations such as sort and argsort on the data inside the MLMultiArray. I could not find anything for sorting an MLMultiArray though.

Best
Aman
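A minimal sketch (assuming the MLMultiArray holds contiguous Double values; check dataType and the strides before relying on this):

import CoreML

// Copy an MLMultiArray's contents into a plain Swift array.
// For Float32 data, bind the pointer to Float instead.
func doubleArray(from multiArray: MLMultiArray) -> [Double] {
    let pointer = multiArray.dataPointer.bindMemory(to: Double.self, capacity: multiArray.count)
    return Array(UnsafeBufferPointer(start: pointer, count: multiArray.count))
}

// Once the values are in a Swift array, sort and argsort are one-liners:
// let values = doubleArray(from: output)
// let sorted = values.sorted(by: >)
// let argsorted = values.enumerated().sorted { $0.element > $1.element }.map { $0.offset }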

Converting multiarray to grayscale image

I have a coreml model which returns a multi array containing grayscale data.

I printed the coreml output and it is in this format:

(Double 1 x 1 x 1 x 240 x 320 array)

I tried the following code to convert it to an image, but I always end up with a black/white output:

output.reshaped(to: [1, 240, 320]).image()
output.reshaped(to: [1, 240, 320]).image(min: 0, max: 255, channel: 0, axes: (0,1,2))
output.image(min: 0, max: 255, axes: (0, 1, 2))

This one crashes with index out of range at let width = self.shape[widthAxis].intValue

output.reshaped(to: [240, 320]).image()

Can anyone tell me how to fix this?

slicing MultiArray and apply sigmoid OR softmax ?

I have a large MLMultiArray as an output of the coreml model of dimension (1,40,30,90). I convert it to a MultiArray and want to proceed further.

I want to slice it into three subarrays of dimensions (1,40,30,45), (1,40,30,9), and (1,40,30,36).

Then I would need to apply a softmax / sigmoid function to the values in these three arrays.

Is there a direct way (e.g. MultiArray.reshape) to do the slicing and then apply the nonlinearities?

Your help is much appreciated.

Thanks @hollance .
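As far as I know there is no single built-in slice-then-activate call, but once each slice has been copied out into a plain Swift array, the nonlinearities are short functions. A minimal sketch (assuming the values are already in [Float] arrays):

import Foundation

// Element-wise sigmoid over a plain array.
func sigmoid(_ x: [Float]) -> [Float] {
    return x.map { 1 / (1 + exp(-$0)) }
}

// Numerically stable softmax over a plain array.
func softmax(_ x: [Float]) -> [Float] {
    let maxValue = x.max() ?? 0            // subtract the max to avoid overflow in exp
    let exps = x.map { exp($0 - maxValue) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}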

resizePixelBuffer does not keep aspect ratio

Hi,
I'm trying to resize a 1280x720 image's pixel buffer to 300x300, but the resulting pixel buffer comes out squashed.
What I want is an image with a height of 300, a proportionally scaled width, and black borders on the sides.

How can I achieve this?
Thanks
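resizePixelBuffer scales to exactly the width and height you pass in, so letterboxing has to be handled separately: scale the image to fit, then draw it centered on a black canvas of the target size. A minimal sketch (an illustration, not part of the library) for computing the aspect-fit rectangle to draw into:

import CoreGraphics

// Compute an aspect-fit rectangle for drawing a source image into a destination
// size, leaving letterbox borders; the actual drawing into a context or pixel
// buffer is up to the caller.
func aspectFitRect(for sourceSize: CGSize, in destinationSize: CGSize) -> CGRect {
    let scale = min(destinationSize.width / sourceSize.width,
                    destinationSize.height / sourceSize.height)
    let fittedSize = CGSize(width: sourceSize.width * scale,
                            height: sourceSize.height * scale)
    return CGRect(x: (destinationSize.width - fittedSize.width) / 2,
                  y: (destinationSize.height - fittedSize.height) / 2,
                  width: fittedSize.width,
                  height: fittedSize.height)
}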

Resize a CVPixelBuffer from an ARFrame (ARKit frame)

ARKit produces frames (CVPixelBuffer) of size 1280x720. I need to resize the frame to size 640x480.
I try to use the following function:

/**
 Resizes a CVPixelBuffer to a new width and height.
*/
public func resizePixelBuffer(_ pixelBuffer: CVPixelBuffer,
                              width: Int, height: Int) -> CVPixelBuffer?

The result image comes out looking wrong.

And this is how I use the function:

let newPixelBuffer = resizePixelBuffer(pixelBuffer, width: 640, height: 480)

Am I doing something wrong?
I printed the runtime object of the original CVPixelBuffer:

<CVPixelBuffer 0x1c012a280 width=1280 height=720 pixelFormat=420f iosurface=0x1c4013600 planes=2>
<Plane 0 width=1280 height=720 bytesPerRow=1280>
<Plane 1 width=640 height=360 bytesPerRow=1280>
<attributes=<CFBasicHash 0x1c4271c00 [0x1b522b310]>{type = immutable dict, count = 1,
entries =>
	0 : <CFString 0x1ae055708 [0x1b522b310]>{contents = "PixelFormatDescription"} = <CFBasicHash 0x1c426d700 [0x1b522b310]>{type = immutable dict, count = 12,
entries =>
	1 : <CFString 0x1ae055d08 [0x1b522b310]>{contents = "FillExtendedPixelsCallback"} = <CFData 0x1c42a40e0 [0x1b522b310]>{length = 24, capacity = 24, bytes = 0x000000000000000048c45687010000000000000000000000}
	5 : <CFString 0x1ae0559c8 [0x1b522b310]>{contents = "ContainsAlpha"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	6 : <CFString 0x1ae055ba8 [0x1b522b310]>{contents = "ContainsYCbCr"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	7 : <CFString 0x1ae055688 [0x1b522b310]>{contents = "OpenGLESCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	9 : <CFString 0x1ae055648 [0x1b522b310]>{contents = "IOSurfaceOpenGLESTextureCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	10 : <CFString 0x1ae055c08 [0x1b522b310]>{contents = "ComponentRange"} = <CFString 0x1ae055c48 [0x1b522b310]>{contents = "FullRange"}
	12 : <CFString 0x1ae055628 [0x1b522b310]>{contents = "IOSurfaceCoreAnimationCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	13 : <CFString 0x1ae055668 [0x1b522b310]>{contents = "IOSurfaceOpenGLESFBOCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	15 : <CFString 0x1ae055968 [0x1b522b310]>{contents = "PixelFormat"} = <CFNumber 0xb000000343230662 [0x1b522b310]>{value = +875704422, type = kCFNumberSInt32Type}
	18 : <CFString 0x1ae055b88 [0x1b522b310]>{contents = "BitsPerComponent"} = <CFNumber 0xb000000000000082 [0x1b522b310]>{value = +8, type = kCFNumberSInt32Type}
	21 : <CFString 0x1ae055bc8 [0x1b522b310]>{contents = "ContainsRGB"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	22 : <CFString 0x1ae0559e8 [0x1b522b310]>{contents = "Planes"} = (
        {
        BitsPerBlock = 8;
        BlackBlock = <00>;
    },
        {
        BitsPerBlock = 16;
        BlackBlock = <8080>;
        HorizontalSubsampling = 2;
        VerticalSubsampling = 2;
    }
)
}

}
 propagatedAttachments=<CFBasicHash 0x1c0667300 [0x1b522b310]>{type = mutable dict, count = 5,
entries =>
	0 : <CFString 0x1ae054fa8 [0x1b522b310]>{contents = "CVImageBufferYCbCrMatrix"} = <CFString 0x1ae054fe8 [0x1b522b310]>{contents = "ITU_R_601_4"}
	1 : <CFString 0x1ae055108 [0x1b522b310]>{contents = "CVImageBufferTransferFunction"} = <CFString 0x1ae054fc8 [0x1b522b310]>{contents = "ITU_R_709_2"}
	2 : <CFString 0x1ae081360 [0x1b522b310]>{contents = "MetadataDictionary"} = <CFBasicHash 0x1c426ba80 [0x1b522b310]>{type = mutable dict, count = 4,
entries =>
	0 : <CFString 0x1ae0893a0 [0x1b522b310]>{contents = "ExposureTime"} = <CFNumber 0x1c402db60 [0x1b522b310]>{value = +0.01000000000000000021, type = kCFNumberFloat64Type}
	1 : <CFString 0x1ae08b340 [0x1b522b310]>{contents = "SNR"} = <CFNumber 0x1c402d9e0 [0x1b522b310]>{value = +35.27033362520548820385, type = kCFNumberFloat64Type}
	5 : <CFString 0x1ae08b360 [0x1b522b310]>{contents = "NormalizedSNR"} = <CFNumber 0x1c402d9c0 [0x1b522b310]>{value = +33.12120558044274787335, type = kCFNumberFloat64Type}
	6 : <CFString 0x1ae08b320 [0x1b522b310]>{contents = "SensorID"} = <CFNumber 0xb000000000002292 [0x1b522b310]>{value = +553, type = kCFNumberSInt32Type}
}

	5 : <CFString 0x1ae055088 [0x1b522b310]>{contents = "CVImageBufferColorPrimaries"} = <CFString 0x1ae054fc8 [0x1b522b310]>{contents = "ITU_R_709_2"}
	6 : <CFString 0x1ae055268 [0x1b522b310]>{contents = "CVImageBufferChromaLocationTopField"} = <CFString 0x1ae0552c8 [0x1b522b310]>{contents = "Center"}
}
 nonPropagatedAttachments=<CFBasicHash 0x1c0667240 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
>

And the resizedPixelBuffer:

<CVPixelBuffer 0x1c012a6e0 width=640 height=480 bytesPerRow=2560 pixelFormat=420f iosurface=0x0 attributes=<CFBasicHash 0x1c066f180 [0x1b522b310]>{type = immutable dict, count = 1,
entries =>
	0 : <CFString 0x1ae055708 [0x1b522b310]>{contents = "PixelFormatDescription"} = <CFBasicHash 0x1c426d700 [0x1b522b310]>{type = immutable dict, count = 12,
entries =>
	1 : <CFString 0x1ae055d08 [0x1b522b310]>{contents = "FillExtendedPixelsCallback"} = <CFData 0x1c42a40e0 [0x1b522b310]>{length = 24, capacity = 24, bytes = 0x000000000000000048c45687010000000000000000000000}
	5 : <CFString 0x1ae0559c8 [0x1b522b310]>{contents = "ContainsAlpha"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	6 : <CFString 0x1ae055ba8 [0x1b522b310]>{contents = "ContainsYCbCr"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	7 : <CFString 0x1ae055688 [0x1b522b310]>{contents = "OpenGLESCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	9 : <CFString 0x1ae055648 [0x1b522b310]>{contents = "IOSurfaceOpenGLESTextureCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	10 : <CFString 0x1ae055c08 [0x1b522b310]>{contents = "ComponentRange"} = <CFString 0x1ae055c48 [0x1b522b310]>{contents = "FullRange"}
	12 : <CFString 0x1ae055628 [0x1b522b310]>{contents = "IOSurfaceCoreAnimationCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	13 : <CFString 0x1ae055668 [0x1b522b310]>{contents = "IOSurfaceOpenGLESFBOCompatibility"} = <CFBoolean 0x1b522b868 [0x1b522b310]>{value = true}
	15 : <CFString 0x1ae055968 [0x1b522b310]>{contents = "PixelFormat"} = <CFNumber 0xb000000343230662 [0x1b522b310]>{value = +875704422, type = kCFNumberSInt32Type}
	18 : <CFString 0x1ae055b88 [0x1b522b310]>{contents = "BitsPerComponent"} = <CFNumber 0xb000000000000082 [0x1b522b310]>{value = +8, type = kCFNumberSInt32Type}
	21 : <CFString 0x1ae055bc8 [0x1b522b310]>{contents = "ContainsRGB"} = <CFBoolean 0x1b522b878 [0x1b522b310]>{value = false}
	22 : <CFString 0x1ae0559e8 [0x1b522b310]>{contents = "Planes"} = (
        {
        BitsPerBlock = 8;
        BlackBlock = <00>;
    },
        {
        BitsPerBlock = 16;
        BlackBlock = <8080>;
        HorizontalSubsampling = 2;
        VerticalSubsampling = 2;
    }
)
}

}
 propagatedAttachments=<CFBasicHash 0x1c066f200 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
 nonPropagatedAttachments=<CFBasicHash 0x1c066f1c0 [0x1b522b310]>{type = mutable dict, count = 0,
entries =>
}
>

Thanks
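Note that the printed buffer has pixelFormat=420f, i.e. a biplanar YCbCr buffer rather than a single-plane BGRA one, so a resize routine that assumes a single BGRA plane will produce distorted output for it. One workaround (a sketch, not the library's approach) is to let Core Image do the format conversion and the scaling in one render into a BGRA destination buffer:

import CoreImage
import CoreVideo

// Minimal sketch: convert and resize a YCbCr ('420f') pixel buffer into a new
// 32BGRA buffer using Core Image. Reuse the CIContext across frames.
func resizeYCbCrPixelBuffer(_ pixelBuffer: CVPixelBuffer,
                            width: Int, height: Int,
                            context: CIContext) -> CVPixelBuffer? {
    var output: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_32BGRA, nil, &output) == kCVReturnSuccess,
          let outputBuffer = output else { return nil }

    let image = CIImage(cvPixelBuffer: pixelBuffer)
    let scaleX = CGFloat(width) / image.extent.width
    let scaleY = CGFloat(height) / image.extent.height
    let scaled = image.transformed(by: CGAffineTransform(scaleX: scaleX, y: scaleY))
    context.render(scaled, to: outputBuffer)
    return outputBuffer
}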

Change image to multiArray in case of converted mlmodel from TFlite

Hello Matthijs,

Your article has helped me a lot, thanks! But now I'm stuck and need some clues.

I am trying to use an '.mlmodel' converted from a Google MediaPipe '.tflite'.

I'm stuck on changing the input type: I was able to change the type, but when I load the model into my Xcode project, I get a Core ML compile error.

# convert_inputType.py: convert the multiArray input to image type
import coremltools as ct
from coremltools.proto import FeatureTypes_pb2 as ft

spec = ct.utils.load_spec('model_coreml_float32.mlmodel') # multiArray type
builder = ct.models.neural_network.NeuralNetworkBuilder(spec=spec)

# check input/output features
print('--- Before change:')
builder.inspect_input_features()
builder.inspect_output_features()

# change the input so the model can accept 256x256 RGB images
input = spec.description.input[0]
input.type.imageType.colorSpace = ft.ImageFeatureType.RGB
input.type.imageType.width = 256
input.type.imageType.height = 256

# converted input/output features
print('--- After change:')
builder.inspect_input_features()
builder.inspect_output_features()

# save inputType-converted model
ct.utils.save_spec(spec, 'selfie_segmentation.mlmodel') # changed type
--- Before change:

[Id: 0] Name: input_1
          Type: multiArrayType {
  shape: 1
  shape: 256
  shape: 256
  shape: 3
  dataType: FLOAT32
}

[Id: 0] Name: activation_10
          Type: multiArrayType {
  dataType: FLOAT32
}

--- After change:

[Id: 0] Name: input_1
          Type: imageType {
  width: 256
  height: 256
  colorSpace: RGB
}

[Id: 0] Name: activation_10
          Type: multiArrayType {
  dataType: FLOAT32
}
  • 'model_coreml_float32.mlmodel': converted from the MediaPipe TFLite model at PINTO_model_zoo
  • 'selfie_segmentation.mlmodel': the changed-type mlmodel

The following error occurs when loading the changed-type mlmodel into an Xcode project.

Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:
----------------------------------------
SchemeBuildError: Failed to build the scheme "testSelfieSegmentation"

compiler error:  Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:

Compile CoreML model selfie_segmentation.mlmodel:
coremlc: error: compiler error:  Espresso exception: "Invalid blob shape": generic_elementwise_kernel: cannot broadcast:
 (1, 16, 8, 128)
 (1, 16, 2, 128)

I have checked the model configuration with Netron, and cannot find any nodes with shapes like (1,16,8,128) or (1,16,2,128).

Is there something wrong with the conversion code? Or does it have something to do with the fact that the original is a TFLite model?

Any hints on how to solve this problem?

Carthage - no shared framework schemes

Carthage update fails with this message:

*** Skipped building CoreMLHelpers due to the error:
Dependency "CoreMLHelpers" has no shared framework schemes

My Cartfile:

github "hollance/CoreMLHelpers"

I could not find any .xcworkspace file in the repo; maybe that's the problem. Anyway, I will try copying the files directly into my project!

use with macOS/NSImage

Could you provide a version or suggest edits for use with macOS/NSImage? Would this be a huge project or could this be done by slight modifications?

Accessing elements from Dim 5 multi array not working

I have a Double 1 x 1 x 80 x 1 x 75 array.

When I try to access multiarray[0], the expected output would be an array of shape 1 x 80 x 1 x 75, but I'm getting a single element, which appears to be the value of multiarray[0, 0, 0, 0, 0].

I used this in another project with a 3-dimensional array and it worked perfectly!

Thanks for creating this wonderful helpers and I really enjoyed & learned a lot by reading your book.

Not exactly an issue, but curious about concatenating MultiArrays?

I've noticed that trying to use a structure like [MultiArray<Double>] can be pretty dangerous, leading to EXC_BAD_ACCESS crashes. So, instead of doing this (which isn't ideal anyway), I'd like to concatenate MultiArrays into a larger n+1 dimensional MultiArray. That is, given a series of 4 x [3, 64, 64] MultiArrays, I'd like to create a [4, 3, 64, 64] array, but I'm not sure how to do this.

Currently, one of my [3, 64, 64] MultiArrays is created from an MLMultiArray (prediction) using multiArray = MultiArray<Double>(mlMultiArray).reshaped([3, 64, 64]), and 4 of these (predictions) are added to a [MultiArray<Double>] (but as I said, this seems pretty fragile—and as I recall, has been fragile for me in the past). I tried using something like concatenatedMultiArray[n] = multiArray, where n is an index from 0 to 3, and concatenatedMultiArray is [4, 3, 64, 64], but clearly I'm not able to index MultiArray this way. Any tips?
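One approach (a sketch under assumptions: all inputs are contiguous Double MLMultiArrays with identical shapes) is to allocate one big MLMultiArray up front and copy each prediction's raw values into its slot, instead of keeping a Swift array of MultiArray wrappers:

import CoreML

// Hypothetical helper: stack N equally-shaped MLMultiArrays into one array with an
// extra leading dimension (e.g. four [3, 64, 64] arrays into one [4, 3, 64, 64]).
func stacked(_ arrays: [MLMultiArray]) throws -> MLMultiArray {
    guard let first = arrays.first else {
        throw NSError(domain: "stacked", code: 1, userInfo: nil)
    }
    let innerShape = first.shape.map { $0.intValue }
    let innerCount = innerShape.reduce(1, *)
    let outputShape = ([arrays.count] + innerShape).map { NSNumber(value: $0) }
    let output = try MLMultiArray(shape: outputShape, dataType: .double)

    let dst = output.dataPointer.bindMemory(to: Double.self, capacity: arrays.count * innerCount)
    for (i, array) in arrays.enumerated() {
        let src = array.dataPointer.bindMemory(to: Double.self, capacity: innerCount)
        for j in 0..<innerCount {
            dst[i * innerCount + j] = src[j]
        }
    }
    return output
}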

UIImage to MLMultiArray ?

Dear Matthijs,
Can you provide a function which can convert a UIImage to an MLMultiArray of shape [W, H, C] or [C, W, H], where W = width, H = height, and C = channels?
It will be a big help.

Regards,
Pankaj
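A minimal sketch of one possible approach (a hypothetical helper, not part of CoreMLHelpers): draw the UIImage into an RGBA byte buffer and copy the R, G, B values into a Float32 MLMultiArray in channel-first [C, H, W] order, scaled to 0...1. Adjust the scaling and channel order to whatever the model expects.

import UIKit
import CoreML

extension UIImage {
    // Convert a UIImage into an MLMultiArray of shape [3, height, width].
    func multiArray(width: Int, height: Int) -> MLMultiArray? {
        guard let cgImage = self.cgImage,
              let array = try? MLMultiArray(shape: [3, height, width].map { NSNumber(value: $0) },
                                            dataType: .float32) else { return nil }

        // Draw the image into an RGBA byte buffer at the requested size.
        var pixels = [UInt8](repeating: 0, count: width * height * 4)
        let drawn = pixels.withUnsafeMutableBytes { buffer -> Bool in
            guard let context = CGContext(data: buffer.baseAddress,
                                          width: width, height: height,
                                          bitsPerComponent: 8, bytesPerRow: width * 4,
                                          space: CGColorSpaceCreateDeviceRGB(),
                                          bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
            else { return false }
            context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
            return true
        }
        guard drawn else { return nil }

        // Copy R, G, B into the multi-array in channel-first (C, H, W) order.
        let channelStride = width * height
        let dst = array.dataPointer.bindMemory(to: Float.self, capacity: 3 * channelStride)
        for y in 0..<height {
            for x in 0..<width {
                let offset = (y * width + x) * 4
                for c in 0..<3 {
                    dst[c * channelStride + y * width + x] = Float(pixels[offset + c]) / 255
                }
            }
        }
        return array
    }
}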

PixelBuffer function from iOS (UIImage) to MacOS (NSImage)

The saga of Core ML on macOS continues!

I translated this pixelBuffer function for macOS (NSImage), but I'm lost on what to use in place of UIGraphicsPopContext().

If anyone has any input, I'd highly appreciate it! :)

iOS/UIImage

   func pixelBuffer(width: Int, height: Int, pixelFormatType: OSType,
                     colorSpace: CGColorSpace, alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
        var maybePixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         width,
                                         height,
                                         pixelFormatType,
                                         attrs as CFDictionary,
                                         &maybePixelBuffer)
        
        guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
            return nil
        }
        
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
        
        guard let context = CGContext(data: pixelData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                      space: colorSpace,
                                      bitmapInfo: alphaInfo.rawValue)
            else {
                return nil
        }
        
        UIGraphicsPushContext(context)
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()
        
        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }

MacOS/NSImage

    func pixelBuffer(width: Int, height: Int, pixelFormatType: OSType,
                     colorSpace: CGColorSpace, alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
        var maybePixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         width,
                                         height,
                                         pixelFormatType,
                                         attrs as CFDictionary,
                                         &maybePixelBuffer)
        
        guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
            return nil
        }
        
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
        
        guard let context = CGContext(data: pixelData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                      space: colorSpace,
                                      bitmapInfo: alphaInfo.rawValue)
            else {
                return nil
        }
        
        context.saveGState()
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
         NSGraphicsContext()

      // to do: find an equivalent for NSImage
      //  UIGraphicsPopContext()
      

        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }
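For what it's worth, AppKit's counterpart to UIGraphicsPushContext / UIGraphicsPopContext is to save the graphics state, install an NSGraphicsContext that wraps the CGContext, draw, and then restore. A minimal sketch of just the drawing section inside the NSImage method (assuming context is the CGContext created above):

// Replace the drawing section of the NSImage version with something like this:
NSGraphicsContext.saveGraphicsState()
NSGraphicsContext.current = NSGraphicsContext(cgContext: context, flipped: false)
context.translateBy(x: 0, y: CGFloat(height))
context.scaleBy(x: 1, y: -1)
self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
NSGraphicsContext.restoreGraphicsState()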

MultiArray transposed + reshaped

Hello!
a: MultiArray
Shape: 1 x 1 x 48 x 17 x 27, Strides: 22032 x 22032 x 459 x 27 x 1 - INPUT OK
b = a.transposed([0, 1, 3, 4, 2])
Shape: 1 x 1 x 17 x 27 x 48, Strides: 22032 x 22032 x 27 x 1 x 459 - OK
c = b.reshaped([5508, 4])
Shape: 5508 x 4, Strides: 4 x 1 - NOT OK

Question about backprop

Hi Matt,

Is it possible to get the gradient from backpropagation using Core ML or Metal? I am trying to implement an adversarial attack on a Core ML ResNet-50 and don't know how to go about it.

By the way, the tutorials you post are excellent.

Q about output Multiarray shape

I ran a GAN as a Core ML model.

My model's output shape is 1 x 1 x 3 x 256 x 256.

So I tried to reshape it to 3 x 256 x 256:

       let reshapedOutputArray = MultiArray<Double>(output).reshaped([3, 256, 256])

but print(reshapedOutputArray) still shows a Float32 1 x 1 x 3 x 256 x 256 array.

So it couldn't be converted into a UIImage.

Can I get some help with this?

[Feature] MLMultiArray -> UIImage without premultiplied alpha

Hi Matthijs:
Thanks for the great tool, it's really helpful!
I'm working on an image-to-image model, and the alpha channel is part of the output.
The model outputs a [4, height, width] MLMultiArray; the 4 channels are RGBA, and the RGBA values are not premultiplied.
I used the tool to convert the output to a UIImage and it works, but because the fromByteArrayRGBA function uses premultipliedLast, the RGB values in the UIImage are already multiplied with the alpha channel.
I tried to add:

@nonobjc public class func fromByteArrayRGBA2(_ bytes: [UInt8],
                                              width: Int,
                                              height: Int) -> CGImage? {
    return fromByteArray(bytes, width: width, height: height,
                         bytesPerRow: width * 4,
                         colorSpace: CGColorSpaceCreateDeviceRGB(),
                         alphaInfo: .last)
}

But it returns nil when I call the .image method.
Can you help me with this?
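One likely reason the RGBA2 variant returns nil is that CGBitmapContext only supports premultiplied alpha for RGB data, so an alphaInfo of .last is rejected if fromByteArray draws through a bitmap context. A possible workaround (a sketch, not the library's code) is to build the CGImage directly from the bytes with a CGDataProvider, which does accept CGImageAlphaInfo.last:

import CoreGraphics
import Foundation

// Minimal sketch: create a CGImage straight from non-premultiplied RGBA bytes.
func cgImageFromRGBA(_ bytes: [UInt8], width: Int, height: Int) -> CGImage? {
    guard let provider = CGDataProvider(data: Data(bytes) as CFData) else { return nil }
    return CGImage(width: width,
                   height: height,
                   bitsPerComponent: 8,
                   bitsPerPixel: 32,
                   bytesPerRow: width * 4,
                   space: CGColorSpaceCreateDeviceRGB(),
                   bitmapInfo: CGBitmapInfo(rawValue: CGImageAlphaInfo.last.rawValue),
                   provider: provider,
                   decode: nil,
                   shouldInterpolate: false,
                   intent: .defaultIntent)
}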

CVPixelBufferRef exc_bad_access

Hello. As we know, Core ML can output a color image (CVPixelBufferRef), but when the Core ML model is used in an Objective-C project, it always crashes on the output pixel buffer with "exc_bad_access". When used in a Swift project, it works well.
I set up a project at Repo; could you help me find out how to fix this problem? I have to use Objective-C for certain reasons.

How are images being resized (what interpolation methods)?

I'm going through your README and you show different options of resizing:

One uses Vision (which does the resizing).

Or use:
if let pixelBuffer = image.pixelBuffer(width: 224, height: 224)

I'm wondering how these images are being downsized, and what interpolation methods are used. I realized while playing with Core ML in Python that different interpolation methods affect the accuracy of the model.

Thank you so much for this repository.
