1676646185
Run Stable Diffusion on Mac natively
This app uses Apple's Core ML Stable Diffusion implementation to achieve maximum performance and speed on Apple Silicon Macs while reducing memory requirements. It also runs on Intel-based Macs.
Download the latest version from the releases page.
When using a model for the very first time, it may take up to 2 minutes for the Neural Engine to compile a cached version. Afterwards, subsequent generations will be much faster.
CPU & Neural Engine provides a good balance between speed and low memory usage.
CPU & GPU may be faster on M1 Max, Ultra, and later, but will use more memory.
Depending on the option chosen, you will need to use the correct model version (see the Models section for details).
Intel Macs use CPU & GPU, as they don't have a Neural Engine.
You will need to convert or download Core ML models in order to use Mochi Diffusion.
A few models have been converted and uploaded here.
The split_einsum version is compatible with all compute unit options, including Neural Engine.
The original version is only compatible with the CPU & GPU option.
Converted models should be placed in the following folder structure:
Documents/
└── MochiDiffusion/
└── models/
├── stable-diffusion-2-1_split-einsum_compiled/
│ ├── merges.txt
│ ├── TextEncoder.mlmodelc
│ ├── Unet.mlmodelc
│ ├── VAEDecoder.mlmodelc
│ └── vocab.json
├── ...
└── ...
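As a quick sanity check of that layout, here is a minimal Python sketch (assuming the models live under the user's Documents folder as shown; the exact file set can vary by model, so treat it as a rough check rather than Mochi Diffusion's own validation logic):

from pathlib import Path

# Expected location of converted models, matching the layout shown above
models_dir = Path.home() / "Documents" / "MochiDiffusion" / "models"
expected = {"merges.txt", "TextEncoder.mlmodelc", "Unet.mlmodelc",
            "VAEDecoder.mlmodelc", "vocab.json"}

for model in sorted(p for p in models_dir.iterdir() if p.is_dir()):
    present = {f.name for f in model.iterdir()}
    missing = expected - present
    status = "ok" if not missing else "missing: " + ", ".join(sorted(missing))
    print(f"{model.name}: {status}")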
All generation happens locally and absolutely nothing is sent to the cloud.
Mochi Diffusion is always looking for contributions, whether it's through bug reports, code, or new translations.
If you find a bug, or would like to suggest a new feature or enhancement, try searching for your problem first, as it helps avoid duplicates. If you can't find your issue, feel free to create a new one. Please don't create issues for questions; issues are for bug reports and feature requests only.
If you're looking to contribute code, feel free to open a Pull Request. I recommend installing SwiftLint to catch lint issues.
If you'd like to translate Mochi Diffusion to your language, please visit the project page on Crowdin. You can create an account for free and start translating and/or approving.
Author: Godly-devotion
Source Code: https://github.com/godly-devotion/MochiDiffusion
License: GPL-3.0 license
1676642160
This is the MobileNet neural network architecture from the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications implemented using Apple's shiny new CoreML framework.
This uses the pretrained weights from shicai/MobileNet-Caffe.
There are two demo apps included:
Cat Demo. Shows the prediction for a cat picture. Open the project in Xcode 9 and run it on a device with iOS 11 or on the simulator.
Camera Demo. Runs from a live video feed and performs a prediction as often as it can manage. (You'll need to run this app on a device, it won't work in the simulator.)
Note: Also check out Forge, my neural net library for iOS 10 that comes with a version of MobileNet implemented in Metal.
The repo already includes a fully-baked MobileNet.mlmodel, so you don't have to follow the steps in this section. However, in case you're curious, here's how I converted the original Caffe model into this .mlmodel file:
Note: You don't have to download mobilenet_deploy.prototxt. There's already one included in this repo. (I added a Softmax layer at the end, which is missing from the original.)
$ virtualenv -p /usr/bin/python2.7 env
$ source env/bin/activate
$ pip install tensorflow
$ pip install keras==1.2.2
$ pip install coremltools
It's important that you set up the virtual environment using /usr/bin/python2.7. If you use another version of Python, the conversion script will crash with Fatal Python error: PyThreadState_Get: no current thread. You also need to use Keras 1.2.2 and not the newer 2.0.
$ python coreml.py
This creates the MobileNet.mlmodel file.
$ deactivate
Done!
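For the curious, here is a rough sketch of what a conversion script like coreml.py might contain, using the old coremltools Caffe converter from the Python 2.7 environment set up above. The preprocessing values and the labels file name are assumptions (they follow the commonly cited MobileNet-Caffe preprocessing), so the repo's actual script is authoritative:

import coremltools

# Convert the Caffe weights plus the prototxt (with the added Softmax layer) to Core ML.
# Scale/bias approximate (pixel - mean) * 0.017 with the usual MobileNet-Caffe BGR means.
coreml_model = coremltools.converters.caffe.convert(
    ("mobilenet.caffemodel", "mobilenet_deploy.prototxt"),
    image_input_names="data",
    class_labels="synset_words.txt",   # assumed ImageNet labels file
    is_bgr=True,
    image_scale=0.017,
    red_bias=-123.68 * 0.017,
    green_bias=-116.78 * 0.017,
    blue_bias=-103.94 * 0.017,
)
coreml_model.save("MobileNet.mlmodel")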
Author: Hollance
Source Code: https://github.com/hollance/MobileNet-CoreML
1673767260
Use coremltools to convert machine learning models from third-party libraries to the Core ML format. This Python package contains the supporting tools for converting models from training libraries such as the following:
With coremltools, you can do the following:
After conversion, you can integrate the Core ML models with your app using Xcode.
The coremltools 6 package offers new features to optimize the model conversion process. For details, see New in coremltools.
For a full list of changes, see Release Notes.
To install coremltools 6.0, use the following command:
pip install coremltools
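As an example of what a conversion looks like, here is a minimal sketch that converts a traced PyTorch model (a torchvision MobileNetV2 is used as a stand-in; see the coremltools documentation for the full set of options):

import torch
import torchvision
import coremltools as ct

# Trace the model so the Core ML converter can consume it
model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
example_input = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example_input)

# Convert to an ML Program (.mlpackage), the format recommended by recent coremltools releases
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input", shape=example_input.shape)],
    convert_to="mlprogram",
)
mlmodel.save("MobileNetV2.mlpackage")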
The coremltools 5 package offered several performance improvements over previous versions, as well as new features. For details, see New in coremltools.
Core ML is an Apple framework to integrate machine learning models into your app. Core ML provides a unified representation for all models. Your app uses Core ML APIs and user data to make predictions, and to fine-tune models, all on the user’s device. Core ML optimizes on-device performance by leveraging the CPU, GPU, and Neural Engine while minimizing its memory footprint and power consumption. Running a model strictly on the user’s device removes any need for a network connection, which helps keep the user’s data private and your app responsive.
To install coremltools, see the “Installation” page. For more information, see the following:
Author: Apple
Source Code: https://github.com/apple/coremltools
License: BSD-3-Clause license
1668105900
This is a collection of types and functions that make it a little easier to work with Core ML in Swift.
Some of the things CoreMLHelpers has to offer:
- converting images to CVPixelBuffer objects and back
- MLMultiArray to image conversion
Experimental features:
Let me know if there's anything else you'd like to see added to this library!
If Core ML is giving you trouble --- or if you want to learn more about using the Core ML and Vision APIs --- then check out my book Core ML Survival Guide. It has 400+ pages of Core ML tips and tricks.
I wrote the Core ML Survival Guide because the same questions kept coming up on Stack Overflow, on the Apple Developer Forums, and on this GitHub repo. Core ML may appear easy-to-use at first --- but if you want to go beyond the basics, the learning curve suddenly becomes very steep. My goal with this book is to make the advanced features of Core ML accessible to everyone too.
The Core ML Survival Guide currently has over 80 chapters and includes pretty much everything I know about Core ML. As I learn new things I'll keep updating the book, so you'll always have access to the most up-to-date knowledge about Core ML. Cheers!
Copy the source files from the CoreMLHelpers folder into your project. You probably don't need all of them, so just pick the files you require and ignore the rest.
Note: A lot of the code in CoreMLHelpers is only intended as a demonstration of how to approach a certain problem. There's often more than one way to do it. It's quite likely you will need to customize the code for your particular situation, so consider these routines a starting point.
I believe a proper framework should have a well-thought-out API but CoreMLHelpers is a hodgepodge of helper functions that isn't particularly well-organized. Putting this into a package makes things more complicated than necessary. Just copy the one or two source files you need into your project, and adapt them to your needs.
MultiArray (and fix the bugs!)
Author: Hollance
Source Code: https://github.com/hollance/CoreMLHelpers
License: MIT license
1661872620
Since iOS 11, Apple has provided the Core ML framework to help developers integrate machine learning models into their applications (see the official documentation).
We've put up the largest collection of machine learning models in Core ML format, to help iOS, macOS, tvOS, and watchOS developers experiment with machine learning techniques.
If you've converted a Core ML model, feel free to submit a pull request.
Recently, we've included visualization tools, such as Netron.
Models
Models that take image data as input and output useful information about the image.
Models that transform images.
Models that process text data.
Visualization Tools
Tools that help visualize Core ML models.
Supported formats
A list of model formats that can be converted to Core ML, with examples.
The Gold
Collections of machine learning models that can be converted to Core ML.
Individual machine learning models that can be converted to Core ML. We'll keep adjusting the list as they are converted.
Contributing and License
Author: likedan
Source code: https://github.com/likedan/Awesome-CoreML-Models
License: MIT license
1652757015
This project demonstrates object segmentation on iOS with Core ML.
If you are interested in iOS + machine learning, visit here to see various demos.
Demo GIFs (available in the repository): DeepLabV3-DEMO1, FaceParsing-DEMO, DeepLabV3-DEMO-2, DeepLabV3-DEMO-3
When using Metal.
Download the models from Apple's model page.
Name | Input | Output | Size | iOS version+ | Download |
---|---|---|---|---|---|
DeepLabV3 | Image (Color 513 × 513) | MultiArray (Int32 513 × 513) | 8.6 MB | iOS 12.0+ | link |
DeepLabV3FP16 | Image (Color 513 × 513) | MultiArray (Int32 513 × 513) | 4.3 MB | iOS 12.0+ | link |
DeepLabV3Int8LUT | Image (Color 513 × 513) | MultiArray (Int32 513 × 513) | 2.3 MB | iOS 12.0+ | link |
FaceParsing | Image (Color 512 × 512) | MultiArray (Int32 512 × 512) | 52.7 MB | iOS 14.0+ | link |
Device | Inference Time | Total Time (GPU) | Total Time (CPU) |
---|---|---|---|
iPhone 12 Pro | 29 ms | 29 ms | 240 ms |
iPhone 12 Pro Max | ⏲ | ⏲ | ⏲ |
iPhone 12 | 30 ms | 31 ms | 253 ms |
iPhone 12 Mini | 29 ms | 30 ms | 226 ms |
iPhone 11 Pro | 39 ms | 40 ms | 290 ms |
iPhone 11 Pro Max | 35 ms | 36 ms | 280 ms |
iPhone 11 | ⏲ | ⏲ | ⏲ |
iPhone SE (2nd) | ⏲ | ⏲ | ⏲ |
iPhone XS Max | ⏲ | ⏲ | ⏲ |
iPhone XS | 54 ms | 55 ms | 327 ms |
iPhone XR | 133 ms | ⏲ | 402 ms |
iPhone X | 137 ms | 143 ms | 376 ms |
iPhone 8+ | 140 ms | 146 ms | 420 ms |
iPhone 8 | 189 ms | ⏲ | 529 ms |
iPhone 7+ | 240 ms | ⏲ | 667 ms |
iPhone 7 | 192 ms | 208 ms | 528 ms |
iPhone 6S + | 309 ms | ⏲ | 1015 ms |
(⏲ = still need to measure)
Device | Inference Time | Total Time (GPU) | Total Time (CPU) |
---|---|---|---|
iPhone 12 Pro | ⏲ | ⏲ | ⏲ |
iPhone 11 Pro | 37 ms | 37 ms | ⏲ |
# DeepLabV3 labels (total 21)
["background", "aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair",
"cow", "diningtable", "dog", "horse", "motorbike",
"person", "pottedplant", "sheep", "sofa", "train",
"tv"]
# FaceParsing labels (total 19)
["background", "skin", "l_brow", "r_brow", "l_eye",
"r_eye", "eye_g", "l_ear", "r_ear", "ear_r",
"nose", "mouth", "u_lip", "l_lip", "neck",
"neck_l", "cloth", "hair", "hat"]
Author: tucan9389
Official Website: https://github.com/tucan9389/SemanticSegmentation-CoreML
License: MIT license
1633849200
The video contains a tutorial on how to use the Turi Create tool to train a Core ML object detection model using the one-shot learning technique (just one sample image). The created model is exported to Core ML format and used in an iOS app that detects the object in provided photos.
Turi Create: https://github.com/apple/turicreate
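A minimal sketch of that workflow with Turi Create might look like the following (the image file and label are placeholders; one_shot_object_detector synthesizes its own training scenes from the single sample):

import turicreate as tc

# One starter image of the object, labeled with the class name we want to detect
starter = tc.SFrame({
    "image": [tc.Image("logo.png")],   # placeholder sample image
    "label": ["logo"],
})

# One-shot learning: Turi Create generates synthetic training data from the single sample
model = tc.one_shot_object_detector.create(starter, target="label")

# Export for use in an iOS app via Core ML
model.export_coreml("LogoDetector.mlmodel")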
1624692770
It is well known among deep-learning enthusiasts that bilinear upsampling layers in TensorFlow have pixel-offset issues. This has been partly fixed by adding an 'align_corners' attribute to them in TensorFlow 2.x, but the problem still causes inconsistent computation when a model trained in TensorFlow is exported to another DL framework across various versions.
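To see the offset issue concretely, here is a small sketch (assuming TensorFlow 2.x, which still exposes the legacy resize op through compat.v1) that upsamples a tiny ramp with the different alignment modes and prints how they disagree:

import numpy as np
import tensorflow as tf

# A tiny 1 x 2 x 2 x 1 "image" containing a simple ramp
x = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)

# Legacy TF1-style bilinear resize, with and without corner alignment
aligned = tf.compat.v1.image.resize_bilinear(x, size=(4, 4), align_corners=True)
not_aligned = tf.compat.v1.image.resize_bilinear(x, size=(4, 4), align_corners=False)

# TF2's default resize uses half-pixel centers, which is different again
tf2_default = tf.image.resize(x, size=(4, 4), method="bilinear")

print(aligned.numpy()[0, :, :, 0])
print(not_aligned.numpy()[0, :, :, 0])
print(tf2_default.numpy()[0, :, :, 0])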
In my case, a neural network model with bilinear upsampling layers showed weird behavior when I converted the trained model from TensorFlow 2.5 to Apple Core ML using coremltools 3.4. After countless rounds of coding, trials, and deletions, I nearly gave up on getting consistent upsampling results between TensorFlow and Core ML.
I wanted to use Keras in the latest TensorFlow 2.5 for training on a Windows PC, and the older coremltools 3.4 for converting the trained model to Core ML on my macOS laptop. This is because version 2.5 has stable Automatic Mixed Precision computing, and because I could not use coremltools 4.x with TF 2.5 due to dependency errors in anaconda and pip on macOS.
Some respectful programmers provide good explanations of this (troublesome) specification defined in TensorFlow. They are helpful for me, and maybe for you too:
#tensorflow #coreml #deep-learning #keras #consistency #consistency of bilinear upsampling layer
1622134020
Core ML is an Apple framework that allows developers to integrate machine learning/deep learning models into their applications. However, it does not support model creation and training, i.e., you first need to create the model in a framework like TensorFlow or PyTorch, then you can convert and use it. There are two ways you can convert your machine learning model from the framework of your choice to the Core ML model format: through an intermediary model format like ONNX or by using Apple’s own CoreMLTools Python library.
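As an illustration of the ONNX route (which, as noted below, still ends in a CoreMLTools call), a conversion might look roughly like this. This is a sketch assuming a coremltools release that still bundled the ONNX converter, such as the 4.x series, and a hypothetical model.onnx file:

import coremltools as ct

# The ONNX converter shipped with older coremltools releases;
# newer versions recommend converting directly from PyTorch or TensorFlow instead.
mlmodel = ct.converters.onnx.convert(
    model="model.onnx",                  # hypothetical exported ONNX file
    minimum_ios_deployment_target="13",
)
mlmodel.save("ConvertedModel.mlmodel")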
Although ONNX works just fine for the conversion, CoreMLTools offers other useful functionalities like model optimization. Also, you’ll need to use CoreMLTools for the final conversion from ONNX format to Core ML format anyway. Currently, it supports the conversion of models created using the following libraries:
#developers corner #apple ai #apple machine learning #core ml3 #coreml #coremltools #model conversion #python libraries
1598166169
I was inspired by this example of Core ML + ARKit. But I found one significant disadvantage — it doesn’t place annotations on objects automatically. Instead, you need to center the object in your camera view and use your finger to place an annotation.
In my opinion, this destroys user expectations, so I decided to fix that and build a more immersive user experience using object detection in augmented reality.
To follow this tutorial, you should be aware of the basics of iOS development using Swift and be familiar (at least somewhat) with Core ML and ARKit. Also, you need an iPhone or iPad with iOS 13+ or iPadOS 13+, respectively.
Our app has two main entities. The first one is the object detection service (shown below): it takes an image as input and returns a bounding box and class label for the recognized object. The second is the ViewController, the place where all the AR magic happens:
Below are the steps as identified in the inline comments in the code block above:
- The detect method instantiates the handler to perform Vision requests on a single image. It uses a Core Video pixel buffer because one can easily be taken from the current ARFrame. Since the pixel buffer doesn't store information about the current image orientation, we take the current device orientation and map it into an Exif orientation format.
#mobile-machine-learning #coreml #heartbeat #augmented-reality #ios-app-development
1595087700
With the recent wave of operating system versions (Big Sur, iOS 14, etc.) announced at WWDC, Apple quietly introduced a new ML framework to accelerate the training of neural networks across the CPU or one or more available GPUs.
ML Compute is not really a new ML framework but a new API that leverages the high-performance BNNS primitives made available by the Accelerate framework for the CPU, and Metal Performance Shaders for the GPU.
After looking at the documentation and starting to use it in an iOS/macOS application, I understood that this is not really a simple, high-level framework, but something probably targeting the acceleration of existing third-party ML libraries like the ONNX Runtime or TensorFlow Lite on Apple platforms.
Even if Apple's documentation is pretty good, I would say these APIs are not really developer-friendly or Swift-like for doing generic ML on iOS/macOS. The tensor API, for example, is really rough and requires dealing with unmanaged pointers in Swift. Basically, you are responsible for managing the ownership and lifetime of memory allocations for objects such as tensors, nodes, and graphs that you pass to these APIs.
More generally, ML Compute by design does not provide high-level ML APIs like Keras, PyTorch, or Swift for TensorFlow to simplify building and training ML models, but rather low-level APIs to build compute graphs and manage the training loop.
For general ML coding on iOS/macOS, I would suggest continuing to use Core ML with tools like CoreMLTools to import models from other frameworks (TensorFlow, PyTorch, etc.), or eventually giving the SwiftCoreMLTools library I developed a try if you want to build and/or train models entirely on device, avoiding any Python code.
Anyway, my personal opinion after playing with it is that ML Compute could become really powerful even for regular Swift ML developers if, for example, a Swift function builder (DSL) high-level API, like the one I developed for SwiftCoreMLTools, and a high-level Swift tensor API, hopefully integrated with Swift Numerics' multi-dimensional arrays, were added on top of it.
To quickly test the capabilities of these APIs, I decided to develop a PoC app that trains and runs inference with ML Compute on both iOS and macOS, using a simple shallow model for the MNIST dataset.
#swift #mlcompute #machine-learning #coreml #gpu #cpu
1595005800
In a previous project, I worked on replicating fast neural style transfer, transforming an image by taking the artistic styling from one image and applying it to another through deep neural networks. While transforming images in a Python notebook works well, it is not very accessible to the average user. I wanted to deploy the model on an iOS device, similar to the Prisma app made popular a few years ago. More than that, I wanted to test the limits of a generative model and transform frames on a live video feed. The goal of this project was to run a generative model on real-time video, exploring what is possible given the current boundaries of the technology. There are a few things that made this possible: 1) scaling inputs, 2) utilizing the device's GPU, and 3) simplifying the model. Since this builds upon the previous project, some familiarity with the previous post will be helpful.
Many phones today can take stunning 4K videos, including the iPhone XS that I developed on. While the A12 chip in the device is powerful, it would be far too slow to run a deep neural network on every frame of that size. Usually, video frames are downscaled for image recognition on devices, and the model is run on a subset of frames. For instance, an object recognition app may run a model every second on a 224 x 224 frame, instead of 30 times per second on a 4096 x 2160 frame. That works in an object detection use case, as objects don't change that much between frames.
This obviously won’t work for stylizing video frames. Having only a single stylized frame flicker every second would not be appealing to a user. However, there are some takeaways from this. First, it is completely reasonable to downscale the frame size. It is common for video to be streamed at 360p and scaled up to the device’s 1080p screen. Second, perhaps running a model on 30 frames per second is not necessary and a slower frame rate would be sufficient.
There is a trade-off between model resolution and frame rate, as there are a limited number of computations the GPU can make in a second. You may see some video chat platforms have a slower frame rate or more buffering when using convolutions for video effects (i.e. changing the background). To get a sense of what different frame rates and input shapes looked like, I created a few stylized videos on a computer with the original neural network and OpenCV. I settled on a goal frame rate of 15 fps with 480 x 853 inputs. I found these to still be visually appealing as well as recognizable numbers for benchmark testing.
I used tfcoreml and coremltools to transform the TensorFlow model to a Core ML model. A gist of the complete method can be found below. There were a couple of considerations with this. First, I moved to batch normalization instead of instance normalization. This was because Core ML does not have an instance normalization layer out of the box, and it simplified the implementation since only one frame would be in each batch at inference time. A custom method could also be used in tfcoreml to convert the instance normalization layer.
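For reference, a minimal sketch of what such a tfcoreml call might look like (the frozen-graph file and tensor names below are placeholders; the post's actual gist also handles the normalization changes described above):

import tfcoreml

# Convert a frozen TensorFlow graph to Core ML
mlmodel = tfcoreml.convert(
    tf_model_path="style_transfer.pb",                      # placeholder frozen graph
    mlmodel_path="StyleTransfer.mlmodel",
    input_name_shape_dict={"input:0": [1, 480, 853, 3]},    # 480 x 853 frames as in the post
    image_input_names=["input:0"],                          # treat the input as an image
    output_feature_names=["output:0"],                      # placeholder output tensor name
)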
#machine-learning #style-transfer #coreml #iphone #deep-learning #deep learning
1593160380
Our sincere thanks for your continued support, readership, and contributions, which have made the last two years truly something special to be a part of.
We’ve grown a lot and undergone a number of exciting changes during that time, but one thing has remained consistent—the incredible blog posts that our contributing writers continue to share with us. We’ve seen so much growth in both those contributors and in the technologies we’re fascinated by: Machine learning, mobile development, and the intersection of the two.
We want to provide you with a weekly look at some of the incredible new work we're sharing, some classics that we think are worth revisiting, and some fun surprises along the way. So without further ado: your first issue of The Heartbeat Newsletter.
Happy Reading,
Austin & the Heartbeat Team
#machine-learning #coreml #mobile-machine-learning #snapchat #mobile-app-development
1590913293
#coreml