API Diffsから見るiOS 13の新機能 - その他諸々 #WWDC19

Vision編、CoreML公式配布モデル編、Core Image編と、巨大な新API群の端っこの方から記事を書き始めたわけですが、その後どのフレームワークについて書こうとしても調査内容が膨大になってしまってうまくまとまらず、もう眠くなってしまったのでざっくり箇条書きで書いておいて明日以降にまたじっくり調査することにします。

f:id:shu223:20190603103013j:plain

SoundAnalysis

developer.apple.com

Analyze streamed and file-based audio to classify it as a particular type.

うおお、SoundAnalysisめっちゃアツい！画像・動画まわりは機械学習ベースでどんどん進化してるのに音声処理は重要なのにあまり変わらないよなーと最近話してたところ。OCRといい「あったらいいな」がほとんど来てる #iOS13 #WWDC19
— Shuichi Tsutsumi (@shu223) 2019年6月3日

機械学習ベースで音を分類したりするらしい。何の音を分類してくれるのかAPIリファレンスからはわからなかったのだけど、次のドキュメントで、

Analyzing Audio to Classify Sounds | Apple Developer Documentation

The SoundAnalysis framework operates on a model that you’ve trained using a Create ML MLSoundClassifier

と書いてあって、Create MLでつくったCore MLモデルを使って音を判別するようだ。

ってことで、レイヤーとしてはVisionの音声処理版（Visionは画像処理版）と解釈しとくとよさそう。

セマンティック・セグメンテーション / AVSemanticSegmentationMatte

前からしてPortrait Matteの汎用版。Portrait Matteは人間の全身専用マスクだが、もうちょっと汎用的に領域分割するマスクが取れるようになると。

関連メソッドは前回記事にも書いた：

API Diffsから見るiOS 13の新機能 - Core Image #WWDC19 - その後のその後

CoreML公式配布モデルの「DeeplabV3」がMatte抽出にあたって内部で使われているのか、別の仕組みなのかは気になる。

API Diffsから見るiOS 13の新機能 - CoreML公式配布モデル #WWDC19 - その後のその後

AVSemanticSegmentationMatte.MatteTypeという構造体があり、今のところ次のような種類があるようだ。

static let hair: AVSemanticSegmentationMatte.MatteType

A matting image that segments the hair from all people in the visible field of view of an image.

static let skin: AVSemanticSegmentationMatte.MatteType

A matting image that segments the skin from all people in the visible field of view of an image.

static let teeth: AVSemanticSegmentationMatte.MatteType

A matting image that segments the teeth from all people in the visible field of view of an image.

このへんの髪・肌・歯のセグメンテーションについては前回記事にも書いた。

AVCapturePhotoOutputに関連APIがいくつか追加されている。

developer.apple.com

var availableSemanticSegmentationMatteTypes: [AVSemanticSegmentationMatte.MatteType]

An array of semantic segmentation matte types that may be captured and delivered along with the primary photo.

var enabledSemanticSegmentationMatteTypes: [AVSemanticSegmentationMatte.MatteType]

The semantic segmentation matte types that the photo render pipeline delivers.

AVCapturePhotoにも。

func semanticSegmentationMatte(for: AVSemanticSegmentationMatte.MatteType) -> AVSemanticSegmentationMatte?

Retrieves the semantic segmentation matte associated with this photo.

AVCaptureMultiCamSession

developer.apple.com

A subclass of AVCaptureSession that supports simultaneous capture from multiple inputs of the same media type.

"AVMultiCamPiP: Capturing from Multiple Cameras"というサンプルがあるので、あとで実行してみる。

Core Haptics

developer.apple.com

Compose and play haptic patterns to customize your iOS app's haptic feedback.

サンプルもある。

Playing Collision-Based Haptic Patterns | Apple Developer Documentation

このドキュメントも興味深い。"Apple Haptic and Audio Pattern (AHAP)"なるファイルフォーマットがあるらしい。

Representing Haptic Patterns in AHAP Files | Apple Developer Documentation

Audio Effects

そんなに新機能を期待してなかったAudioToolboxのリファレンスを見ていたら、iOS 13+なサンプルが2つもあった。

どちらも"Audio Effects"関連。どのへんがiOS 13なのかまだちゃんと読んでないが、あとで見てみる。

AVSpeechSynthesisVoiceGender

以前からある音声合成機能の、合成音声に男女の区別がついた。

case female
case male
case unspecified

これに伴い、AVSpeechSynthesisVoiceにはgenderプロパティが追加された。

var gender: AVSpeechSynthesisVoiceGender { get }

AVSpeechSynthesizer

なんかちょこちょこと新APIがある。

var mixToTelephonyUplink: Bool
var synthesizerAudioSession: AVAudioSession

func write(AVSpeechUtterance, toBufferCallback: AVSpeechSynthesizer.BufferCallback)

MKPointOfInterestCategory

MKMapViewに次のようなプロパティが追加されていて、

var pointOfInterestFilter: MKPointOfInterestFilter?

地図上の"Point of Interest"をフィルタできる。めちゃくちゃ多くの種類が定義されているので、ここでは触りだけ。

static let airport: MKPointOfInterestCategory

The point of interest category for airports.

static let amusementPark: MKPointOfInterestCategory

The point of interest category for amusement parks.

MKMultiPolygon

MapKitの新クラス。WWDCキーノートを聞きながら

このストリートビューみたいなやつMapKitにAPI追加されてSceneKitやARKitと連携できるようになってたらアツいな
— Shuichi Tsutsumi (@shu223) June 3, 2019

こんな妄想をしてたのもあって、「ポリゴン」という字面からSceneKitと連携するなにかかと一瞬期待したが、よく考えたらMKPolygonというクラスは昔からあって、3Dメッシュ表現手法としてのポリゴンではなくて普通に本来の意味での「多角形」だった。（で、どういうものなのかはまだわかってない）

Create ML

いずれ別記事で書きたいので省略するが、つくれるモデルの種類が増えている。

Create MLって何だっけ？という方はこちらをどうぞ：

shu223.hatenablog.com

VisionKit

Visionとはまた別の新フレームワーク。ドキュメントスキャナ的なものをつくる機能を提供してくれている？

developer.apple.com

VisionKit is a small framework that lets your app use the system's document scanner. Present the document camera as a view controller, which covers the entire screen like the camera function in Notes. Implement the VNDocumentCameraViewControllerDelegate in your own view controller to receive callbacks from the document camera, such as completed scans.

watchOSのIndependent App

これめっちゃいいじゃないですか。今まで2つのApp Extensionで構成されてたのがどうなるのか気になる。プロジェクト生成してみる。

どういうプロジェクト構成になるんだろう？？前はapp extension２つに分かれてたけど。
— Shuichi Tsutsumi (@shu223) June 3, 2019

On-device speech recognition

オンデバイスで音声認識。SFSpeechRecognitionRequestに、以下のプロパティが追加されている。

var requiresOnDeviceRecognition: Bool { get set }

その他もちろん気になるフレームワーク群

（SwiftUIとかCombineとかはもちろんキャッチアップするとして）

ARKit 3
RealityKit
Reality
Core ML 3
Metal
MetalKit
Metal Performance Shaders
Natural Language

etc...