Media Technologies

How to use the SpeechDetector Module

I am trying to use SpeechDetector Module in Speech framework along with SpeechTranscriber. and it is giving me an error Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule') Below is how I am using it let speechDetector = Speech.SpeechDetector() let transcriber = SpeechTranscriber(locale: Locale.current, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange]) speechAnalyzer = try SpeechAnalyzer(modules: [transcriber,speechDetector])

Media Technologies Audio Speech

4

2

453

Aug ’25

Raycasting VNFaceLandmarkRegion2D

Hello, Does anyone have a recipe on how to raycast VNFaceLandmarkRegion2D points obtained from a frame's capturedImage? More specifically, how to construct the "from" parameter of the frame's raycastQuery from a VNFaceLandmarkRegion2D point? Do the points need to be flipped vertically? Is there any other transformation that needs to be performed on the points prior to passing them to raycastQuery?

Media Technologies Photos & Camera ARKit Vision

4

0

320

Sep ’25

AVContentKeySession key renewal on Airplay

Our streaming app uses FairPlay-protected video streams, which previously worked fine when using AVAssetResourceLoaderDelegate to provide CKCs. Recently, we migrated to AVContentKeySession, and while everything works as expected during regular playback, we encountered an issue with AirPlay. Our CKC has a 120-second expiry, so we renew it by calling renewExpiringResponseData.. This trigger the didProvideRenewingContentKeyRequest delegate and we respond with updated CKC. However, when streaming via AirPlay, both video and audio freeze exactly after 120 seconds. To validate the issue, I tested with AVAssetResourceLoaderDelegate and found that I can reproduce the same freeze if I do not renew the key. This suggests that AirPlay is not accepting the renewed CKC when using AVContentKeySession. Additional Details: This issue occurs across different iOS versions and various AirPlay devices. The same content plays without issues when played directly on the device. The renewal process is successful, and segments continue to load, but playback remains frozen. Tried renewing the CKC bit early (100s). I also tried setting player.usesExternalPlaybackWhileExternalScreenIsActive = true, but the issue persists. We don't use persistentKey. Is there anything else that needs to be considered for proper key renewal when AirPlaying? Any help on how to fix this or confirmation if this is a known issue would be greatly appreciated.

Media Technologies Video FairPlay Streaming iOS AirPlay AVFoundation

4

2

673

Mar ’25

Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)

Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }

Media Technologies Audio iOS AVAudioSession AVAudioEngine Push To Talk

4

0

472

Aug ’25

How to reduce CMSampleBuffer volume

Hello, Basically, I am reading and writing an asset. To simplify, I am just reading the asset and rewriting it into an output video without any modifications. However, I want to add a fade-out effect to the last three seconds of the output video. I don’t know how to do this. So far, before adding the CMSampleBuffer to the output video, I tried reducing its volume using an extension on CMSampleBuffer. In the extension, I passed 0.4 for testing, aiming to reduce the video's overall volume by 60%. My question is: How can I directly adjust the volume of a CMSampleBuffer? Here is the extension: extension CMSampleBuffer { func adjustVolume(by factor: Float) -> CMSampleBuffer? { guard let blockBuffer = CMSampleBufferGetDataBuffer(self) else { return nil } var length = 0 var dataPointer: UnsafeMutablePointer<Int8>? guard CMBlockBufferGetDataPointer(blockBuffer, atOffset: 0, lengthAtOffsetOut: nil, totalLengthOut: &length, dataPointerOut: &dataPointer) == kCMBlockBufferNoErr else { return nil } guard let dataPointer = dataPointer else { return nil } let sampleCount = length / MemoryLayout<Int16>.size dataPointer.withMemoryRebound(to: Int16.self, capacity: sampleCount) { pointer in for i in 0..<sampleCount { let sample = Float(pointer[i]) pointer[i] = Int16(sample * factor) } } return self } }

Media Technologies General Swift AVFoundation

4

0

447

May ’25

On iOS 18, Mandarin is read aloud as Cantonese

Please include the line below in follow-up emails for this request. Case-ID: 11089799 When using AVSpeechUtterance and setting it to play in Mandarin, if Siri is set to Cantonese on iOS 18, it will be played in Cantonese. There is no such issue on iOS 17 and 16. 1.let utterance = AVSpeechUtterance(string: textView.text) let voice = AVSpeechSynthesisVoice(language: "zh-CN") utterance.voice = voice 2.In the phone settings, Siri is set to Cantonese

Media Technologies Audio iOS Siri and Voice AVFoundation

4

1

703

2d

PHImageManager.requestImageDataAndOrientation callback is never called

I occasionally receive reports from users that photo import from the Photos library gets stuck and the progress appears to stop indefinitely. I’m using the following APIs: func fetchAsset(_ asset: PHAsset) { let options = PHImageRequestOptions() options.deliveryMode = .highQualityFormat options.resizeMode = .exact options.isSynchronous = false options.isNetworkAccessAllowed = true options.progressHandler = { (progress, error, stop, info) in // 🚨 never called } let requestId = PHImageManager.default().requestImageDataAndOrientation( for: asset, options: options ) { data, _, _, info in // 🚨 never called } } Due to repeated reports, I added detailed logs inside the callback closures. Based on the logs, it looks like the request keeps waiting without any callbacks being invoked — neither the progressHandler nor the completion block of requestImageDataAndOrientation is called. This happens not only with the PHImageManager approach, but also when using PHAsset with PHContentEditingInputRequestOptions — the completion callback is not invoked as well. func fetchAssetByContentEditingInput(_ asset: PHAsset) { let options = PHContentEditingInputRequestOptions() options.isNetworkAccessAllowed = true asset.requestContentEditingInput(with: nil) { contentEditingInput, info in // 🚨 never called } } I suspect this is related to iCloud Photos. Here is what I confirmed from affected users: Using the native picker (My app also provides the native picker as an alternative option for attaching photos), iCloud download proceeds normally and the photo can be attached. However, using the PHImageManager-based approach in my app, the same photo cannot be attached. Even after verifying that the photo has been fully downloaded from iCloud (e.g., by trying “Export Unmodified Originals” in the Photos app as described here: https://support.apple.com/en-us/111762, and confirming the iCloud download progress completed), the callback is still not invoked for that asset. Detailed flow for (1): I asked the user to attach the problematic photo (the one where callbacks never fire) using the native photo picker (UIImagePickerController). The UI showed “Downloading from iCloud” progress. The progress advanced and the photo was attached successfully. Then I asked the user to attach the same photo again using my custom photo picker (which uses the PHImageManager APIs mentioned above). The progress did not advance (No callbacks were invoked). The operation waited indefinitely and never completed. Workaround / current behavior: If I ask users to reboot the device and try again, about 6 out of 10 users can attach successfully afterward. The remaining ~4 out of 10 users still cannot attach even after rebooting. For users who are not fixed immediately after reboot, it seems to resolve naturally after some time. I’ve seen similar reports elsewhere, so I’m wondering if Apple is already aware of an internal issue related to this. If there is any known information, guidance, or recommended workaround, I would appreciate it. I also logged the properties of affected PHAssets (metadata) when the issue occurs, and I can share them below if that helps troubleshooting: [size=3.91MB] [PHAssetMediaSubtype(rawValue: 528)+DepthEffect | userLibrary | (4284x5712) | adjusted=true] [size=3.91MB] [PHAssetMediaSubtype(rawValue: 528)+DepthEffect | userLibrary | (4284x5712) | adjusted=true] [size=2.72MB] [PHAssetMediaSubtype(rawValue: 16)+DepthEffect | userLibrary | (3024x4032) | adjusted=true] [size=2.72MB] [PHAssetMediaSubtype(rawValue: 16)+DepthEffect | userLibrary | (3024x4032) | adjusted=true] [size=2.49MB] [PHAssetMediaSubtype(rawValue: 16)+DepthEffect | userLibrary | (3024x4032) | adjusted=true] [size=2.49MB] [PHAssetMediaSubtype(rawValue: 16)+DepthEffect | userLibrary | (3024x4032) | adjusted=true]

Media Technologies Photos & Camera Photos and Imaging PhotoKit

4

1

302

Jan ’26

Feature / Workaround wanted: Seamless, Automated AirPlay Screen Streaming on visionOS for Demos

Hello Apple team and developer community, I am preparing a visionOS app for a fair environment, where we want to automatically stream the current experience to a nearby monitor via AirPlay, without requiring guests or staff to manually interact with the Control Center or AirPlay pickers all the time. The goal is to provide a smooth, frictionless setup so attendees can focus on the demo, not the configuration. Feature Request: A supported API or method to programmatically start/stop AirPlay video streaming (mirroring or external playback) from within a visionOS app, allowing the current experience to be instantly displayed on an external monitor or Apple TV for the audience. Context & Rationale: In a trade fair or exhibition setting, rapid guest turnaround and minimal staff intervention are crucial. Having to manually guide each visitor through AirPlay setup is impractical. As I understood, AVRoutePickerView can be used for this on iOS/macOS, but this is not available in visionOS. Enabling similar automated streaming on visionOS would make the device far more suitable for live demos and public showcases. Questions: Are there any supported workarounds or best practices for enabling automated screen streaming or AirPlay initiation on visionOS in public demo environments that I missed? Is Apple considering adding programmatic AirPlay control or accessibility features to support such use cases in future visionOS releases? Thank you for considering this request! If there are recommended patterns, entitlements, or accessibility solutions we could explore for trade fair scenarios, your guidance would be greatly appreciated. Best regards, Julian Zürn - IPI, HS Kempten

Media Technologies Streaming AirPlay visionOS

4

0

681

Jan ’26

Camera Capture Extension with AVMultiCamPiP

I am using AVMulti so the user captures two images how can I access those images if there is only one url that stores the captured images for the lockScreenCapture extension ? Plus how can I detect if the user opened the app from the extension to be able to navigate the user to the right screen ?

Media Technologies Photos & Camera AVFoundation

4

0

472

Mar ’25

AVCaptureSession video and audio out of sync

I'm using an AVCaptureSession to send video and audio samples to an AVAssetWriter. When I play back the resultant video, sometimes there is a significant lag between the audio compared with the video, so they're just not in sync. But sometimes they are, with the same code. If I look at the very first presentation time stamps of the buffers being sent to the delegate, via func captureOutput(_: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) I see something like this: Adding audio samples for pts time 227711.0855328798, Adding video samples for pts time 227710.778785374 That is, the clock for audio vs video is behind: the first audio sample I receive is at 11.08 something, while the video video sample is earlier in time, at 10.778 something. The times are the presentation time stamps of the buffer, and the outputPresentationTimeStamp is the exact same number. It feels like "video" vs the "audio" clock are just mismatched. This doesn't always happen: sometimes they're synced. Sometimes they're not. Any ideas? The device I'm recording is a webcam, on iPadOS, connected via the usb-c port.

Media Technologies Video iOS iPadOS

3

0

191

Apr ’25

Photos are captured with incorrect exposure bias in specific scenarios on iPhone 17 Pro

Hey, There seems to be an inconsistency when capturing a photo using QualityPrioritization.Quality on the iPhone 17 Pro Main wide Lens. If you zoom above "2x" the output image always has "-2.0ev" bias in the meta data and looks underexposued. This does not happen at zoom levels above 2, or if you set the QualityPrioritization to .Balanced. See below: with .Quality with .Balanced This does not happen on the other lenses. I'm using a simple set up and it is consistent across JPEG and ProRAW capture. I have a demo project if that is useful. Thanks, Alex

Media Technologies Photos & Camera Camera

3

0

531

Dec ’25

VTFrameRateConversionConfiguration don't support 640x480

hello, I'm using VideoTololbox VTFrameRateConversionConfiguration to perform frame interpolation: https://developer.apple.com/documentation/videotoolbox/vtframerateconversionconfiguration?language=objc ,when using 640x480 vidoe input, I got error: Error ! Invalid configuration [VEEspressoModel] build failure : flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480 [EpsressoModel] Cannot load Net file flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480 Error: failed to create FRCFlowAdaptationFeatureExtractor for usage 8 Failed to switch (0x12c40e140) [usage:8, 1/4 flow:0, adaptation layer:1, twoStage:0, revision:2, flow size (320x240)]. Could not init FlowAdaptation initFlowAdaptationWithError fail tried 2048x1080 is ok.

Media Technologies Video VideoToolbox

3

0

404

Dec ’25

Telephoto Lens Keeps Switching to Other Lenses on iPhone 16 Pro Max During PPG (Finger on Camera)

Hi, I’m building a PPG-based heart rate feature where the user places their finger over the rear telephoto camera. On iPhone 16 Pro Max, I'm explicitly selecting the telephoto lens like this: videoDevice = AVCaptureDevice.default(.builtInTelephotoCamera, for: .video, position: .back) And trying to lock it: if #available(iOS 15.0, *), device.activePrimaryConstituentDeviceSwitchingBehavior != .unsupported { try? device.lockForConfiguration() device.setPrimaryConstituentDeviceSwitchingBehavior(.locked, restrictedSwitchingBehaviorConditions: []) device.unlockForConfiguration() } I also lock everything else to prevent dynamic changes: try device.lockForConfiguration() device.focusMode = .locked device.exposureMode = .locked device.whiteBalanceMode = .locked device.videoZoomFactor = 1.0 device.automaticallyEnablesLowLightBoostWhenAvailable = false device.automaticallyAdjustsVideoHDREnabled = false device.unlockForConfiguration() Despite this, the camera still switches to another lens, especially under different lighting, even though the user’s finger fully covers the lens. Questions: How can I completely prevent lens switching in this scenario? Would using videoZoomFactor = 3.0 or 5.0 better enforce use of the telephoto lens? Thanks! Gal

Media Technologies Photos & Camera Camera

3

0

183

Jul ’25

Save MPEG-TS (h264 or HEVC) video stream using AVAssetWriter.

I'm capturing video stream from GoPro camera (I demux UDP MPEG-TS packets) and create CMSampleBuffers from them, this works fine when I display them using CMSampleBufferLayer. However when I dump them to disk using AVAssetWriter and then playback it with AVPlayer, AVPlayer has problems with scrubbing, it also cannot render previous frames, it needs to go back to key frames. Also thumbnails generated with AVAssetImageGenerator are mostly distorted and green, even though I set the requestedTimeToleranceAfter longer than the key frames frequency. When I re-encode saved video once again with AVAssetExportSession and play it back then I can scrub the video just fine. Is it because re-transcoding adds additional metadata to enable generating frames when rewinding the video and scrubbing? If so is there a way to achieve it with AVAssetWriter without much time penalty? I need the dump/save operation to be very fast. I also considered the following: Instead of de-muxing video and creating CMSampleBuffers, maybe I could directly dump the stream to disk and somehow add moov atoms with timing information. Would this approach work? If so where I can find information how to do it? Thank you!

Media Technologies Video VideoToolbox AVFoundation

3

0

174

Apr ’25

Launch The Main App from LockedCameraCapture

If the app is launched from LockedCameraCapture and if the settings button is tapped, I need to launch the main app. CameraViewController: func settingsButtonTapped() { #if isLockedCameraCaptureExtension //App is launched from Lock Screen //Launch main app here... #else //App is launched from Home Screen self.showSettings(animated: true) #endif } In this document: https://developer.apple.com/documentation/lockedcameracapture/creating-a-camera-experience-for-the-lock-screen Apple asks you to use: func launchApp(with session: LockedCameraCaptureSession, info: String) { Task { do { let activity = NSUserActivityTypeLockedCameraCapture activity.userInfo = [UserInfoKey: info] try await session.openApplication(for: activity) } catch { StatusManager.displayError("Unable to open app - \(error.localizedDescription)") } } } However, the documentation states that this should be placed within the extension code - LockedCameraCapture. If I do that, how can I call that all the way down from the main app's CameraViewController?

Media Technologies Photos & Camera Swift Camera WidgetKit AVFoundation

3

0

566

Nov ’25

PDF Page Content Swapping on iOS 26

Dear Apple Developer Team, On iOS 26, the contents of PDF pages appear to be swapped. Could you please advise if there is a workaround or a planned fix for this issue? Steps to Reproduce: Download the attached PDF on iOS 26. Open the PDF in the Files app. Tap the PDF to view it in Quick Look. Navigate to page 5. Expected Result: The page number displayed at the bottom should be 5. Actual Result: The page number displayed at the bottom is 4. Issue: This is not limited to page 5—multiple page contents appear to be swapped. I have also submitted feedback via Feedback Assistant (FB20743531) on October 20. Best regards, Yoshihito Suezawa

Media Technologies General PDFKit

3

0

397

Nov ’25

SpeechTranscriber on Simulator

I am trying to use SpeechTranscriber from Speech framework. Is it possible to use it on Simulator of iOS 26 (Mac OS Tahoe)? Function "supportedLocales" returns an empty array.

Media Technologies Audio Speech

3

2

935

Nov ’25

Playback Issues for DRM content when sending CMCD

Since iOS and tvOS 18, CMCD can now be automatically sent by AVPlayer (https://developer.apple.com/streaming/Whats-new-HLS.pdf). However, after enabling CMCD, our streams occasionally fail with the following error: CoreMediaErrorDomain Error -17383 This issue appears to affect only DRM-protected (FairPlay) streams so far. We activate CMCD via the resource loader of an AVURLAsset, before assigning the item to an AVPlayer. Unfortunately, we haven’t found a reliable way to reproduce the issue, and we’ve been unable to gather any useful diagnostic information. Has anyone else observed this behavior when enabling CMCD on FairPlay streams?

Media Technologies Streaming FairPlay Streaming iOS HTTP Live Streaming AVFoundation

3

0

519

Oct ’25

Delete songs from playlist via Apple Music API

I use htttps://api.music.apple.com/v1/me/library/playlists/${playlistId}/tracks to add tracks to a playlist I created. How do I DELETE tracks from the playlist? The documentation does not mention a method for this. I have tried calling DELETE methods in various combinations but nothing seems to work. Is this possible?

Media Technologies General Apple Music API MusicKit

3

0

444

Oct ’25

Why Does AVCaptureSessionInterruptionReasonVideoDeviceNotAvailableWithMultipleForegroundApps Occur on iPhone?

Hi everyone, We're encountering an unexpected issue with our iPhone-only camera app: 👉 TimeMark - Photo Proof https://apps.apple.com/us/app/timemark-photo-proof/id6446071834 Problem Description: Our app uses a full-screen camera view via AVCaptureSession. In some cases reported by users, the camera fails immediately upon app launch, and we receive this interruption reason: AVCaptureSessionInterruptionReasonVideoDeviceNotAvailableWithMultipleForegroundApps According to the Apple documentation https://developer.apple.com/documentation/avfoundation/avcapturesession/interruptionreason/videodevicenotavailablewithmultipleforegroundapps?language=objc , this interruption typically occurs when the app is running in a multi-app layout such as Slide Over, Split View, or Picture in Picture — all of which are iPad-only features. However, this issue is being reported on iPhones, and our app does not support iPad at all. Also noted in the documentation: "Given your present AVCaptureSession configuration, the session may only be run if your app occupies the full screen." Additional Context: The issue occurs immediately on app launch, before the user can interact with the camera. We don’t enable multitaskingCameraAccessEnabled. We are 100% sure this is happening on iPhone, not iPad. It’s hard to reproduce; users report it happening sporadically. Locally, we tried playing Picture-in-Picture videos (e.g., Safari/YouTube) before launching our app, but we could not reproduce the issue. Questions: Why is this interruption reason occurring on iPhone, which doesn’t officially support Slide Over or Split View? Could this be caused by some system-level multitasking or resource contention (e.g., Picture in Picture from FaceTime or Safari)? Would enabling multitaskingCameraAccessEnabled help prevent this issue on iPhone, even though it's designed for iPad? Enabling multitaskingCameraAccessEnabled seems to require enabling UIBackgroundModes → voip. Would adding this background mode cause any App Store review risk or rejection if our app doesn't actually use VoIP functionality? Any help, insight, or suggestions would be greatly appreciated. Thanks in advance!

Media Technologies Photos & Camera Camera AVFoundation

3

0

865

Oct ’25

Post

Replies

Boosts

Views

Activity