Hello,
We are trying to use the new Background Upload Extension to improve uploads of assets (Photos, Live Photos, Videos) in the background in our application.
1 - We implemented the code to upload still images.
We would like to do the same for Live Photos but we are not sure how to proceed since a Live Photo is composed of 2 resources that we would like to upload in the same job so that we keep the association between the 2
resources.
Steps:
A Live Photo is captured on the device.
Our application is notified for new content in the Photo Library and the asset is queued for upload by our application.
The system calls the background upload extension and the Live Photo is prepared for upload but we can schedule a job only for one resource (still photo or paired video) so we can not upload both resources in a single job.
2 - Is there a way to synchronise the application and the extension so that the application would not process the same data or execute the same requests than the extension. For example: our application is working with
tokens and we would like to prevent those tokens to be consumed by the application and the extension at the same
time.
3 - is there a command in xcode or terminal to start or stop the extension, something similar to what exists for processing tasks
(https://developer.apple.com/documentation/backgroundtasks/starting-and-terminating-tasks-during-development).
Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes.
Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
I have an SCStreamDelegate for capturing frames from applications. On recent point releases of macOS Sonoma, I've noticed that the stream is being cancelled with no user action being taken. I started trying to debug it and when my on error method is called, the error parameter being passed is null:
func stream(_ stream: SCStream, didStopWithError error: Error) {
/*debugger shows this and segfaults if I try to print "\(error)"
error (Error)
> error = (Builtin.RawPointer) 0x0
*/
From what I can tell, error should be a valid NSError so I can check the error code, based on similar code I've seen in, for example OBS (https://github.com/obsproject/obs-studio/blob/265239d4174f8d291b0de437088c5b78f8e27687/plugins/mac-capture/mac-sck-common.m#L29)
Usually when this happens, the menubar icon for screen sharing (where I would click to change sharing window, etc) stays there even after my app has closed an no apps are doing sharing stuff.
Has anyone come across this before? Am I misinterpreting what the debugger is saying about the error parameter?
I'm running macos 14.7.3, but I just updated from 14.7.2 earlier and had basically the same issue on both macos versions
In iOS 26, AVSpeechSynthesizer read Mandarin into Cantonese pronunciation.
No matter how you set the language, and change the settings of my phone system, it doesn't work.
let utterance = AVSpeechUtterance(string: "你好啊")
//let voice = AVSpeechSynthesisVoice(language: "zh-CN") // not work
let voice = AVSpeechSynthesisVoice(language: "zh-Hans") // not work too
utterance.voice = voice
et synth = AVSpeechSynthesizer()
synth.speak(utterance)
Topic:
Media Technologies
SubTopic:
General
Tags:
Speech
Internationalization
Localization
AVFoundation
Is it possible to stream video from a UVC (USB Video Class) camera on an iPhone 15? If so, are there any specific hardware or software requirements to enable this functionality?
I am unable to find any clearcut documentation on configuring AVCaptureSession pipeline to capture video with proResRAW codec type, which is 16 bit format. Is it supported only with AVCaptureMovieFileOutput or one can have AVCaptureVideoDataOutput emitting 16-bit sample buffers that can be vended to AVAssetWriter?
I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe the outputVolume, and it correctly reports the device’s volume.
However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, the outputVolume returns the previous volume (before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using physical buttons or Control Center, which triggers the observer.
What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations.
Disabling and re-enabling the audio session is essential to how my application functions.
I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, iOS 18.1, iOS 18.3, iOS 18.5 and iOS 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.
Hi,
I'm currently developping an AVB hardware device, and I'm currently stuck because because the apple AVB stack is throwing me errors without much informations.
Is there any way to have more information about these assertions and why they are happening ?
Furtermore is there any documentation on theAppleAVBAudio module ? It would be very handy
Here are the logs shown in the console:
Filtering the log data using "process == "coreaudiod""
Timestamp Thread Type Activity PID TTL
2025-12-05 15:44:27.087043+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.087545+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.088043+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.088546+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.089043+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.089545+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.090043+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.090545+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.091043+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.091545+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.092044+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.092544+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.093044+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.093552+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.094050+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
2025-12-05 15:44:27.094543+0100 0x15ae74 Default 0x0 12965 0 coreaudiod: (AppleAVBAudio) Assert: <private> (value 0x0 0), <private> file: <private>, line: 1533
Hi all,
I'm trying to diagnose and resolve an issue with stuttering video playback using the standard AVPlayer. The video in question is a 4K, 39-second file in *.mov format, being played on an iOS device. It's served via a local HTTP server that proxies requests to a backend to fetch and process the content. The project uses end-to-end encrypted storage, which necessitates the proxy for handling data processing. While playback in offline scenarios is smooth, we are encountering issues with smooth playback during streaming. The same video streams smoothly on other platforms using the same connection, so network limitations are not a factor.
On iOS, playback is consistently choppy, with pauses every 1-3 seconds. The video does not appear to buffer adequately for smooth playback.
One particularly curious aspect is the seemingly random pattern of Content-Range requests made by the AVPlayer when streaming the video. Below is an example of the range requests:
Topic:
Media Technologies
SubTopic:
Video
I'd like to find out: Can backgrounded apps record audio?
In the past as I recall, I found that backgrounded apps were pretty restricted and couldn't do much of anything.
However I'm not familiar with the current state of affairs.
With iOS 15.8 and above, can backgrounded apps record audio if they've been given permission by the user to access the microphone?
Thanks.
Topic:
Media Technologies
SubTopic:
Audio
Hi Apple Developer Forums,
I’m developing an iOS camera app that processes RAW captures using Core Image. I’m seeing a large “first use” performance penalty specifically when creating the CIImage from CIRAWFilter.outputImage.
What’s slow (important detail)
I’m measuring the time for:
let rawFilter = CIRAWFilter(imageData: rawData, identifierHint: hint)
let ciImage = rawFilter.outputImage
This is not CIContext.render(...) / createCGImage(...). It’s just the time to access outputImage (i.e., building the Core Image graph / RAW pipeline setup).
Observed behavior
First time accessing CIRAWFilter.outputImage: ~3 seconds
Second time (same app session, similar RAW): ~3 milliseconds
So something heavy is happening only on first use (decoder initialization, pipeline setup, shader/library compilation, caching, etc.).
Using Metal System Trace, I also noticed that during the slow first call there are many “Create MTLLibrary” events, while the second call doesn’t show this pattern.
Warm-up attempts using bundled DNG
I tried to “warm up” early (e.g., on camera screen entry) by loading a bundled DNG and then accessing CIRAWFilter.outputImage by taking a photo:
Warm-up with a ~247 KB DNG → first real RAW outputImage cost drops to ~1.42s
Warm-up with a ~25 MB DNG → first real RAW outputImage cost drops to ~843ms
This helps, but it’s still far from the steady-state ~3ms.
Warm-up by capturing a real RAW (works, but concerns)
The only method that fully eliminates the delay is to trigger a real RAW capture programmatically before the user’s first photo, then use that captured rawData to warm up the CIRAWFilter.outputImage path. This brings the first user-facing capture close to the steady-state timing.
However:
In some regions, the camera shutter sound cannot be suppressed, so “hidden warm-up capture” is unacceptable UX.
I’m also unsure whether triggering a real capture without an explicit user action could raise compliance/privacy concerns, even if the image is immediately discarded and never saved/uploaded.
Questions
Is the large first-time cost of CIRAWFilter.outputImage expected (RAW pipeline initialization / shader compilation)?
Is there an Apple-recommended way to pre-initialize the Core Image RAW pipeline / Metal resources so the first outputImage is fast, without taking a real photo?
Are there any best practices (e.g. CIContext creation timing, prepareRender(...), specific options) that reliably reduce this first-use overhead for CIRAWFilter?
Attachments
Figure 1: First RAW capture with no warm-up (~3s outputImage time)
Figure 2: First RAW capture after warm-up with bundled DNG (improved but still hundreds of ms)
Thanks for any guidance or experience sharing!
I am able to capture 48mp photos using .builtInWideAngleCamera, but it seems like .builtInTripleCamera is capped at 12mp?
Is there a way to capture 48mp photos using .builtInTripleCamera? Because .builtInTripleCamera provides smooth transition between cameras during zooming, and I'd like to keep this behavior.
New iPhone 17 Pro have all their cameras at 48mp. Is there a chance that their .builtInTripleCamera is capable of capturing 48mp? Or is this an API limitation?
I'm working on an app where a user needs to select a video from their Photos library, and I need to get the original, unmodified HEVC (H.265) data stream to preserve its encoding.
The Problem
I have confirmed that my source videos are HEVC. I can record a new video with my iPhone 15 Pro Max camera set to "High Efficiency," export the "Unmodified Original" from Photos on my Mac, and verify that the codec is MPEG-H Part2/HEVC (H.265).
However, when I select that exact same video in my app using PHPickerViewController, the itemProvider does not list public.hevc as an available type identifier. This forces me to fall back to a generic movie type, which results in the system providing me with a transcoded H.264 version of the video.
Here is the debug output from my app after selecting a known HEVC video:
⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).
What I've Tried
My code explicitly checks for the public.hevc identifier in the registeredTypeIdentifiers array. Since it's not found, my HEVC-specific logic is never triggered.
Here is a minimal version of my PHPickerViewControllerDelegate implementation:
import UniformTypeIdentifiers
// ... inside the Coordinator class ...
func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
picker.dismiss(animated: true)
guard let result = results.first else { return }
let itemProvider = result.itemProvider
let hevcIdentifier = "public.hevc"
let identifiers = itemProvider.registeredTypeIdentifiers
print("Available formats from itemProvider: \(identifiers)")
if identifiers.contains(hevcIdentifier) {
print("✅ HEVC format found, requesting raw data...")
itemProvider.loadDataRepresentation(forTypeIdentifier: hevcIdentifier) { (data, error) in
// ... process H.265 data ...
}
} else {
print("⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).")
itemProvider.loadFileRepresentation(forTypeIdentifier: UTType.movie.identifier) { url, error in
// ... process H.264 fallback ...
}
}
}
My Environment
Device: iPhone 15 Pro Max
iOS Version: iOS 18.5
Xcode Version: 16.2
My Questions
Are there specific conditions (e.g., the video being HDR/Dolby Vision, Cinematic, or stored in iCloud) under which PHPickerViewController's itemProvider would intentionally not offer the public.hevc type identifier, even for an HEVC video?
What is the definitive, recommended API sequence to guarantee that I receive the original, unmodified data stream for a video asset, ensuring that no transcoding to H.264 occurs during the process?
Any insight into why public.hevc might be missing from the registeredTypeIdentifiers for a known HEVC asset would be greatly appreciated. Thank you.
I'm using an AVCaptureSession to send video and audio samples to an AVAssetWriter. When I play back the resultant video, sometimes there is a significant lag between the audio compared with the video, so they're just not in sync. But sometimes they are, with the same code.
If I look at the very first presentation time stamps of the buffers being sent to the delegate, via
func captureOutput(_: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)
I see something like this:
Adding audio samples for pts time 227711.0855328798,
Adding video samples for pts time 227710.778785374
That is, the clock for audio vs video is behind: the first audio sample I receive is at 11.08 something, while the video video sample is earlier in time, at 10.778 something. The times are the presentation time stamps of the buffer, and the outputPresentationTimeStamp is the exact same number.
It feels like "video" vs the "audio" clock are just mismatched.
This doesn't always happen: sometimes they're synced. Sometimes they're not.
Any ideas? The device I'm recording is a webcam, on iPadOS, connected via the usb-c port.
I'm using this library: https://github.com/Yummypets/YPImagePicker to capture photos.
I've modified it slightly, and I'm using an older version.
When testing on my iPhone 16e, ios 26, whenever I take a photo, I get the following two error messages:
<<<< FigXPCUtilities >>>> signalled err=-17281 at <>:302
<<<< FigCaptureSourceRemote >>>> Fig assert: "err == 0 " at bail (FigCaptureSourceRemote.m:569) - (err=-17281)
These error messages appear, but as far as I can tell, the photo comes through OK, and I can save the data no problem. I've even removed all my handling code to see if it was something I was doing.
I don't really want to ship with these errors showing, but I also have no idea what can be causing this error to appear. chatgpt was not helpful diagnosing this.
Does anyone know what can cause this error
Is there a way I can see the source code to figure out if there's something I'm doing wrong here?
It really seems like this is an internal apple error, or else I would have expected more details on the error relating to the code I've written. Any clues would be appreciated!
Topic:
Media Technologies
SubTopic:
Photos & Camera
When I use the ColorSync Utility to convert Display P3 color (1, 0, 0) to an XYZ color, the result is (0.5151, 0.2412, -0.0011). I expected that result because that is identical to the red colorant tristimulus value in the Display P3.icc file.When I use the CGColor converted method to do the same, the XYZ color is approximately (0.5151, 0.2412, 0.0). Note that the third element is 0.0 whereas it is -0.0011 when using the ColorSync Utility. I have printed out the Z component to 16 digits of precision, and Z is all 0s. It appears that the CGColor converted method is clamping the result from 0 to 1.My questions are:1. Which conversion is correct? The ColorSync utility or the CGColor converted method?2. I am not a color specialist, but I thought that the XYZ components should never be negative. If so, is the colorant tristimulus value in the Display P3.icc file wrong?3. Because CGColor clamped the Z component to 0, the XYZ color cannot be converted back exactly or closely to the Display P3 color (1, 0, 0). I would have expected to be able to go back and forth between the two color spaces when starting from a valid P3 Display color especially since the XYZ color space completely encompasses the P3 Display color space. Is that not true?4. Is (1, 0, 0) an invalid Display P3 color? If so, I can understand the peculiar results. I'm not sure how I would know if a Display P3 color is valid or not. (I only know that the component values must be from 0 to 1.) I think it is valid because Apple uses that color in the UIColor API Reference in an example.
Hello,
I am getting the following error while attempting to run my LockedCameraCapture compatible app on an iOS 15 device:
dyld[434]: Library not loaded: '/System/Library/Frameworks/LockedCameraCapture.framework/LockedCameraCapture'
Referenced from: '/private/var/containers/Bundle/Application/.../MyApp.app/MyApp.debug.dylib'
Reason: tried: '/System/Library/Frameworks/LockedCameraCapture.framework/LockedCameraCapture' (no such file)
Of course iOS 15 doesn't have the library for LockedCameraCapture, but I have had no issue including Lock Screen Widgets (which require iOS 16), so I am not sure why the error is popping up.
Thank you!
Topic:
Media Technologies
SubTopic:
Photos & Camera
Is there limits on the supported dimension for VTLowLatencyFrameInterpolationConfiguration. Querying VTLowLatencyFrameInterpolationConfiguration.maximumDimensions and VTLowLatencyFrameInterpolationConfiguration.minimumDimensions returns nil. When I try the WWDC sample project EnhancingYourAppWithMachineLearningBasedVideoEffects with a 4k video this statement try frameProcessor.startSession(configuration: configuration) executes but try await frameProcessor.process(parameters: parameters) throws error Error Domain=VTFrameProcessorErrorDomain Code=-19730 "Processor is not initialized" UserInfo={NSLocalizedDescription=Processor is not initialized}.
Also, why is VTLowLatencyFrameInterpolationConfiguration able to run while app is backgrounded but VTFrameRateConversionParameters can't (due to gpu usage)?
quotes are displayed incorrectly in subtitles of AVPlayerViewController when streaming VOD content using HLS.
single quote ' (escaped ') is displayed as apos;
double quotes " (escaped ") is displayed as quot;
following the vtt specification.
The same stream works fine in VLC player, showing quotes correctly in subtitles.
subtitle vtt files use
Content-Type: text/vtt
WEBVTT
X-TIMESTAMP-MAP=LOCAL:490014:06:04.000,MPEGTS:158764568760056
example line:
490014:05:46.000 --> 490014:05:50.440 align:start line:83% position:14%
and the playlist has:
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",LANGUAGE="da",NAME="Dansk",AUTOSELECT=YES,CHARACTERISTICS="public.accessibility.transcribes-spoken-dialog,public.accessibility.describes-music-and-sound",URI="subs/dan_5/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=780000,CODECS="mp4a.40.5,avc1.42c01e",RESOLUTION=256x144,AUDIO="audio-aac",SUBTITLES="subs"
lære dig endnu bedre at kende."
adding 'wvtt' to CODECS list in playlist does not make a difference.
Is this a known bug? Is there a workaround?
I guess the AVResourceLoaderDelegate can be used to intercept and parse the subtitle files, but it seems like quite a hack and not really intended to be used for this.
Hi,
our CourAudio server plugin utilizes the SystemConfiguration.framework to store and restore specific shared system wide settings.
While our application can authenticate to utilize the SystemConfiguration.framework to gain write access to the shared configuration settings the CoreAudio server plugin obviously can't have any user interaction and therefor does not authenticate.
Is it possible to authenticate the CoreAudio server plugin to gain write permissions? Are there any entitlements or other means that would allow this?
Thanks!
Topic:
Media Technologies
SubTopic:
Audio
Tags:
System Configuration
Core Audio
Inter-process communication
Service Management