Hello Apple Developer Community,
I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.
In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8"
Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").
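For reference, here is roughly how we inspect what AVFoundation exposes for these renditions in a debug build (a minimal sketch; the manifest URL is a placeholder and the async loading style is just what we happen to use):

import AVFoundation

// Sketch: dump the audible media selection options so we can compare what
// displayName carries (we hoped NAME, e.g. "AUDIO-1") against the language tag.
func dumpAudioOptions() async throws {
    let asset = AVURLAsset(url: URL(string: "https://example.com/master.m3u8")!)
    guard let group = try await asset.loadMediaSelectionGroup(for: .audible) else { return }
    for option in group.options {
        print("displayName: \(option.displayName), language: \(option.extendedLanguageTag ?? "nil")")
    }
}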
We would like to understand the official or intended behavior regarding this.
Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label?
If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute?
Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated.
Thank you for your time and assistance.
Audio
Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Hello,
I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background.
I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way.
My questions are:
Does iOS expose any API to detect if a call is being recorded?
If not, is there any indirect method that complies with Apple's policies (e.g., microphone usage events) that can be relied upon?
Or is this something that iOS explicitly prevents for privacy reasons?
I'm looking for solutions that align with Apple's policies and would be accepted under the App Store Review Guidelines.
Thanks in advance for any guidance.
macOS Sequoia: 15.4.1 (24E263)
Xcode: 16.3 (16E140)
Logic Pro: 11.2.1
I’ve been developing a complex audio unit for macOS that works perfectly well in its own bespoke host app and is now well into its beta testing stage.
It did take some effort to get it working well in Logic Pro, however, and all was fine and working well until the following:
The AU part is an empty app extension with a framework containing its code.
The framework contains Swift code for the UI and C code for the DSP parts.
When the framework is compiled using the Swift 5 compiler the AU will run in Logic with no problems.
(I should also mention that the AU passes the strictest auval tests.)
But… when the framework is compiled with Swift 6, Logic Pro cannot load it.
Logic displays a message saying the audio unit could not be loaded and to contact the developer.
My own host app loads the AU perfectly well with the Swift 6 version, so I know there’s nothing wrong with the audio unit.
I cannot find any differences in any of the built output files except, of course, the actual binary code in the framework.
I’ve worked for hours on this and cannot find a solution other than to build the framework in Swift 5.
(I worked hard to get all the async code updated and working with Swift 6! so I feel a little cheated!)
What is happening?
Is this a bug in Logic?
Is this a bug in Swift 6 compiler/linker?
I’m at the Duh! hands in the air, tearing out hair stage! ( once again!)
Here is the demo from Apple's site
This issue is specific to iOS 18.
When running this demo, if there is a gap in speaking, recognitionTask(with:resultHandler:) provides only the text spoken after the gap, not the concatenation of the old text and the newly spoken text.
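For context, this is roughly the shape of the result handler where we observe the behavior (a trimmed sketch, not the full sample code; the recognizer and request setup follow Apple's demo):

import Speech

let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = true

// Task reference is kept so it can be cancelled later.
let task = recognizer.recognitionTask(with: request) { result, error in
    guard let result else { return }
    // On iOS 17 and earlier this string accumulated the whole utterance;
    // on iOS 18, after a pause in speech, it only contains the words spoken after the gap.
    print("partial: \(result.bestTranscription.formattedString), isFinal: \(result.isFinal)")
}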
Hi all,
I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect.
After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above.
After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure.
Here is the setupAudioEngine function:
private func setupAudioEngine() {
    do {
        try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
        try AVAudioSession.sharedInstance().setActive(true)
    } catch {
        print("Audio session error: \(error)")
    }

    engine.attach(playerNode)
    engine.attach(analyzer)
    engine.connect(playerNode, to: analyzer, format: nil)
    engine.connect(analyzer, to: engine.mainMixerNode, format: nil)

    analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in
        self?.processAudioBuffer(buffer)
    }
}
Here is the play function:
func play(_ mediaItem: MPMediaItem) {
    guard let assetURL = mediaItem.assetURL else {
        print("No asset URL for media item")
        return
    }

    stop()

    do {
        audioFile = try AVAudioFile(forReading: assetURL)
        guard let audioFile else {
            print("Failed to create audio file")
            return
        }

        duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate

        if !engine.isRunning {
            try engine.start()
        }

        playerNode.scheduleFile(audioFile, at: nil)
        playerNode.play()

        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = true
            self?.startDisplayLink()
        }
    } catch {
        print("Error playing audio: \(error)")
        DispatchQueue.main.async { [weak self] in
            self?.isPlaying = false
            self?.stopDisplayLink()
        }
    }
}
Here is a link to my test project if you want to try it out for yourself:
https://github.com/aabagdi/VisualMan-example
Thanks!
So experimenting with the new SpeechTranscriber, if I do:
let transcriber = SpeechTranscriber(
    locale: locale,
    transcriptionOptions: [],
    reportingOptions: [.volatileResults],
    attributeOptions: [.audioTimeRange]
)
only the final result has audio time ranges, not the volatile results.
Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.
I'm not presenting the volatile text in the UI at all; I was just trying to keep statistics about the non-speech and speech noise levels, so I can determine when the noise level falls below the noise floor for a while.
The goal here was to finalize the recording automatically when the noise level indicates that the user has finished speaking.
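For what it's worth, the level tracking itself is simple; this is a minimal sketch of the RMS measurement feeding those statistics (the tap setup and the noise-floor threshold are illustrative assumptions, not the actual app code):

import AVFoundation
import Accelerate

// Sketch: compute the RMS power of a tapped buffer; compare against an assumed
// noise floor to decide when the user has stopped speaking.
func rmsPower(of buffer: AVAudioPCMBuffer) -> Float {
    guard let samples = buffer.floatChannelData?[0] else { return 0 }
    var rms: Float = 0
    vDSP_rmsqv(samples, 1, &rms, vDSP_Length(buffer.frameLength))
    return rms
}

// Inside an installTap closure (illustrative):
// if rmsPower(of: buffer) < assumedNoiseFloor { silentTime += bufferDuration } else { silentTime = 0 }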
So,
I've been wondering how fast an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the Text-to-Speech starting to play (using AVSpeechUtterance), the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to finalize the transcript, while the FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
I started playing with transcription of audio files on macOS today, using the latest beta of Xcode and the latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out over a total file of 53 minutes.
Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
I am having issues deploying my iOS app, that uses ShazamKit, to get working on a Mac with Apple silicon.
When uploading the archive to App Store Connect I do get
ITMS-90863: Macs with Apple silicon support issue - The app links with libraries that aren’t present in macOS:
/usr/lib/swift/libswiftShazamKit.dylib
Is ShazamKit not supported for iOS apps that can run on Macs with Apple silicon? Or is there something I should fix in my setup / deployment?
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVAudioPlayerNode / AVAudioEngine an exception is thrown:
"[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868
(related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022)
If I convert the buffer to AVAudioPCMFormatFloat32 playback works.
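For reference, this is roughly how we do that conversion (a sketch using AVAudioConverter; the non-interleaved Float32 target format is our choice and error handling is trimmed):

import AVFoundation

// Sketch: convert an Int16 PCM buffer to Float32 so the player node accepts it.
func convertToFloat32(_ source: AVAudioPCMBuffer) throws -> AVAudioPCMBuffer? {
    let srcFormat = source.format
    guard let dstFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                        sampleRate: srcFormat.sampleRate,
                                        channels: srcFormat.channelCount,
                                        interleaved: false),
          let converter = AVAudioConverter(from: srcFormat, to: dstFormat),
          let output = AVAudioPCMBuffer(pcmFormat: dstFormat,
                                        frameCapacity: source.frameLength) else {
        return nil
    }

    var supplied = false
    var error: NSError?
    _ = converter.convert(to: output, error: &error) { _, outStatus in
        // One-shot input: hand over the source buffer once, then signal end of stream.
        if supplied {
            outStatus.pointee = .endOfStream
            return nil
        }
        supplied = true
        outStatus.pointee = .haveData
        return source
    }
    if let error { throw error }
    return output
}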
My questions are:
Does AVAudioEngine / AVAudioPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application?
If 1 is YES, is this documented anywhere?
If 1 is YES, is this required format subject to change at any point?
Thanks!
I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to lookup the underlying USB LocationID?
I previously used AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName with property kIOAudioEngineGlobalUniqueIDKey matching DeviceUID, and I loaded kUSBDevicePropertyLocationID from the result.
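In code, that lookup looked roughly like this (a sketch with error handling trimmed; the class and property names are the ones mentioned above, and the Swift IOKit bridging details are from memory rather than the exact production code):

import CoreAudio
import IOKit

// Sketch of the pre-Tahoe path: AudioObjectID -> DeviceUID -> matching IOAudioEngine -> locationID.
func deviceUID(for deviceID: AudioObjectID) -> String? {
    var address = AudioObjectPropertyAddress(mSelector: kAudioDevicePropertyDeviceUID,
                                             mScope: kAudioObjectPropertyScopeGlobal,
                                             mElement: kAudioObjectPropertyElementMain)
    var uid: CFString?
    var size = UInt32(MemoryLayout<CFString?>.size)
    let status = withUnsafeMutablePointer(to: &uid) {
        AudioObjectGetPropertyData(deviceID, &address, 0, nil, &size, $0)
    }
    return status == noErr ? (uid as String?) : nil
}

func usbLocationID(forDeviceUID uid: String) -> UInt32? {
    var iterator: io_iterator_t = 0
    guard IOServiceGetMatchingServices(kIOMainPortDefault,
                                       IOServiceMatching("IOAudioEngine"),
                                       &iterator) == KERN_SUCCESS else { return nil }
    defer { IOObjectRelease(iterator) }

    var service = IOIteratorNext(iterator)
    while service != 0 {
        let engineUID = IORegistryEntryCreateCFProperty(service,
                                                        "IOAudioEngineGlobalUniqueID" as CFString,
                                                        kCFAllocatorDefault, 0)?.takeRetainedValue() as? String
        if engineUID == uid {
            // Walk up the registry to the USB device and read its locationID.
            let location = IORegistryEntrySearchCFProperty(service, kIOServicePlane,
                                                           "locationID" as CFString,
                                                           kCFAllocatorDefault,
                                                           IOOptionBits(kIORegistryIterateRecursively | kIORegistryIterateParents))
            IOObjectRelease(service)
            return (location as? NSNumber)?.uint32Value
        }
        IOObjectRelease(service)
        service = IOIteratorNext(iterator)
    }
    return nil
}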
This fails on macOS 26, because the IO Registry for the device has an entry for usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not include a kIOAudioEngineGlobalUniqueIDKey property (or any other property to map it to a CoreAudio DeviceUID).
My use-case here is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to lookup the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).
I'm trying to implement AirPlay in my app. I can successfully play back sound and trigger the AirPlay selector sheet. If the target device is a Bluetooth-only device, I can connect with no problem and stream the audio to it, but if the audio device is an AirPlay-specific device like a HomePod or an Apple TV, when I select it I get a spinning icon indicating that it is trying to connect, and eventually it times out and stops without connecting.
I don't believe it is an AirPlay audio issue, because if I go to a different app, for example a podcast app, select my HomePods for output, and then switch back to my app, my audio will correctly stream to the HomePod. Not only that, my icon changes color to indicate an AirPlay connection, and it correctly shows that it is connected via AirPlay. But I cannot then disconnect it using the AirPlay selector.
The issue appears to be on the AirPlay selection side, which I have spent several days attempting to troubleshoot, mostly by using ChatGPT to suggest code different from what I have to maybe work around the issue. Most of that focused on the audio player section, but that doesn't really seem to be where the problem is.
In iOS 26 beta 5, AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8, even though this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is actually in iOS 26. Am I right?
It seems that the substitute for this option is AVAudioSessionCategoryOptionAllowBluetoothHFP. The documentation does not make clear whether the behaviour is exactly the same or whether any difference should be expected… Has anyone used this option in iOS 26? Should I expect any difference from the current behaviour of AVAudioSessionCategoryOptionAllowBluetooth?
Thank you.
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages.
When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error:
Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download."
The ".ar" appears to be the language code, which in this case was Arabic.
When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here.
For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around.
Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself.
private func setUpAnalyzer() async throws {
    guard let sourceLanguage else {
        throw Error.languageNotSpecified
    }

    guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else {
        throw Error.unsupportedLanguage
    }

    let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription)
    self.transcriber = transcriber

    let reservedLocales = await AssetInventory.reservedLocales
    if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales {
        if let oldest = reservedLocales.last {
            await AssetInventory.release(reservedLocale: oldest)
        }
    }

    do {
        let status = await AssetInventory.status(forModules: [transcriber])
        print("status: \(status)")
        if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
            try await installationRequest.downloadAndInstall()
        }
    }
    ...
I have a SwiftUI app - (https://youtu.be/VbAfUk_eYl0?si=JxUBh0Bpb-vc1E1U) - which I thought was almost ready for release - a manager for airdropped audio files from Logic Pro or other music creation applications. It uses AVAudioEngine and AVAudioPlayerNode to play audio, and the MediaPlayer API to integrate with car audio and similar, all of which works well.
It does not currently have an explicit CarPlay integration (and I'm slightly horrified at the amount of work that is going to require).
I had the good or bad luck of getting a loaner car with CarPlay yesterday while mine is being repaired, and lo and behold, when connected to the vehicle via CarPlay, there is no audio output in the vehicle at all. The Now Playing panel correctly shows the information my app provides about the currently playing song; the player node believes it is playing, and the AVAudioSession is configured as it should be. But there is no sound.
Obviously I cannot ship it in this state.
I've tried fiddling with the parameters the AVAudioSession is configured with, in case there was some parameter that was preventing audio output, to no avail - currently:
var options = AVAudioSession.CategoryOptions()
options.insert(.allowAirPlay)
options.insert(.allowBluetooth)
options.insert(.allowBluetoothA2DP)
try session.setCategory(.playback, mode: .default, options: options)
try? session.setPreferredIOBufferDuration(0.002) // ~96 samples at 44.1kHz
try? session.setPrefersNoInterruptionsFromSystemAlerts(true)
try? session.setPrefersInterruptionOnRouteDisconnect(false)
try session.setActive(true, options: [.notifyOthersOnDeactivation])
All diagnostics within the app show the player operating correctly - files are played and flushed; AVAudioPlayerNodeCompletionCallbacks are called when they should be. But the output is not audible in the vehicle.
I would much prefer to ship this app without full-blown CarPlay integration, but with working audio when connected via CarPlay, and work on full CarPlay integration for the next release.
Is there some secret handshake I am just missing to make this work?
Hi,
I have just implemented an Audio Unit v3 host.
AgsAudioUnitPlugin *audio_unit_plugin;

AVAudioUnitComponentManager *audio_unit_component_manager;
NSArray<AVAudioUnitComponent *> *av_component_arr;

AudioComponentDescription description;

guint i, i_stop;

if(!AGS_AUDIO_UNIT_MANAGER(audio_unit_manager)){
  return;
}

audio_unit_component_manager = [AVAudioUnitComponentManager sharedAudioUnitComponentManager];

/* effects */
description = (AudioComponentDescription) {0,};
description.componentType = kAudioUnitType_Effect;

av_component_arr = [audio_unit_component_manager componentsMatchingDescription:description];

i_stop = [av_component_arr count];

for(i = 0; i < i_stop; i++){
  ags_audio_unit_manager_load_component(audio_unit_manager,
                                        (gpointer) av_component_arr[i]);
}

/* instruments */
description = (AudioComponentDescription) {0,};
description.componentType = kAudioUnitType_MusicDevice;

av_component_arr = [audio_unit_component_manager componentsMatchingDescription:description];

i_stop = [av_component_arr count];

for(i = 0; i < i_stop; i++){
  ags_audio_unit_manager_load_component(audio_unit_manager,
                                        (gpointer) av_component_arr[i]);
}
But this doesn't show me Audio Unit v2 plugins. Why?
regards, Joël
Hi,
I am hitting a trap. Please check the stack trace; how do I fix this?
regards, Joël
stack-trace with ExtAudioFileWrite
Hi,
I am trying to remove the audio controls for my app on the lock screen. Since I use WKWebView, there are 3 audio tags in my HTML, and I play and pause them via JS. However, if I do not play any sound since app launch, there are no audio controls on the lock screen. But if I play one of those 3 files (they are even less than 3-second sound effects, e.g. for buttons), the audio controls appear on the lock screen.
Note that even when the sounds are paused via pause() or not playing, they are listed on the lock screen.
What I have tried so far without success:
MPNowPlayingInfoCenter.default().nowPlayingInfo = [:]
and
try audioSession.setCategory(.playback, mode: .default, options: [])
try audioSession.setActive(false, options: .notifyOthersOnDeactivation)
and
UIApplication.shared.endReceivingRemoteControlEvents()
Another problem is that the app scales with the iOS system setting "Display Zoom". Is there a way to prevent that?
I'm on the latest Xcode version 16.3 and iOS 18.
I have no background mode in my Capabilities.
Nothing worked so far. Has anyone an idea?
Greetings
I've got a problem with my app where I'm testing it on my own phone.
I'm using AudioKit to generate tones as part of the app. Everything seems to work fine. Sounds start, stop, etc. They play when the app is closed and when the phone is locked, so background audio is working.
However, I'm seeing an issue where, even when STOP is pressed and the application exited, if I get a notification such as a text message, the base tone for the app starts to play.
If I then open the app and check the Start/Stop button, it says Start, so that hasn't been activated. If I click Start, then a second tone starts. This one stops with the Stop button. However, the original tone that was set off by the incoming message carries on playing.
Until I go to the Open Apps View on the phone and slide the application upwards.
For the life of me, I can't figure out what's happening here.
Many users like me use Apple Music on Android, where the app is almost as feature-rich as on iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI change. I know it's challenging to implement Liquid Glass on Android hardware or design, but features like auto-mix, pronunciation, and translation could be added.
Kindly consider this request!