Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.

All subtopics
Posts under Spatial Computing topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - visionOS
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for visionOS. I saw that there is a new way to add SwiftUI View attachments in my RealityView, what advantages does this have over the old way? Attachments can now be added directly to your entities with ViewAttachmentComponent. The removes the need to declare your attachments upfront in your RealityView initializer and then add those attachments as child entities. The new approach provides greater flexibility. Canyon Crosser and Petite Asteroids both utilize the new approach. ManipulationComponent looks really cool! Right now my app has a series of complicated custom gestures. What gestures does it handle for me exactly, and are there any situations where I should prefer my own custom gestures? ManipulationComponent provides natural interaction with virtual objects. It seamlessly handles translation and rotation. You can easily add manipulation to a SwiftUI view like Model3D with the manipulable view modifier. The new Object Manipulation API is great for most apps, and is a breeze to implement, but sometimes you might want a more custom feel, and that’s ok! Custom gestures are still fully supported for that scenario. I saw that there is a new API to also access the right main camera. What can I do with this? Correct, in visionOS 26, you can access the left and right main cameras. You can even access them simultaneously as a stereo pair. Camera access still requires a managed entitlement and an enterprise license, see Accessing the main camera for more details about those requirements. More computer vision and machine learning use-cases are unlocked with access to both cameras, we are excited to see what you will do! What do I need to do to add spatial accessory input for my app? First, use the GameController framework to establish a connection with the spatial accessory, and then listen for events from the controller. Then, you can use either RealityKit, ARKit, or a combination of both to track the accessory, anchor virtual content to it, and fine tune the accessory interaction with the content in your app. For more details, check out Discovering and tracking spatial game controllers and styli. By far, the most difficulty with implementing visionOS apps is SwiftUI window management…placing, opening, closing, etc. Are there any improvements to window management in visionOS 26? Yes! We recommend watching Set the scene with SwiftUI in visionOS. You can use the defaultLaunchBehavior to choose whether a particular window is presented (or suppressed) at launch. You can also prevent a window like a secondary toolbar from launching as the initial window using .restorationBehavior(.disabled). Adopting best practices for persistent UI provides a great overview of SwiftUI window management on visionOS. As for placing windows, there is still no API for an app to specify the placement of its windows other than relative placement. If that is a feature you are interested in, please file an enhancement request for it using Feedback Assistant! How to get access to the Enterprise API? First, request the entitlement and license through your Apple Developer or enterprise account. Once these have been granted, include the license and entitlement in your project. Then you can build, test, and distribute as an in-house app.
4
0
332
Jul ’25
How to read a video stream that includes both the physical world and digital space when obtaining a video stream from VisionPro by applying for the Enterprise API but only reading a video stream that contains both the physical world and digital space
We have successfully obtained the permissions for "Main Camera access" and "Passthrough in screen capture" from Apple. Currently, the video streams we have received are from the physical world and do not include the digital world. How can we obtain video streams from both the physical and digital worlds? thank you!
0
0
111
Apr ’25
I need to troubleshoot Transform Drift in ARKit
Hi all, I'm currently developing a real-time object reconstruction app using ARKit. The goal is to scan large objects using ARKit’s depth and transform data, and generate a point cloud. However, I’m facing a major challenge - Transform Drift / World Alignment Issues The localToWorld transform provided by ARKit frequently seems to drift or become unstable across frames. This results in misaligned point clouds even when the device is moved slowly or kept relatively still. In some cases, a static surface scanned over a few seconds results in clearly misaligned fragments. This makes it difficult to accurately stitch a multi-frame point cloud. I have experimented with various lighting conditions and object textures, but the issue persists in all cases. At times, the relative error between frames reaches up to 20 cm, while in other instances the error is minimal; however, the drift gradually accumulates over time, leading to an overall enlargement of the reconstructed object. I have attached images of both cases here. Questions: Are there specific conditions under which ARKit’s world transform is expected to drift? Is there a way to detect or recover from this drift during runtime? Any best practices for maintaining consistent tracking during scanning or measurement sessions?
1
0
137
Jun ’25
I am developing a Immersive Video App for VisionOs but I got a issue regarding app and video player window
In Vision OS app, We have two types of windows: Main App Window – This is the default window that launches when the app starts. It displays the video listings and other primary content. Immersive Space Window – This opens only when a user starts streaming or playing a video. Issue: When entering the immersive space, the main app window remains visible in front of it unless manually closed. To avoid this, I currently close the main window when transitioning to immersive space and reopen it when exiting. However, this causes the app to restart instead of resuming from its previous state. Desired Behavior: I want the main app window to retain its state and seamlessly resume from where it was before entering immersive mode, rather than restarting. Attempts & Challenges: Tried managing opacity, visibility, and state preservation, but none worked as expected. Couldn’t find a way to push the main window to the background while bringing the immersive space to the foreground. Looking for a solution to keep the main window’s state intact while transitioning between immersive and normal modes.
1
0
170
Mar ’25
MagnifyGesture in RealityView does not work on macOS
I have tested the MagnifyGesture code below on multiple devices: Vision Pro - working iPhone - working iPad - working macOS - not working In Reality Composer Pro, I have also added the below components to the test model entity: Input Target Collision For macOS, I tried the touchpad pinch gesture and mouse scroll wheel, but neither approach works. How to resolve this issue? Thank you. import SwiftUI import RealityKit import RealityKitContent struct ContentView: View { var body: some View { RealityView { content in // Add the initial RealityKit content if let immersiveContentEntity = try? await Entity(named: "Immersive", in: realityKitContentBundle) { content.add(immersiveContentEntity) } } .gesture(MagnifyGesture() .targetedToAnyEntity() .onChanged(onMagnifyChanged) .onEnded(onMagnifyEnded)) } func onMagnifyChanged(_ value: EntityTargetValue<MagnifyGesture.Value>) { print("onMagnifyChanged") } func onMagnifyEnded(_ value: EntityTargetValue<MagnifyGesture.Value>) { print("onMagnifyEnded") } }
0
0
71
Mar ’25
Request: More Fine-Grained Control in Object Capture (PhotogrammetrySession)
Hi Apple Team and Developers, First of all, I’d like to express my appreciation for the incredible results achieved using PhotogrammetrySession. I’ve been developing a portrait scanning app using Object Capture, and in many tests—especially with human models—I’ve found the reconstructed body surfaces are remarkably smooth and clean, often outperforming tools like Metashape and RealityCapture in terms of aesthetic results. However, I’ve encountered some challenges when working with complex areas like long hair overlapping the face. For instance, with female models where strands of hair partially occlude the face, the resulting mesh tends to merge the hair and facial geometry. This leads to distorted or “melted” facial features, likely due to ambiguity in the geometry estimation phase. Feature Suggestion: Would it be possible to allow developers to supply two versions of the input images: • One version (original) for texture generation • A pre-processed version (e.g., contrast-enhanced or CLAHE filtered) to guide mesh reconstruction only This would give us the flexibility to enhance edge features or shadow detail without affecting the final texture appearance. In other photogrammetry pipelines, applying image enhancement selectively before dense reconstruction improves geometry quality in low-contrast areas. Question: Is there any plan to support this kind of two-path workflow in future versions of PhotogrammetrySession? Or perhaps expose more intermediate stages or tunable parameters to developers? Also, any hints on what we can expect from WWDC 2025 regarding improvements to Object Capture or related vision/3D technologies? Thanks again for this powerful API. Looking forward to hearing insights from the team and other developers. Warm regards, KitCheng
0
0
100
May ’25
ARView environment.lighting IBL from HDR file
I have an iOS app that can display a USDZ model downloaded from the Internet (and cached locally) via an ARView. I would like to light that model with an image based light (IBL) also downloaded from the Internet. However, as far as I can tell, ARView can only create an IBL from a resource that has been compiled into the Xcode project and loaded with EnvironmentResource(named:in:) or EnvironmentResource.load(named:in:). Is there a way to create an EnvironmentResource from an HDRI via a file URL to use in ARView in iOS?
0
0
734
Nov ’25
RealityView camera feed not shown
I have two RealityView: ParentView and When click the button in ParentView, ChildView will be shown as full screen cover, but the camera feed in ChildView will not be shown, only black screen. If I show ChildView directly, it works with camera feed. Please help me on this issue? Thanks. import RealityKit import SwiftUI struct ParentView: View{ @State private var showIt = false var body: some View{ ZStack{ RealityView{content in content.camera = .virtual let box = ModelEntity(mesh: MeshResource.generateSphere(radius: 0.2),materials: [createSimpleMaterial(color: .red)]) content.add(box) } Button("Click here"){ showIt = true } } .fullScreenCover(isPresented: $showIt){ ChildView() .overlay( Button("Close"){ showIt = false }.padding(20), alignment: .bottomLeading ) } .ignoresSafeArea(.all) } } import ARKit import RealityKit import SwiftUI struct ChildView: View{ var body: some View{ RealityView{content in content.camera = .spatialTracking } } }
0
0
172
Jun ’25
ARKit / visionOS - handtracking with 3D objects attached on hand
I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
1
0
255
Nov ’25
visionOS 3d interactions like the native keyboard when no longer observed in passthrough
While using apple's vision pro, we noticed that we can continue to use the visionOS keyboard when we no longer actually see it in passthrough. In other words, when we focus on a field to type, visionOS displays the keyboard for us in such a way that we actually see it. Then, we noticed if we look away a little bit, either up, or down, or left, or right, in such a way that the keyboard is no longer visible by us in the passthrough, the keyboard still remains responsive to taps from our fingers at the location where it is. It seems the keyboard remains functional and responsive to taps even though we can no longer observe/see it. We are trying to figure out how to implement similar functionality in our app whereby the user can continue to manipulate a 3d entity when the user can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow). I assume the visionOS keyboard has this functionality thanks to the downward facing sensors on the hardware that allow hand tracking even though the hands can no longer be observed by the user. That is likely how we can rest our hands on our lap is still be able to interact with visionOS. How can we implement a similar functionality for 3D entities? Is there a way to tap in, or to allow hand tracking, from those toward facing cameras? Is it possible to manipulate a 3D entity when it is no longer observed by the user for example when they shift their attention somewhere else in the field of vision? How does the visionOS keyboard achieve this?
1
0
331
Nov ’25
For a third year, no screenshot capability for immersive visionOS apps... here's a workaround?
Since only the user can take a screenshot using the Apple Vision Pro's top buttons, the only workaround available to an immersive app that needs a screenshot to document the user's creative interior design choices is ask the user to take a screenshot wait until the user taps a button indicating the screenshot has been taken then the app asks the user to select the screenshot when the app opens the PhotoPicker when the user presses Done, the screenshot is handed off to the app. One wonders why there is no Apple Api for doing this in a simple privacy protective way such as: When called, the Apple api captures the screenshot in Apple secured memory The api displays the screenshot to the user with appropriate privacy warnings and asks if the user wants to a. share this screenshot with the app, or b. cancel, c. retake the screenshot If the user approves, the app receives the screenshot
3
0
116
Jun ’25
mipmapsMode trade-off?
I am building a 360 photo viewer in VisionOS 26. Which allows the user to choose a 2 by 1 jpg and then renders it with a sphere mesh entity. And I use: TextureResource(contentsOf: url, options: options). I noticed two situations here in terms of mipmaps options. When setting "mipmapsMode: .none": The graphic quality within the "gaze area" looks sharp and clear The two poles (top and bottom) are perfectly rendered Massive shimmer around the "gaze area" When setting "mipmapsMode: .allocateAndGenerateAll": The graphic looks slightly blurrier than in ".none" within the "gaze area" The two poles are very blurry and hard to recognize the texture Much less shimmer around the "gaze area" My question would be: Is there a way to have the perfect graphic quality in ".none" without the massive shimmer? Thank you! Screenshots: mipmapsMode: .none mipmapsMode: .allocateAndGenerateAll
0
0
232
Oct ’25
version update in Vision Pro
Hi, I'm developing an app for Vision Pro using Xcode, while updating the latest update, things that worked in my app suddenly didn't. in my app flow I'm tapping spheres to get their positions, from some reason I get an offset from where I tap to where a marker on that position is showing up. here's the part of code that does that, and a part that is responsible for an alignment that happens afterwards: func loadMainScene(at position: SIMD3) async { guard let content = self.content else { return } do { let rootEntity = try await Entity(named: "surgery 16.09", in: realityKitContentBundle) rootEntity.scale = SIMD3<Float>(repeating: 0.5) rootEntity.generateCollisionShapes(recursive: true) self.modelRootEntity = rootEntity let bounds = rootEntity.visualBounds(relativeTo: nil) print("📏 Model bounds: center=\(bounds.center), extents=\(bounds.extents)") let pivotEntity = Entity() pivotEntity.addChild(rootEntity) self.pivotEntity = pivotEntity let modelAnchor = AnchorEntity(world: [1, 1.3, -0.8]) modelAnchor.addChild(pivotEntity) content.add(modelAnchor) updateModelOpacity(0.5) self.modelAnchor = modelAnchor rootEntity.visit { entity in print("👀 Entity in model: \(entity.name)") if entity.name.lowercased().hasPrefix("focus") { entity.generateCollisionShapes(recursive: true) entity.components.set(InputTargetComponent()) print("🎯 Made tappable: \(entity.name)") } } print("✅ Model loaded with collisions") guard let sphere = placementSphere else { return } let sphereWorldXform = sphere.transformMatrix(relativeTo: nil) var newXform = sphereWorldXform newXform.columns.3.y += 0.1 // move up by 20 cm let gridAnchor = AnchorEntity(world: newXform) self.gridAnchor = gridAnchor content.add(gridAnchor) let baseScene = try await Entity(named: "Scene", in: realityKitContentBundle) let gridSizeX = 18 let gridSizeY = 10 let gridSizeZ = 10 let spacing: Float = 0.05 let startX: Float = -Float(gridSizeX - 1) * spacing * 0.5 + 0.3 let startY: Float = -Float(gridSizeY - 1) * spacing * 0.5 - 0.1 let startZ: Float = -Float(gridSizeZ - 1) * spacing * 0.5 for i in 0..<gridSizeX { for j in 0..<gridSizeY { for k in 0..<gridSizeZ { if j < 2 || j > gridSizeY - 5 { continue } // remove 2 bottom, 4 top let cell = baseScene.clone(recursive: true) cell.name = "Sphere" cell.scale = .one * 0.02 cell.position = [ startX + Float(i) * spacing, startY + Float(j) * spacing, startZ + Float(k) * spacing ] cell.generateCollisionShapes(recursive: true) gridCells.append(cell) gridAnchor.addChild(cell) } } } content.add(gridAnchor) print("✅ Grid added") } catch { print("❌ Failed to load: \(error)") } } private func handleModelOrGridTap(_ tappedEntity: Entity) { guard let modelRootEntity = modelRootEntity else { return } let localPosition = tappedEntity.position(relativeTo: modelRootEntity) let worldPosition = tappedEntity.position(relativeTo: nil) switch tapStep { case 0: modelPointA = localPosition modelAnchor?.addChild(createMarker(at: worldPosition, color: [1, 0, 0])) print("📍 Model point A: \(localPosition)") tapStep += 1 case 1: modelPointB = localPosition modelAnchor?.addChild(createMarker(at: worldPosition, color: [1, 0.5, 0])) print("📍 Model point B: \(localPosition)") tapStep += 1 case 2: targetPointA = worldPosition targetMarkerA = createMarker(at: worldPosition,color: [0, 1, 0]) modelAnchor?.addChild(targetMarkerA!) print("✅ Target point A: \(worldPosition)") tapStep += 1 case 3: targetPointB = worldPosition targetMarkerB = createMarker(at: worldPosition,color: [0, 0, 1]) modelAnchor?.addChild(targetMarkerB!) print("✅ Target point B: \(worldPosition)") alignmentReady = true tapStep += 1 default: print("⚠️ Unexpected tap on model helper at step \(tapStep)") } } func alignModel2Points() { guard let modelPointA = modelPointA, let modelPointB = modelPointB, let targetPointA = targetPointA, let targetPointB = targetPointB, let modelRootEntity = modelRootEntity, let pivotEntity = pivotEntity, let modelAnchor = modelAnchor else { print("❌ Missing data for alignment") return } let modelVec = modelPointB - modelPointA let targetVec = targetPointB - targetPointA let modelLength = length(modelVec) let targetLength = length(targetVec) let scale = targetLength / modelLength let modelDir = normalize(modelVec) let targetDir = normalize(targetVec) var axis = cross(modelDir, targetDir) let axisLength = length(axis) var rotation = simd_quatf() if axisLength < 1e-6 { if dot(modelDir, targetDir) > 0 { rotation = simd_quatf(angle: 0, axis: [0,1,0]) } else { let up: SIMD3<Float> = [0,1,0] axis = cross(modelDir, up) if length(axis) < 1e-6 { axis = cross(modelDir, [1,0,0]) } rotation = simd_quatf(angle: .pi, axis: normalize(axis)) } } else { let dotProduct = dot(modelDir, targetDir) let clampedDot = max(-1.0, min(dotProduct, 1.0)) let angle = acos(clampedDot) rotation = simd_quatf(angle: angle, axis: normalize(axis)) } modelRootEntity.scale = .one * scale modelRootEntity.orientation = rotation let transformedPointA = rotation.act(modelPointA * scale) pivotEntity.position = -transformedPointA modelAnchor.position = targetPointA alignedModelPosition = modelAnchor.position print("✅ Aligned with scale \(scale), rotation \(rotation)")
2
0
322
Oct ’25
RoomPlan CaptureError.exceedSceneSizeLimit on iOS devices
When scanning multiple rooms (10+) in a single structure using ARWorldMap for coordinate space consistency, RoomCaptureSession throws CaptureError.exceedSceneSizeLimit. The instructions here (https://developer.apple.com/documentation/roomplan/scanning-the-rooms-of-a-single-structure) provide exactly what I am doing to keep the underlying ARSession alive (by calling captureSession.stop(pause: false)) and save the results before a user moves to the next room. Scanning 11 or so rooms will cause the user to hit the exceedSceneSizeLimit error. The ARWorldMap is about 58 MB and always is around this size when hitting this issue. No anchors are present and all the data seems to be from tracking data. On iPad devices (where I do not see this issue) the ARWorldMap grows as a significantly slower rate in size. I save the ARWorldMap after each room is scanned and confirmed by the user. If I use the ARMap to initialize the ARSession (as described in the docs) the session will immediately error with "exceedSceneSizeLimit" once the captureSession.run() is executed. Occasionally it will allow me/the user to scan again, but either breaks mid scan or the following. This has been working fine for the past 2 years and users have been able to scan dozens of rooms without issue. It seems only lately that it has been a problem. I would expect the ARWorldMap to be allowed for much bigger sizes. At this point I can just about scan more area of my house with a single scan than I can when I use different captureSessions. Few observations: This happens on my iPhone 15 Pro Max, my iPhone 17 Pro, but not my iPad M4 (maybe memory related?). It is possible if scanning many more rooms it would happen on the iPad too. I have tried things such as resetting the ARConfig on the underlying ARSession to reset some, but this doesn't work. I have tried to create a new ARWorldMap and move the origin to the older map to clear out tracking data. This almost works but causes a mess of issues when a user moves at all due to the unshared coordinate space. I believe there are three active issues regarding this: FB14454922, FB15035788, FB20642944 Could we get an update for this issue? It is a production issue and severely limits my user experience in my production application.
0
0
147
Oct ’25
Depth matrix accuracy with the iPhone 14 Pro and Lidar
Hello Community, I’m currently working with the sample code “CapturingDepthUsingTheLiDARCamera” and using it to capture the depth map of an image taken with the iPhone 14 Pro. From this depth map, I generate a point cloud using the intrinsic camera parameters. I've noticed that objects not facing the camera directly appear distorted in the resulting point cloud. For example: An object with surfaces that are perpendicular to each other appears with a sharper angle in the point cloud — around 60° instead of 90°. My question is: Is this due to the general accuracy limitations of the LiDAR sensor? Or could it be related to the sample code? To obtain the depth map, I’m using: AVCapturePhoto.depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32) Thanks in advance for your help!
0
0
128
Apr ’25
Safari-like toolbar in visionOS
I like the toolbar visionOS's Safari uses for back & forward page, share, etc. It floats above the window. My attempt to do this with ornaments isn't as satisfying as they partially cover the window. My attempts with toolbar haven't produced visible results. Is this Safari-style toolbar for visionOS exposed by Apple in the API's? If so, could someone point me to documentation or sample code? Thanks!
1
0
244
Oct ’25