Screen Capture Implementation Progress
Overview
Implementation of screen capture functionality for SCAR Chat with support for multiple backends:
- Wayland/Hyprland: xdg-desktop-portal + Pipewire
- X11: FFmpeg with x11grab
- Windows: FFmpeg with gdigrab (GDI)
Current Status: 🟡 In Progress
Architecture
Backend Detection
- ✅ Auto-detection via environment variables (WAYLAND_DISPLAY, HYPRLAND_INSTANCE_SIGNATURE)
- ✅ Fallback mechanism (Portal → X11 → Windows)
- ✅ Manual backend selection support
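The detection order above can be sketched as a small pure function. This is a minimal illustration, not the code in screen_capture.cpp: the CaptureBackend enum, the EnvLookup type, and the function names are hypothetical, and the DISPLAY check for X11 is an assumption (the list above names only the two Wayland/Hyprland variables).

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>

// Hypothetical backend identifiers for illustration.
enum class CaptureBackend { Portal, X11, Windows, None };

// Looks up an environment variable; returns an empty string when it is unset.
using EnvLookup = std::function<std::string(const char*)>;

inline std::string systemEnv(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}

// Follows the Portal -> X11 -> Windows fallback order.
inline CaptureBackend detectBackend(const EnvLookup& env = systemEnv) {
    assert(env);  // a lookup function must be supplied
    if (!env("WAYLAND_DISPLAY").empty() ||
        !env("HYPRLAND_INSTANCE_SIGNATURE").empty()) {
        return CaptureBackend::Portal;
    }
    // Assumption: a non-empty DISPLAY means an X server is reachable.
    if (!env("DISPLAY").empty()) {
        return CaptureBackend::X11;
    }
#ifdef _WIN32
    return CaptureBackend::Windows;
#else
    return CaptureBackend::None;
#endif
}
```

Passing the lookup as a parameter keeps the function testable on any machine; production code would call detectBackend() with the default argument, and manual selection would simply bypass it.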
Wayland/Hyprland Implementation (Priority)
Status: 🟡 In Progress - CORRECTED TO USE HYPRLAND PORTAL
Critical Architecture Understanding:
- xdg-desktop-portal-hyprland implements org.freedesktop.portal.ScreenCast (standard API)
- We use sdbus-c++ library (NOT libdbus-1) to communicate with the portal
- The portal handles Hyprland-specific details internally (via hyprland-share-picker)
- From the client's perspective: call the standard portal API → portal shows hyprland-share-picker → get Pipewire stream
Dependencies:
- sdbus-c++ (required for DBus communication) ✅ Installed
- libpipewire-0.3 (for stream handling) ✅ Installed
- xdg-desktop-portal-hyprland (runtime requirement - provides hyprland-share-picker)
- spa-utils (for spa_hook structure)
Implementation Tasks:
- Switch from libdbus-1 to sdbus-c++
- Use the standard org.freedesktop.portal.ScreenCast interface (on the org.freedesktop.portal.Desktop service)
- Update CMakeLists.txt with pkg-config for sdbus-c++ and pipewire
- Screen capture class structure with backend selection
- initPortalConnection() - sdbus session bus connection
- cleanupPortalConnection() - Resource cleanup
- createPortalSession() - Use sdbus-c++ for CreateSession method with unique handles
- selectPortalSources() - Use sdbus-c++ for SelectSources (portal shows hyprland-share-picker)
- startPortalSession() - Use sdbus-c++ for Start method
- openPipeWireRemote() - Get file descriptor from portal using UnixFd
- getStreamsNodeId() - Read the streams array (returned in the Start response) to get the actual node_id
- initPipewire() - Complete implementation with stream connection, listeners, and thread loop
- onStreamProcess() - Frame callback implementation that dequeues buffers and invokes user callback
- onStreamParamChanged() - Handle resolution/format changes and update dimensions
- cleanupPipewire() - Stop thread loop and cleanup resources properly
- Test end-to-end screen capture flow
- Frame buffer memory management optimization
- Error handling and session recovery
- Restore token support for session persistence
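For the frame callback and buffer management tasks, one detail worth sketching is row stride: a frame's rows may be padded, so the stride in bytes can exceed width × bytes per pixel. The helper below is a hypothetical illustration that assumes a 4-byte pixel format (for example BGRx). In the real callback the source pointer and stride would come from the dequeued PipeWire buffer, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies a possibly padded frame into a tightly packed buffer
// (exactly width * 4 bytes per row). Assumes 4 bytes per pixel.
inline std::vector<std::uint8_t> packFrame(const std::uint8_t* src,
                                           std::size_t width,
                                           std::size_t height,
                                           std::size_t strideBytes) {
    constexpr std::size_t kBytesPerPixel = 4;
    const std::size_t rowBytes = width * kBytesPerPixel;
    if (rowBytes == 0 || height == 0) {
        return {};  // nothing to copy
    }
    assert(src != nullptr && strideBytes >= rowBytes);

    std::vector<std::uint8_t> packed(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        // Skip the padding at the end of each source row.
        std::memcpy(packed.data() + y * rowBytes, src + y * strideBytes, rowBytes);
    }
    return packed;
}
```

This version allocates on every frame, which is simple but wasteful; reusing one buffer across frames would be a natural follow-up for the memory management task.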
Notes:
- Standard portal API service name: org.freedesktop.portal.Desktop
- Object path: /org/freedesktop/portal/desktop
- Interface: org.freedesktop.portal.ScreenCast
- When SelectSources is called, xdph automatically launches hyprland-share-picker GUI
- User selection is handled transparently - we just get back session handle + Pipewire node
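Portal methods reply asynchronously through a Response signal on a request object. The portal documentation describes that object's path as the portal object path plus /request/SENDER/TOKEN, where SENDER is the caller's unique bus name with the leading ':' removed and every '.' replaced by '_', and TOKEN is the handle_token passed in the options. Session objects follow the same pattern under /session/. A minimal sketch of that derivation, with hypothetical function names:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Converts a unique bus name such as ":1.42" into the path element "1_42".
inline std::string senderToPathElement(std::string uniqueName) {
    assert(!uniqueName.empty() && uniqueName.front() == ':');
    uniqueName.erase(0, 1);
    std::replace(uniqueName.begin(), uniqueName.end(), '.', '_');
    return uniqueName;
}

// Expected path of the request object for a given handle_token.
inline std::string requestPath(const std::string& uniqueName,
                               const std::string& handleToken) {
    return "/org/freedesktop/portal/desktop/request/" +
           senderToPathElement(uniqueName) + "/" + handleToken;
}

// Expected path of the session object for a given session_handle_token.
inline std::string sessionPath(const std::string& uniqueName,
                               const std::string& sessionToken) {
    return "/org/freedesktop/portal/desktop/session/" +
           senderToPathElement(uniqueName) + "/" + sessionToken;
}
```

Computing the path up front lets the client subscribe to the Response signal before making the call, which avoids missing a fast reply. The token values themselves (for example "scarchat_1") are up to the client and only need to be unique per connection.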
Technical Details:
// XDG Desktop Portal ScreenCast API Workflow:
// 1. org.freedesktop.portal.ScreenCast.CreateSession(options) -> session_handle
// - Creates session object for this screen cast
//
// 2. org.freedesktop.portal.ScreenCast.SelectSources(session_handle, options)
// - options.types: MONITOR(1), WINDOW(2), VIRTUAL(4)
// - options.multiple: allow selecting multiple sources
// - options.cursor_mode: Hidden(1), Embedded(2), Metadata(4)
// - options.persist_mode: DoNotPersist(0), WhileRunning(1), UntilRevoked(2)
//
// 3. org.freedesktop.portal.ScreenCast.Start(session_handle, parent_window, options)
// - User selects screen/window via portal UI
// - Response includes: streams array with [(node_id, properties)]
// - Each stream has: id, position, size, source_type
// - Returns restore_token for future sessions
//
// 4. org.freedesktop.portal.ScreenCast.OpenPipeWireRemote(session_handle) -> fd
// - Returns file descriptor for PipeWire connection
//
// 5. Pipewire Connection:
// - pw_context_connect_fd(context, fd, ...) creates pw_core
// - pw_stream_new() creates the stream on that core
// - pw_stream_add_listener() for frame callbacks
// - pw_stream_connect() with node_id from Step 3 as target_id to start streaming
//
// 6. Frame Processing:
// - on_process() callback receives spa_buffer with frame data
// - Extract video/raw format (RGB, YUV, etc.)
// - Invoke FrameCallback with decoded data
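The SelectSources option values from step 2 can be kept as named constants so that bitmask mistakes are caught early. The sketch below is an assumption-laden illustration: the portal namespace, the SelectSourcesOptions struct, and the helper names are hypothetical, and packing the values into the a{sv} dictionary for the actual DBus call is not shown. Only the numeric values come from the workflow above.

```cpp
#include <cassert>
#include <cstdint>

namespace portal {
// options.types (bitmask)
constexpr std::uint32_t kSourceMonitor = 1;
constexpr std::uint32_t kSourceWindow  = 2;
constexpr std::uint32_t kSourceVirtual = 4;

// options.cursor_mode (exactly one value)
constexpr std::uint32_t kCursorHidden   = 1;
constexpr std::uint32_t kCursorEmbedded = 2;
constexpr std::uint32_t kCursorMetadata = 4;

// options.persist_mode
constexpr std::uint32_t kPersistNone         = 0;
constexpr std::uint32_t kPersistWhileRunning = 1;
constexpr std::uint32_t kPersistUntilRevoked = 2;

// Hypothetical holder for the values sent with SelectSources.
struct SelectSourcesOptions {
    std::uint32_t types = kSourceMonitor;
    bool multiple = false;
    std::uint32_t cursorMode = kCursorHidden;
    std::uint32_t persistMode = kPersistNone;
};

// types may combine flags; cursor_mode must be a single known value.
inline bool isValid(const SelectSourcesOptions& o) {
    const std::uint32_t allTypes = kSourceMonitor | kSourceWindow | kSourceVirtual;
    const bool typesOk = o.types != 0 && (o.types & ~allTypes) == 0;
    const bool cursorOk = o.cursorMode == kCursorHidden ||
                          o.cursorMode == kCursorEmbedded ||
                          o.cursorMode == kCursorMetadata;
    const bool persistOk = o.persistMode <= kPersistUntilRevoked;
    return typesOk && cursorOk && persistOk;
}

// Example preset: let the user pick a monitor or a window, cursor drawn in-frame.
inline SelectSourcesOptions monitorsAndWindows() {
    SelectSourcesOptions o;
    o.types = kSourceMonitor | kSourceWindow;
    o.cursorMode = kCursorEmbedded;
    assert(isValid(o));
    return o;
}
}  // namespace portal
```

A portal backend may support only a subset of these values, so a fuller version would also compare against what the portal advertises before sending them.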
X11 Implementation (Fallback)
Status: 🔴 Not Started
Dependencies:
- FFmpeg (libavformat, libavcodec, libavutil, libavdevice)
- X11 libraries
Implementation Tasks:
- FFmpeg context initialization
- x11grab input device configuration
- Frame extraction and decoding
- Frame callback integration
- Display selection (multi-monitor support)
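For display selection, FFmpeg's x11grab device takes its capture origin as part of the input name, in the form display+x,y (for example :0.0+1920,0), and its capture size through the video_size option. On a typical multi-monitor X11 setup the monitors share one X screen, so selecting a monitor amounts to choosing a rectangle. A minimal sketch of building those two strings, assuming a hypothetical X11Region struct; opening the device through libavformat is not shown:

```cpp
#include <cassert>
#include <string>

// Hypothetical description of the rectangle to capture, in X screen coordinates.
struct X11Region {
    int x;
    int y;
    int width;
    int height;
};

// Builds the x11grab input name, e.g. ":0.0+1920,0".
inline std::string x11grabInput(const std::string& display, const X11Region& r) {
    assert(!display.empty() && r.x >= 0 && r.y >= 0);
    return display + "+" + std::to_string(r.x) + "," + std::to_string(r.y);
}

// Builds the value for the "video_size" option, e.g. "1920x1080".
inline std::string x11grabVideoSize(const X11Region& r) {
    assert(r.width > 0 && r.height > 0);
    return std::to_string(r.width) + "x" + std::to_string(r.height);
}
```

For a second 1920x1080 monitor placed to the right of the first, the region {1920, 0, 1920, 1080} yields the input ":0.0+1920,0" with video_size "1920x1080".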
Windows Implementation (Future)
Status: 🔴 Not Started
Dependencies:
- FFmpeg built with the gdigrab input device (GDI)
Implementation Tasks:
- FFmpeg gdigrab setup
- Frame processing pipeline
- Display enumeration
Testing Plan
Unit Tests
- Backend detection on different environments
- Frame callback invocation
- Start/stop lifecycle
- Memory leak verification
Integration Tests
- Wayland/Hyprland capture on real desktop
- X11 capture verification
- Multi-monitor scenarios
- Permission denial handling
Performance Tests
- Frame rate consistency (target: 30 FPS)
- CPU usage profiling
- Memory usage under continuous capture
Known Issues & Limitations
Current
- All backends are stubs (no actual implementation)
- No frame encoding/compression
- No multi-monitor selection UI
Future Considerations
- Portal permissions may require user interaction each session
- Hyprland-specific optimizations possible via hyprland-share-picker
- Frame rate limiting needed to prevent CPU overload
- Consider hardware encoding for lower CPU usage
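Frame rate limiting can be done by dropping frames that arrive sooner than the target interval after the last accepted one. The class below is a hypothetical sketch, not part of the current implementation. It takes the timestamp as a parameter so it can be exercised without a real capture stream.

```cpp
#include <cassert>
#include <chrono>

// Accepts at most one frame per target interval and rejects the rest.
class FrameRateLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameRateLimiter(int targetFps) : interval_(intervalFor(targetFps)) {}

    // Returns true when the frame arriving at `now` should be forwarded.
    bool shouldAccept(Clock::time_point now) {
        if (hasLast_ && now - last_ < interval_) {
            return false;  // too soon after the last accepted frame
        }
        last_ = now;
        hasLast_ = true;
        return true;
    }

private:
    static Clock::duration intervalFor(int fps) {
        assert(fps > 0);
        return std::chrono::duration_cast<Clock::duration>(std::chrono::seconds(1)) / fps;
    }

    Clock::duration interval_;
    Clock::time_point last_{};
    bool hasLast_ = false;
};
```

In a capture callback this would be called as limiter.shouldAccept(FrameRateLimiter::Clock::now()) before copying or encoding a frame, so rejected frames cost almost nothing. With the 30 FPS target from the testing plan, the interval is roughly 33 ms.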
Code Locations
- Header: client/media/screen_capture.h
- Implementation: client/media/screen_capture.cpp
- Dependencies: CMakeLists.txt (client section)
Next Steps
- Implement Pipewire + Portal screen capture for Wayland/Hyprland
- Test on Hyprland environment
- Implement X11 fallback
- Add frame encoding for network transmission
- Integrate with video streaming protocol
Session Log
Session 1 - December 7, 2025
- Completed: Authentication system fully working (plaintext → salt → argon2 verification)
- Fixed: Message deserialization bug (async buffer capture issue in both client and server)
- Status: Ready to begin screen capture implementation
- Decision: Prioritize Wayland/Hyprland implementation due to target environment
Session 2 - December 7, 2025 (Current)
- Researched: xdg-desktop-portal-hyprland specifications and org.freedesktop.portal.ScreenCast API
- Implemented:
  - Screen capture header with forward declarations for DBus/Pipewire types
  - Basic structure with backend detection and selection
  - DBus initialization and cleanup functions
  - Pipewire initialization skeleton (loop, context creation)
  - Platform-specific compilation (#ifdef __linux__)
  - startPortalCapture() workflow outline (6-step process)
- TODO: Implement actual DBus method calls for portal communication
- Next: Implement createPortalSession() with proper DBus message building