Screen Capture Implementation Progress

Overview

Implementation of screen capture functionality for SCAR Chat with support for multiple backends:

  • Wayland/Hyprland: xdg-desktop-portal + Pipewire
  • X11: FFmpeg with x11grab
  • Windows: FFmpeg with gdigrab (GDI-based capture)

Current Status: 🟡 In Progress


Architecture

Backend Detection

  • Auto-detection via environment variables (WAYLAND_DISPLAY, HYPRLAND_INSTANCE_SIGNATURE)
  • Fallback mechanism (Portal → X11 → Windows)
  • Manual backend selection support

Wayland/Hyprland Implementation (Priority)

Status: 🟡 In Progress - CORRECTED TO USE HYPRLAND PORTAL

Critical Architecture Understanding:

  • xdg-desktop-portal-hyprland implements org.freedesktop.portal.ScreenCast (standard API)
  • We use sdbus-c++ library (NOT libdbus-1) to communicate with the portal
  • The portal handles Hyprland-specific details internally (via hyprland-share-picker)
  • From client perspective: call standard portal API → portal shows hyprland-share-picker → get Pipewire stream

Dependencies:

  • sdbus-c++ (required for D-Bus communication): installed
  • libpipewire-0.3 (for stream handling): installed
  • xdg-desktop-portal-hyprland (runtime requirement - provides hyprland-share-picker)
  • libspa headers (spa/utils/hook.h, for the spa_hook structure)

Implementation Tasks:

  • Switch from libdbus-1 to sdbus-c++
  • Use the standard org.freedesktop.portal.ScreenCast interface on the org.freedesktop.portal.Desktop service
  • Update CMakeLists.txt with pkg-config for sdbus-c++ and pipewire
  • Screen capture class structure with backend selection
  • initPortalConnection() - sdbus session bus connection
  • cleanupPortalConnection() - Resource cleanup
  • createPortalSession() - Use sdbus-c++ for CreateSession method with unique handles
  • selectPortalSources() - Use sdbus-c++ for SelectSources (portal shows hyprland-share-picker)
  • startPortalSession() - Use sdbus-c++ for Start method
  • openPipeWireRemote() - Get file descriptor from portal using UnixFd
  • getStreamsNodeId() - Query session Streams property to get actual node_id
  • initPipewire() - Complete implementation with stream connection, listeners, and thread loop
  • onStreamProcess() - Frame callback implementation that dequeues buffers and invokes user callback
  • onStreamParamChanged() - Handle resolution/format changes and update dimensions
  • cleanupPipewire() - Stop thread loop and cleanup resources properly
  • UI Integration - Add floating screen share button to VideoGridWidget
  • Test end-to-end screen capture flow
  • Frame buffer memory management optimization
  • Error handling and session recovery
  • Restore token support for session persistence
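
For the frame callback and buffer-management items above, one detail worth sketching is row stride: PipeWire reports a per-buffer stride that can be larger than width × bytes-per-pixel, so a frame may need repacking before it is handed to a consumer that expects tightly packed rows. The packFrame() helper below is a hypothetical name and only shows the copy; it assumes a single-plane packed format such as BGRx and does not touch any PipeWire API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies a strided single-plane frame into a tightly packed buffer.
// Returns an empty vector if the arguments are inconsistent.
std::vector<std::uint8_t> packFrame(const std::uint8_t* src,
                                    int width, int height,
                                    int stride, int bytesPerPixel) {
    if (src == nullptr || width <= 0 || height <= 0 || bytesPerPixel <= 0) {
        return {};
    }
    const std::size_t rowBytes =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(bytesPerPixel);
    if (stride < 0 || static_cast<std::size_t>(stride) < rowBytes) {
        return {};  // stride must cover at least one full row of pixels
    }

    std::vector<std::uint8_t> packed(rowBytes * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y) {
        // Skip the padding bytes at the end of each source row.
        std::memcpy(packed.data() + static_cast<std::size_t>(y) * rowBytes,
                    src + static_cast<std::size_t>(y) * static_cast<std::size_t>(stride),
                    rowBytes);
    }
    return packed;
}
```

When stride already equals the packed row size, a single memcpy of the whole frame (or no copy at all) would be the cheaper path; that optimization is left out here.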

UI Integration Details:

  • Added "Share Screen" button to VideoGridWidget
  • Button floats at bottom center of video grid, above video streams
  • Positioned via resize events to stay centered
  • Styled with Discord-like blue theme
  • Toggles between "Share Screen" (blue) and "Stop Sharing" (red)
  • Clicking button calls ScreenCapture::start() which opens xdg-desktop-portal dialog
  • ScreenCapture instance managed by VideoGridWidget
  • Signal screenShareRequested() emitted when sharing starts

Notes:

  • Service (bus) name: org.freedesktop.portal.Desktop
  • Object path: /org/freedesktop/portal/desktop
  • Interface: org.freedesktop.portal.ScreenCast
  • When SelectSources is called, xdph automatically launches hyprland-share-picker GUI
  • User selection is handled transparently - we just get back session handle + Pipewire node

Technical Details:

// XDG Desktop Portal ScreenCast API Workflow:
// 1. org.freedesktop.portal.ScreenCast.CreateSession(options) -> request handle
//    - Creates session object for this screen cast
//    - session_handle is delivered in the request's Response signal
//
// 2. org.freedesktop.portal.ScreenCast.SelectSources(session_handle, options)
//    - options.types: MONITOR(1), WINDOW(2), VIRTUAL(4)
//    - options.multiple: allow selecting multiple sources
//    - options.cursor_mode: Hidden(1), Embedded(2), Metadata(4)
//    - options.persist_mode: DoNotPersist(0), WhileRunning(1), UntilRevoked(2)
//
// 3. org.freedesktop.portal.ScreenCast.Start(session_handle, parent_window, options)
//    - User selects screen/window via portal UI
//    - Response includes: streams array with [(node_id, properties)]
//    - Each stream has: id, position, size, source_type
//    - Returns restore_token for future sessions
//
// 4. org.freedesktop.portal.ScreenCast.OpenPipeWireRemote(session_handle, options) -> fd
//    - Returns file descriptor for PipeWire connection
//
// 5. Pipewire Connection:
//    - pw_context_connect_fd(context, fd, ...) creates pw_core
//    - pw_stream_new() creates the stream on that core
//    - pw_stream_add_listener() for frame callbacks
//    - pw_stream_connect() with node_id from Step 3 as target to start streaming
//
// 6. Frame Processing:
//    - on_process() callback dequeues a pw_buffer (wrapping spa_buffer) via pw_stream_dequeue_buffer()
//    - Read the negotiated video/raw format (RGB, YUV, etc.)
//    - Invoke FrameCallback with the frame data, then requeue via pw_stream_queue_buffer()
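
The numeric values in step 2 can be captured as constants so that SelectSources options are built from names rather than magic numbers. The sketch below is a standalone illustration: the constant and function names are hypothetical, and the preference order in chooseCursorMode() (Embedded, then Hidden, never Metadata) is an assumption chosen because embedded cursors need no extra handling in the frame callback. The `available` arguments stand for the portal's AvailableSourceTypes and AvailableCursorModes properties.

```cpp
#include <cstdint>

// Bit flags for options.types
constexpr std::uint32_t kSourceMonitor = 1;
constexpr std::uint32_t kSourceWindow  = 2;
constexpr std::uint32_t kSourceVirtual = 4;

// Values for options.cursor_mode (exactly one may be requested)
constexpr std::uint32_t kCursorHidden   = 1;
constexpr std::uint32_t kCursorEmbedded = 2;
constexpr std::uint32_t kCursorMetadata = 4;

// Values for options.persist_mode
constexpr std::uint32_t kPersistNone         = 0;
constexpr std::uint32_t kPersistWhileRunning = 1;
constexpr std::uint32_t kPersistUntilRevoked = 2;

// Restricts the requested source types to those the portal advertises.
constexpr std::uint32_t maskSourceTypes(std::uint32_t wanted, std::uint32_t available) {
    return wanted & available;
}

// Picks a single cursor mode from the advertised bitmask.
// Returns 0 when neither preferred mode is available, meaning the
// cursor_mode option should be omitted so the portal uses its default.
constexpr std::uint32_t chooseCursorMode(std::uint32_t available) {
    if (available & kCursorEmbedded) return kCursorEmbedded;
    if (available & kCursorHidden)   return kCursorHidden;
    return 0;
}
```

For example, requesting monitors and windows from a portal that only advertises monitors yields maskSourceTypes(kSourceMonitor | kSourceWindow, kSourceMonitor) == kSourceMonitor.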

X11 Implementation (Fallback)

Status: 🔴 Not Started

Dependencies:

  • FFmpeg (libavformat, libavcodec, libavutil, libavdevice)
  • X11 libraries

Implementation Tasks:

  • FFmpeg context initialization
  • x11grab input device configuration
  • Frame extraction and decoding
  • Frame callback integration
  • Display selection (multi-monitor support)
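
For display selection, x11grab addresses a sub-rectangle of the X screen through its input string, which has the form display.screen+x_offset,y_offset, together with a video_size option of the form WIDTHxHEIGHT. The sketch below only builds those two strings for a given monitor rectangle; CaptureRegion and both function names are hypothetical, and the FFmpeg calls that would consume the strings are deliberately left out.

```cpp
#include <string>

// Hypothetical description of one monitor within the X screen.
struct CaptureRegion {
    int x;       // left offset in pixels
    int y;       // top offset in pixels
    int width;   // capture width in pixels
    int height;  // capture height in pixels
};

// Builds the x11grab input string, e.g. ":0.0+1920,0".
std::string x11grabInput(const std::string& display, const CaptureRegion& region) {
    return display + "+" + std::to_string(region.x) + "," + std::to_string(region.y);
}

// Builds the value for the video_size option, e.g. "1280x720".
std::string x11grabVideoSize(const CaptureRegion& region) {
    return std::to_string(region.width) + "x" + std::to_string(region.height);
}
```

With a second monitor placed to the right of a 1920-pixel-wide primary, the region {1920, 0, 1280, 720} on display ":0.0" gives the input ":0.0+1920,0" and the size "1280x720".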

Windows Implementation (Future)

Status: 🔴 Not Started

Dependencies:

  • FFmpeg built with the gdigrab input device (GDI-based capture)

Implementation Tasks:

  • FFmpeg gdigrab input device setup
  • Frame processing pipeline
  • Display enumeration

Testing Plan

Unit Tests

  • Backend detection on different environments
  • Frame callback invocation
  • Start/stop lifecycle
  • Memory leak verification

Integration Tests

  • Wayland/Hyprland capture on real desktop
  • X11 capture verification
  • Multi-monitor scenarios
  • Permission denial handling

Performance Tests

  • Frame rate consistency (target: 30 FPS)
  • CPU usage profiling
  • Memory usage under continuous capture

Known Issues & Limitations

Current

  • All backends are stubs (no actual implementation)
  • No frame encoding/compression
  • No multi-monitor selection UI

Future Considerations

  • Portal permissions may require user interaction each session
  • Hyprland-specific optimizations possible via hyprland-share-picker
  • Frame rate limiting needed to prevent CPU overload
  • Consider hardware encoding for lower CPU usage
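
One simple way to approach the frame rate limiting item is to drop frames that arrive sooner than the target interval after the last delivered frame. The FrameLimiter class below is a hypothetical sketch of that idea, using the 30 FPS target from the testing plan as an example; the timestamp is passed in by the caller so the logic stays independent of any capture backend.

```cpp
#include <chrono>

// Decides whether a frame should be forwarded, based on the time elapsed
// since the last forwarded frame.
class FrameLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(double targetFps)
        : interval_(std::chrono::duration_cast<Clock::duration>(
              std::chrono::duration<double>(1.0 / targetFps))) {}

    // Returns true if the frame arriving at `now` should be delivered.
    bool shouldEmit(Clock::time_point now) {
        if (hasLast_ && now - last_ < interval_) {
            return false;  // too soon after the previous delivered frame
        }
        last_ = now;
        hasLast_ = true;
        return true;
    }

private:
    Clock::duration interval_;
    Clock::time_point last_{};
    bool hasLast_ = false;
};
```

In a frame callback this would be used as `if (!limiter.shouldEmit(FrameLimiter::Clock::now())) return;` before any copying or encoding. Because the reference point is the arrival time of the last delivered frame, the effective rate can land somewhat below the target when the source rate is not a multiple of it.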

Code Locations

  • Header: client/media/screen_capture.h
  • Implementation: client/media/screen_capture.cpp
  • Dependencies: CMakeLists.txt (client section)

Next Steps

  1. Implement Pipewire + Portal screen capture for Wayland/Hyprland
  2. Test on Hyprland environment
  3. Implement X11 fallback
  4. Add frame encoding for network transmission
  5. Integrate with video streaming protocol

Session Log

Session 1 - December 7, 2025

  • Completed: Authentication system fully working (plaintext → salt → argon2 verification)
  • Fixed: Message deserialization bug (async buffer capture issue in both client and server)
  • Status: Ready to begin screen capture implementation
  • Decision: Prioritize Wayland/Hyprland implementation due to target environment

Session 2 - December 7, 2025 (Current)

  • Researched: xdg-desktop-portal-hyprland specifications and org.freedesktop.portal.ScreenCast API
  • Implemented:
    • Screen capture header with forward declarations for DBus/Pipewire types
    • Basic structure with backend detection and selection
    • DBus initialization and cleanup functions
    • Pipewire initialization skeleton (loop, context creation)
    • Platform-specific compilation (#ifdef __linux__)
    • startPortalCapture() workflow outline (6-step process)
  • TODO: Implement actual DBus method calls for portal communication
  • Next: Implement createPortalSession() with proper DBus message building