# Screen Capture Implementation Progress

## Overview

Implementation of screen capture functionality for SCAR Chat with support for multiple backends:
- **Wayland/Hyprland**: xdg-desktop-portal + Pipewire
- **X11**: FFmpeg with x11grab
- **Windows**: FFmpeg with gdigrab (GDI-based capture)

## Current Status: 🟡 In Progress

---

## Architecture

### Backend Detection
- ✅ Auto-detection via environment variables (WAYLAND_DISPLAY, HYPRLAND_INSTANCE_SIGNATURE)
- ✅ Fallback mechanism (Portal → X11 → Windows)
- ✅ Manual backend selection support
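
The detection order described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the `CaptureBackend` enum, the `envSet()` helper, and the use of `DISPLAY` to recognize an X11 session are assumptions made for illustration.

```cpp
#include <cstdlib>

// Hypothetical backend identifiers; the real enum in screen_capture.h may differ.
enum class CaptureBackend { Portal, X11, Windows, None };

// Returns true when the environment variable exists and is non-empty.
inline bool envSet(const char* name) {
    const char* value = std::getenv(name);
    return value != nullptr && value[0] != '\0';
}

// Mirrors the documented fallback order: Portal -> X11 -> Windows.
inline CaptureBackend detectBackend() {
#if defined(_WIN32)
    return CaptureBackend::Windows;
#else
    // Either variable indicates a Wayland (or specifically Hyprland) session.
    if (envSet("WAYLAND_DISPLAY") || envSet("HYPRLAND_INSTANCE_SIGNATURE")) {
        return CaptureBackend::Portal;
    }
    // Assumption: a non-empty DISPLAY means an X server is reachable.
    if (envSet("DISPLAY")) {
        return CaptureBackend::X11;
    }
    return CaptureBackend::None;
#endif
}
```

Manual backend selection would simply bypass `detectBackend()` and use the caller's choice.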

### Wayland/Hyprland Implementation (Priority)
**Status**: 🟡 In Progress - **CORRECTED TO USE HYPRLAND PORTAL**

**Critical Architecture Understanding**:
- xdg-desktop-portal-hyprland **implements** org.freedesktop.portal.ScreenCast (standard API)
- We use the **sdbus-c++** library (NOT libdbus-1) to communicate with the portal
- The portal handles Hyprland-specific details internally (via hyprland-share-picker)
- From the client's perspective: call the standard portal API → portal shows hyprland-share-picker → receive a Pipewire stream

**Dependencies**:
- **sdbus-c++** (required for DBus communication) ✅ Installed
- **libpipewire-0.3** (for stream handling) ✅ Installed
- **xdg-desktop-portal-hyprland** (runtime requirement - provides hyprland-share-picker)
- **spa-utils** (for spa_hook structure)

**Implementation Tasks**:
- [x] Switch from libdbus-1 to sdbus-c++
- [x] Use the standard org.freedesktop.portal.ScreenCast interface (exposed by the org.freedesktop.portal.Desktop service)
- [x] Update CMakeLists.txt with pkg-config for sdbus-c++ and pipewire
- [x] Screen capture class structure with backend selection
- [x] initPortalConnection() - sdbus session bus connection
- [x] cleanupPortalConnection() - Resource cleanup
- [x] createPortalSession() - Use sdbus-c++ for CreateSession method with unique handles
- [x] selectPortalSources() - Use sdbus-c++ for SelectSources (portal shows hyprland-share-picker)
- [x] startPortalSession() - Use sdbus-c++ for Start method
- [x] openPipeWireRemote() - Get file descriptor from portal using UnixFd
- [x] getStreamsNodeId() - Query session Streams property to get actual node_id
- [x] initPipewire() - Complete implementation with stream connection, listeners, and thread loop
- [x] onStreamProcess() - Frame callback implementation that dequeues buffers and invokes user callback
- [x] onStreamParamChanged() - Handle resolution/format changes and update dimensions
- [x] cleanupPipewire() - Stop thread loop and cleanup resources properly
- [ ] Test end-to-end screen capture flow
- [ ] Frame buffer memory management optimization
- [ ] Error handling and session recovery
- [ ] Restore token support for session persistence
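
For the frame callback and the open frame buffer task, one detail worth illustrating is row stride: a captured frame's rows may be padded, so the bytes handed to the user callback may need repacking. The sketch below is an assumption about how that copy could be done, not the code in `onStreamProcess()`. `packFrame()` is a hypothetical helper, and the stride is assumed to come from the dequeued buffer's chunk metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies a frame whose rows start `stride` bytes apart into a tightly packed
// buffer holding exactly width * bytesPerPixel bytes per row.
inline std::vector<std::uint8_t> packFrame(const std::uint8_t* src,
                                           std::size_t width,
                                           std::size_t height,
                                           std::size_t stride,
                                           std::size_t bytesPerPixel) {
    const std::size_t rowBytes = width * bytesPerPixel;
    assert(stride >= rowBytes && "stride must cover a full row");

    std::vector<std::uint8_t> packed(rowBytes * height);
    if (rowBytes == 0) {
        return packed;  // nothing to copy for an empty frame
    }
    for (std::size_t y = 0; y < height; ++y) {
        // Skip the padding at the end of each source row.
        std::memcpy(packed.data() + y * rowBytes, src + y * stride, rowBytes);
    }
    return packed;
}
```

When the stride already equals `width * bytesPerPixel`, a single copy (or no copy at all) would be cheaper; that is the kind of optimization the open memory management task could cover.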

**Notes**:
- Service name: `org.freedesktop.portal.Desktop`
- Object path: `/org/freedesktop/portal/desktop`
- Interface: `org.freedesktop.portal.ScreenCast`
- When SelectSources is called, xdph automatically launches hyprland-share-picker GUI
- User selection is handled transparently - we just get back session handle + Pipewire node
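
The "unique handles" used by the portal calls relate to the object path above: the portal documentation describes Request objects living under `/org/freedesktop/portal/desktop/request/SENDER/TOKEN`, where SENDER is the caller's unique bus name with the leading `:` removed and every `.` replaced by `_`, and TOKEN is the `handle_token` passed in the options. A minimal sketch of building that path so the Response signal can be subscribed to before the call is made; `requestPath()` is a hypothetical helper name, not a function from the codebase.

```cpp
#include <algorithm>
#include <string>

// Builds the Request object path the portal is expected to use for a call,
// given the caller's unique bus name (e.g. ":1.42") and its handle_token.
inline std::string requestPath(std::string uniqueName, const std::string& token) {
    // Drop the leading ':' of the unique name.
    if (!uniqueName.empty() && uniqueName.front() == ':') {
        uniqueName.erase(0, 1);
    }
    // '.' is not valid inside an object path element, so it becomes '_'.
    std::replace(uniqueName.begin(), uniqueName.end(), '.', '_');
    return "/org/freedesktop/portal/desktop/request/" + uniqueName + "/" + token;
}
```

For example, a connection named `:1.42` using the token `scar_1` would listen on `/org/freedesktop/portal/desktop/request/1_42/scar_1`.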

**Technical Details**:
```cpp
// XDG Desktop Portal ScreenCast API Workflow:
// 1. org.freedesktop.portal.ScreenCast.CreateSession(options) -> request handle
//    - Creates session object for this screen cast
//    - session_handle is delivered in the Request's Response signal
//
// 2. org.freedesktop.portal.ScreenCast.SelectSources(session_handle, options)
//    - options.types: MONITOR(1), WINDOW(2), VIRTUAL(4)
//    - options.multiple: allow selecting multiple sources
//    - options.cursor_mode: Hidden(1), Embedded(2), Metadata(4)
//    - options.persist_mode: DoNotPersist(0), WhileRunning(1), UntilRevoked(2)
//
// 3. org.freedesktop.portal.ScreenCast.Start(session_handle, parent_window, options)
//    - User selects screen/window via portal UI
//    - Response includes: streams array with [(node_id, properties)]
//    - Each stream has: id, position, size, source_type
//    - Returns restore_token for future sessions (when persist_mode is set)
//
// 4. org.freedesktop.portal.ScreenCast.OpenPipeWireRemote(session_handle, options) -> fd
//    - Returns file descriptor for PipeWire connection
//
// 5. Pipewire Connection:
//    - pw_context_connect_fd(fd) creates pw_core
//    - pw_stream_new() creates the stream on that core
//    - pw_stream_add_listener() for frame callbacks
//    - pw_stream_connect() with node_id from Step 3 as target to start streaming
//
// 6. Frame Processing:
//    - on_process() callback receives spa_buffer with frame data
//    - Extract video/raw format (RGB, YUV, etc.)
//    - Invoke FrameCallback with the frame data
```
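
The bit values listed for SelectSources can be kept as named constants so call sites do not carry magic numbers. The following is a sketch under stated assumptions: the `portal` namespace, the constant names, and `pickCursorMode()` are all hypothetical, and the fallback order (embedded, then hidden, then metadata) is an illustrative choice rather than anything the portal prescribes.

```cpp
#include <cstdint>
#include <initializer_list>

namespace portal {

// Source type bits (options.types).
constexpr std::uint32_t kSourceMonitor = 1;
constexpr std::uint32_t kSourceWindow  = 2;
constexpr std::uint32_t kSourceVirtual = 4;

// Cursor mode bits (options.cursor_mode).
constexpr std::uint32_t kCursorHidden   = 1;
constexpr std::uint32_t kCursorEmbedded = 2;
constexpr std::uint32_t kCursorMetadata = 4;

// Persist modes (options.persist_mode); these are plain values, not bits.
constexpr std::uint32_t kPersistNone         = 0;
constexpr std::uint32_t kPersistWhileRunning = 1;
constexpr std::uint32_t kPersistUntilRevoked = 2;

// Returns `wanted` if the portal advertises it, otherwise the first advertised
// mode in the order embedded, hidden, metadata, or 0 if none is available.
inline std::uint32_t pickCursorMode(std::uint32_t available, std::uint32_t wanted) {
    if (available & wanted) {
        return wanted;
    }
    for (std::uint32_t mode : {kCursorEmbedded, kCursorHidden, kCursorMetadata}) {
        if (available & mode) {
            return mode;
        }
    }
    return 0;
}

}  // namespace portal
```

Requesting both monitors and windows would then read `portal::kSourceMonitor | portal::kSourceWindow`, and the `available` mask would come from whatever cursor modes the running portal reports.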

### X11 Implementation (Fallback)
**Status**: 🔴 Not Started

**Dependencies**:
- FFmpeg (libavformat, libavcodec, libavutil, libavdevice)
- X11 libraries

**Implementation Tasks**:
- [ ] FFmpeg context initialization
- [ ] x11grab input device configuration
- [ ] Frame extraction and decoding
- [ ] Frame callback integration
- [ ] Display selection (multi-monitor support)
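
Since this backend is not started, the following is only a sketch of how display selection might feed x11grab. FFmpeg documents the x11grab input name as `[hostname]:display_number.screen_number[+x_offset,y_offset]`, with the capture size given separately through the `video_size` option. The `CaptureRegion` struct and both helper functions are hypothetical, and it is assumed that each monitor is a rectangle inside a single X screen.

```cpp
#include <string>

// Hypothetical description of one monitor's rectangle inside the X screen.
struct CaptureRegion {
    int x;
    int y;
    int width;
    int height;
};

// Formats the x11grab input name, e.g. ":0.0+1920,0" for a monitor that
// starts 1920 pixels from the left edge of display ":0.0".
inline std::string x11grabInput(const std::string& display, const CaptureRegion& r) {
    return display + "+" + std::to_string(r.x) + "," + std::to_string(r.y);
}

// Formats the value for the video_size option as "WIDTHxHEIGHT".
inline std::string x11grabVideoSize(const CaptureRegion& r) {
    return std::to_string(r.width) + "x" + std::to_string(r.height);
}
```

The two strings would be passed to the device as the input name and as an option, respectively, once the FFmpeg context setup exists.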

### Windows Implementation (Future)
**Status**: 🔴 Not Started

**Dependencies**:
- FFmpeg with gdigrab (GDI-based capture device)

**Implementation Tasks**:
- [ ] FFmpeg GDI grabber setup
- [ ] Frame processing pipeline
- [ ] Display enumeration

---

## Testing Plan

### Unit Tests
- [ ] Backend detection on different environments
- [ ] Frame callback invocation
- [ ] Start/stop lifecycle
- [ ] Memory leak verification

### Integration Tests
- [ ] Wayland/Hyprland capture on real desktop
- [ ] X11 capture verification
- [ ] Multi-monitor scenarios
- [ ] Permission denial handling

### Performance Tests
- [ ] Frame rate consistency (target: 30 FPS)
- [ ] CPU usage profiling
- [ ] Memory usage under continuous capture

---

## Known Issues & Limitations

### Current
- X11 and Windows backends are stubs (no actual implementation); the Wayland/Hyprland path is implemented but not yet tested end-to-end
- No frame encoding/compression
- No multi-monitor selection UI

### Future Considerations
- Portal permissions may require user interaction each session (restore tokens are not yet supported)
- Hyprland-specific optimizations possible via hyprland-share-picker
- Frame rate limiting needed to prevent CPU overload
- Consider hardware encoding for lower CPU usage
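
Frame rate limiting could be as simple as dropping frames that arrive too soon after the last accepted one. A minimal sketch, assuming the 30 FPS target from the testing plan; `FrameRateLimiter` is a hypothetical class, and taking the timestamp as a parameter is a design choice that keeps it testable without sleeping.

```cpp
#include <chrono>

// Accepts at most one frame per 1/targetFps seconds and rejects the rest.
class FrameRateLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameRateLimiter(double targetFps)
        : interval_(std::chrono::duration_cast<Clock::duration>(
              std::chrono::duration<double>(1.0 / targetFps))) {}

    // Returns true if the frame arriving at `now` should be forwarded.
    bool shouldAccept(Clock::time_point now) {
        if (hasLast_ && now - last_ < interval_) {
            return false;  // too soon after the previous accepted frame
        }
        last_ = now;
        hasLast_ = true;
        return true;
    }

private:
    Clock::duration interval_;
    Clock::time_point last_{};
    bool hasLast_ = false;
};
```

In the capture path, the frame callback would call `shouldAccept(FrameRateLimiter::Clock::now())` and skip the user callback when it returns false.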

---

## Code Locations
- **Header**: `client/media/screen_capture.h`
- **Implementation**: `client/media/screen_capture.cpp`
- **Dependencies**: `CMakeLists.txt` (client section)

---

## Next Steps
1. Implement Pipewire + Portal screen capture for Wayland/Hyprland
2. Test on Hyprland environment
3. Implement X11 fallback
4. Add frame encoding for network transmission
5. Integrate with video streaming protocol

---

## Session Log

### Session 1 - December 7, 2025
- **Completed**: Authentication system fully working (plaintext → salt → argon2 verification)
- **Fixed**: Message deserialization bug (async buffer capture issue in both client and server)
- **Status**: Ready to begin screen capture implementation
- **Decision**: Prioritize Wayland/Hyprland implementation due to target environment

### Session 2 - December 7, 2025 (Current)
- **Researched**: xdg-desktop-portal-hyprland specifications and org.freedesktop.portal.ScreenCast API
- **Implemented**:
  - Screen capture header with forward declarations for DBus/Pipewire types
  - Basic structure with backend detection and selection
  - DBus initialization and cleanup functions
  - Pipewire initialization skeleton (loop, context creation)
  - Platform-specific compilation (#ifdef __linux__)
  - startPortalCapture() workflow outline (6-step process)
- **TODO**: Implement actual DBus method calls for portal communication
- **Next**: Implement createPortalSession() with proper DBus message building