Aperture UI
Engineer Notes

Command Executor System

Overview

The Command Executor System is a high-performance, low-overhead command pattern implementation designed for APHTML's layout, rendering, and scripting operations. It provides frame-based memory allocation, lock-free submission paths, dependency tracking, and pluggable backend execution strategies.

The system decouples command creation from execution, enabling deterministic layout passes, parallel work submission, and integration with the nsTaskSystem for multi-threaded workloads.

Architecture

Core Components

1. IAPCCommand

Location: Code/ApertureUISource/APHTML/CommandExecutor/IAPCCommand.h

Base interface for all commands in the system. Commands encapsulate units of work that can be executed on different threads with configurable priorities and dependencies.

Key Features:

  • Command Types: CSS, Composition, Layout, Rendering, Scripting, Presentation, Custom
  • Run Types: Thread affinity control (AnyThread, FreeThread_CSS, FreeThread_Layout, etc.)
  • Priority System: Immediate, High, Normal, Low, Deferred
  • Dependency Tracking: Commands can depend on other commands via AddDependency()
  • Cancellation Support: Commands can be canceled before or during execution
  • Function Wrapper: Supports lambda capture via SetFunction()

Priority Mapping:

CommandPriority::Immediate  → Executes synchronously during Flush()
CommandPriority::High       → nsTaskPriority::EarlyThisFrame
CommandPriority::Normal     → nsTaskPriority::ThisFrame (default)
CommandPriority::Low        → nsTaskPriority::LateThisFrame
CommandPriority::Deferred   → nsTaskPriority::NextFrame
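
As a rough illustration, the mapping above could be written as a switch. This is a minimal sketch rather than the engine's code: TaskPriority is a stand-in for nsTaskPriority, and TryMapToTaskPriority is a hypothetical helper name chosen for this example.

```cpp
#include <cassert>

// Stand-in enums so the sketch compiles without the engine headers.
enum class CommandPriority { Immediate, High, Normal, Low, Deferred };
enum class TaskPriority { EarlyThisFrame, ThisFrame, LateThisFrame, NextFrame };

// Hypothetical helper. Returns false for Immediate, which runs inline during
// Flush() and therefore has no task priority. Otherwise writes the mapped
// priority to 'out' and returns true.
bool TryMapToTaskPriority(CommandPriority priority, TaskPriority& out)
{
    switch (priority)
    {
        case CommandPriority::Immediate: return false;
        case CommandPriority::High:      out = TaskPriority::EarlyThisFrame; return true;
        case CommandPriority::Normal:    out = TaskPriority::ThisFrame;      return true;
        case CommandPriority::Low:       out = TaskPriority::LateThisFrame;  return true;
        case CommandPriority::Deferred:  out = TaskPriority::NextFrame;      return true;
    }
    assert(false && "unhandled CommandPriority");
    return false;
}
```

Returning a flag instead of a sentinel value keeps the Immediate case explicit at the call site.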

Example:

auto* pCommand = CommandArena::GetInstance().Allocate<IAPCCommand>(
    CommandType::Layout, 
    Runtype::FreeThread_Layout, 
    CommandPriority::High
);

pCommand->SetFunction([element]() {
    // Layout work here
});

APCCommandSubmitter::GetInstance().Submit(pCommand);

2. IAPCCommandList

Location: Code/ApertureUISource/APHTML/CommandExecutor/IAPCCommandList.h

Container for grouping related commands. Command lists belong to a queue and can be verified/committed as a unit.

Features:

  • Aggregates multiple commands
  • Type and runtype classification
  • Batch verification via VerifyAndCommitCommands()
  • Association with parent IAPCCommandQueue

3. IAPCCommandQueue

Location: Code/ApertureUISource/APHTML/CommandExecutor/IAPCCommandQueue.h

Queue interface for managing command list execution. Supports both resident command lists (persistent across frames) and lone command lists (single-use).

Operations:

  • AddCommandList(): Adds a persistent command list
  • AddLoneCommandList(): Adds a single-use command list
  • Execute(): Executes all queued command lists with optional priority sorting
  • ClearQueue(): Removes all command lists
  • Thread-safe via internal mutex (m_mutex)
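
The difference between resident and lone command lists can be sketched as follows. This is an illustration under assumptions, not the IAPCCommandQueue implementation: CommandListSketch and CommandQueueSketch are hypothetical stand-ins, lists are owned by the caller, and priority sorting is left out.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <vector>

// Stand-in for IAPCCommandList: an ordered batch of callables.
struct CommandListSketch
{
    std::vector<std::function<void()>> commands;
};

class CommandQueueSketch
{
public:
    // Resident lists stay queued and run again on every Execute().
    void AddCommandList(CommandListSketch* pList)
    {
        assert(pList != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_resident.push_back(pList);
    }

    // Lone lists run on the next Execute() and are then dropped.
    void AddLoneCommandList(CommandListSketch* pList)
    {
        assert(pList != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_lone.push_back(pList);
    }

    void Execute()
    {
        std::vector<CommandListSketch*> toRun;
        {
            // Snapshot under the lock, then run outside it so that a
            // command may enqueue further lists without deadlocking.
            std::lock_guard<std::mutex> lock(m_mutex);
            toRun = m_resident;
            toRun.insert(toRun.end(), m_lone.begin(), m_lone.end());
            m_lone.clear();
        }
        for (CommandListSketch* pList : toRun)
            for (const std::function<void()>& command : pList->commands)
                command();
    }

    void ClearQueue()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_resident.clear();
        m_lone.clear();
    }

private:
    std::mutex m_mutex;
    std::vector<CommandListSketch*> m_resident;
    std::vector<CommandListSketch*> m_lone;
};
```

In this sketch a resident list is re-run on every Execute() until ClearQueue() is called, while a lone list runs exactly once.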

4. APCCommandSubmitter

Location: Code/ApertureUISource/APHTML/CommandExecutor/APCCommandSubmitter.h

Global singleton providing fast command submission with a lock-free ring-buffer design. This is the primary API for submitting commands from any subsystem.

Design:

  • Ring Buffer: 1024-slot lock-free circular buffer for low-contention submission
  • Overflow Array: Mutex-protected fallback when ring buffer is full
  • Backend Abstraction: Delegates execution to pluggable IAPCCommandQueueBackend
  • Immediate Execution: Commands with Immediate priority execute synchronously during Flush()
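
To make the ring-buffer-plus-overflow idea concrete, here is a minimal standalone sketch. It uses a bounded queue in the style of Dmitry Vyukov's MPMC queue, which may differ from what APCCommandSubmitter actually does; RingSketch and SubmitterSketch are hypothetical names, and a tiny capacity is used in place of 1024 so the overflow path is easy to exercise.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

template <typename T, std::size_t Capacity>
class RingSketch
{
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two so indices can be masked");
public:
    RingSketch()
    {
        for (std::size_t i = 0; i < Capacity; ++i)
            m_slots[i].sequence.store(i, std::memory_order_relaxed);
    }

    bool TryPush(const T& value) // returns false when the ring is full
    {
        std::size_t pos = m_writePos.load(std::memory_order_relaxed);
        for (;;)
        {
            Slot& slot = m_slots[pos & (Capacity - 1)];
            const std::size_t seq = slot.sequence.load(std::memory_order_acquire);
            const std::intptr_t diff =
                static_cast<std::intptr_t>(seq) - static_cast<std::intptr_t>(pos);
            if (diff == 0)
            {
                // Slot is free for this position: try to claim it.
                if (m_writePos.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed))
                {
                    slot.value = value;
                    slot.sequence.store(pos + 1, std::memory_order_release);
                    return true;
                }
            }
            else if (diff < 0)
                return false;
            else
                pos = m_writePos.load(std::memory_order_relaxed);
        }
    }

    bool TryPop(T& out) // returns false when the ring is empty
    {
        std::size_t pos = m_readPos.load(std::memory_order_relaxed);
        for (;;)
        {
            Slot& slot = m_slots[pos & (Capacity - 1)];
            const std::size_t seq = slot.sequence.load(std::memory_order_acquire);
            const std::intptr_t diff =
                static_cast<std::intptr_t>(seq) - static_cast<std::intptr_t>(pos + 1);
            if (diff == 0)
            {
                if (m_readPos.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed))
                {
                    out = slot.value;
                    slot.sequence.store(pos + Capacity, std::memory_order_release);
                    return true;
                }
            }
            else if (diff < 0)
                return false;
            else
                pos = m_readPos.load(std::memory_order_relaxed);
        }
    }

private:
    struct Slot { std::atomic<std::size_t> sequence{0}; T value{}; };
    Slot m_slots[Capacity];
    std::atomic<std::size_t> m_writePos{0};
    std::atomic<std::size_t> m_readPos{0};
};

template <typename T, std::size_t Capacity>
class SubmitterSketch
{
public:
    void Submit(const T& command)
    {
        if (m_ring.TryPush(command))
            return; // lock-free fast path
        std::lock_guard<std::mutex> lock(m_overflowMutex); // slow path
        m_overflow.push_back(command);
        m_overflowCount.fetch_add(1, std::memory_order_relaxed);
    }

    std::vector<T> Drain() // ring contents first, then the overflow array
    {
        std::vector<T> drained;
        T item{};
        while (m_ring.TryPop(item))
            drained.push_back(item);
        std::lock_guard<std::mutex> lock(m_overflowMutex);
        drained.insert(drained.end(), m_overflow.begin(), m_overflow.end());
        m_overflow.clear();
        return drained;
    }

    std::uint32_t GetOverflowCount() const { return m_overflowCount.load(std::memory_order_relaxed); }

private:
    RingSketch<T, Capacity> m_ring;
    std::mutex m_overflowMutex;
    std::vector<T> m_overflow;
    std::atomic<std::uint32_t> m_overflowCount{0};
};
```

The overflow counter only grows on the slow path, which is why a non-zero value signals that the ring capacity is too small for the submission rate.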

Usage Pattern:

// Initialization (optional custom backend)
APCCommandSubmitter::SetBackendFactory([]() {
    return nsUniquePtr<IAPCCommandQueueBackend>(NS_DEFAULT_NEW(CustomBackend));
});

// Submit commands (lock-free fast path)
auto& submitter = APCCommandSubmitter::GetInstance();
submitter.Submit(layoutCommand);
submitter.Submit(renderCommand);

// Execute all submitted commands
submitter.Flush();

// Optional: block until complete
submitter.WaitForAll();

// Frame boundary reset
submitter.Reset();

Performance Monitoring:

nsUInt32 overflows = submitter.GetOverflowCount();
if (overflows > 0) {
    // Ring buffer too small, consider increasing kRingBufferCapacity
}
submitter.ResetOverflowCount();

5. IAPCCommandQueueBackend

Location: Code/ApertureUISource/APHTML/CommandExecutor/IAPCCommandQueueBackend.h

Abstract interface for backend execution strategies. Enables custom threading models without changing command submission code.

Required Operations:

  • Submit(IAPCCommand*): Queue a command for execution
  • Flush(): Begin executing all submitted commands
  • WaitForAll(): Block until all commands complete
  • Cancel(IAPCCommand*): Attempt to cancel a command
  • Reset(): Reset backend state for new frame

6. APCTaskSystemBackend (Default)

Location: Code/ApertureUISource/APHTML/CommandExecutor/APCTaskSystemBackend.h

Default backend providing deterministic, synchronous execution. Commands are sorted by priority and then run in dependency order, which is critical for layout operations where parent elements must be processed before children.

Execution Strategy:

  1. Commands are collected during Submit() calls
  2. Flush() sorts commands by priority
  3. Commands execute synchronously in topological dependency order
  4. Canceled commands are skipped
  5. All work completes before Flush() returns

Note: Despite the name "TaskSystemBackend", this backend executes synchronously to maintain layout correctness. Future backends could leverage nsTaskSystem for parallel execution where dependency ordering allows.
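
The execution strategy listed above can be sketched in a few lines. This is an assumption-laden illustration, not the APCTaskSystemBackend source: SketchCommand and FlushSketch are hypothetical, a canceled command is assumed to still unblock its dependents, and a simple repeated scan is used where a production backend would more likely use pending-dependency counters.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

enum class CommandPriority { Immediate, High, Normal, Low, Deferred };

struct SketchCommand
{
    CommandPriority priority = CommandPriority::Normal;
    std::vector<SketchCommand*> dependencies;
    std::function<void()> work;
    bool canceled = false;
    bool finished = false;
};

// Runs every command synchronously: priority order first, but never before
// all of a command's dependencies have finished.
void FlushSketch(std::vector<SketchCommand*> commands)
{
    // Stable sort keeps submission order among commands of equal priority.
    std::stable_sort(commands.begin(), commands.end(),
        [](const SketchCommand* a, const SketchCommand* b) { return a->priority < b->priority; });

    bool progressed = true;
    while (progressed)
    {
        progressed = false;
        for (SketchCommand* cmd : commands)
        {
            if (cmd->finished)
                continue;
            const bool ready = std::all_of(cmd->dependencies.begin(), cmd->dependencies.end(),
                [](const SketchCommand* dep) { return dep->finished; });
            if (!ready)
                continue;
            if (!cmd->canceled && cmd->work)
                cmd->work(); // canceled commands are skipped
            cmd->finished = true;
            progressed = true;
        }
    }

    // Anything left unfinished indicates a cycle or an unsubmitted dependency.
    assert(std::all_of(commands.begin(), commands.end(),
        [](const SketchCommand* cmd) { return cmd->finished; }));
}
```

Because the loop only returns once no command can make progress, all work has completed by the time FlushSketch() returns, mirroring step 5 above.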

7. APCCommandTask

Location: Code/ApertureUISource/APHTML/CommandExecutor/APCCommandTask.h

Internal nsTask wrapper that bridges IAPCCommand with nsTaskSystem. Allocated from CommandArena for frame-based deallocation.

Lifecycle:

auto* pTask = CommandArena::GetInstance().Allocate<APCCommandTask>(pCommand);
pTask->ConfigureTask("APCCommandTask", nsTaskNesting::Never);
nsTaskSystem::StartSingleTask(pTask, taskPriority);

8. CommandArena

Location: Code/ApertureUISource/APHTML/CommandExecutor/CommandArena.h

Frame-based linear allocator for command objects. Provides fast, cache-friendly allocation with bulk deallocation at frame boundaries.

Design:

  • Wraps nsDoubleBufferedLinearAllocator
  • Allocations remain valid at least until the end of the current frame
  • No per-object deallocation overhead
  • Allocations are bulk-freed by Swap() at the frame boundary
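
For readers unfamiliar with the pattern, a generic double-buffered bump allocator can be sketched as below. DoubleBufferedArenaSketch is a hypothetical stand-in and makes its own assumptions: Swap() flips to the other buffer and resets it, so memory is reused two swaps after it was handed out, and only trivially destructible types are accepted because no destructors run. Whether CommandArena and nsDoubleBufferedLinearAllocator behave exactly this way is not stated in these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

class DoubleBufferedArenaSketch
{
public:
    explicit DoubleBufferedArenaSketch(std::size_t bytesPerBuffer)
    {
        assert(bytesPerBuffer > 0);
        m_buffers[0].resize(bytesPerBuffer);
        m_buffers[1].resize(bytesPerBuffer);
    }

    // Bump-allocates a T in the current buffer. Not thread-safe.
    // Returns nullptr when the current buffer is exhausted.
    template <typename T, typename... Args>
    T* Allocate(Args&&... args)
    {
        static_assert(std::is_trivially_destructible<T>::value,
                      "the sketch never runs destructors");
        std::vector<unsigned char>& buffer = m_buffers[m_current];
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer.data());
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignof(T)) - 1;
        const std::uintptr_t aligned = (base + m_offset + mask) & ~mask;
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start + sizeof(T) > buffer.size())
            return nullptr;
        m_offset = start + sizeof(T);
        return new (buffer.data() + start) T(std::forward<Args>(args)...);
    }

    // Frame boundary: switch buffers and reuse the other one from the start.
    // No per-object work is done, so the cost is constant.
    void Swap()
    {
        m_current ^= 1u;
        m_offset = 0;
    }

    std::size_t BytesUsed() const { return m_offset; }

private:
    std::vector<unsigned char> m_buffers[2];
    std::size_t m_current = 0;
    std::size_t m_offset = 0;
};
```

Freeing is a single offset reset, which is where the "no per-object deallocation overhead" property comes from.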

Lifecycle:

// Startup (called during APHTML initialization)
CommandArena::Startup();

// Per-frame allocation
auto* pCommand = CommandArena::GetInstance().Allocate<LayoutBlockCommand>(
    pEngine, pElement, availableSpace
);

// Frame boundary (after nsFrameAllocator::Swap())
CommandArena::Swap();  // Bulk-frees previous frame's allocations

// Shutdown
CommandArena::Shutdown();

Thread Safety: Allocate() is NOT thread-safe. Allocate all commands from a single thread (typically the main thread).

9. IAPCJobHandle

Location: Code/ApertureUISource/APHTML/CommandExecutor/IAPCJobHandle.h

Handle for tracking asynchronous command completion. Wraps std::shared_future<void> for polling or blocking on command results.

Usage:

IAPCJobHandle handle = GetJobHandleFromAsyncOperation();

// Poll for completion
if (handle.IsComplete()) {
    // Job finished
}

// Block until complete
handle.Wait();

Typed Variant:

IAPCTypedJobHandle<nsVec2> sizeHandle = ComputeIntrinsicSizeAsync();
nsVec2 result = sizeHandle.Get();  // Blocks until result available
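
A handle of this kind can be sketched on top of std::shared_future. The classes below are hypothetical stand-ins (JobHandleSketch, TypedJobHandleSketch) written for illustration; the real IAPCJobHandle may expose a different interface beyond the IsComplete(), Wait(), and Get() calls shown above.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <utility>

class JobHandleSketch
{
public:
    explicit JobHandleSketch(std::shared_future<void> future)
        : m_future(std::move(future))
    {
        assert(m_future.valid());
    }

    // Non-blocking poll: a zero-length wait reports whether the result is set.
    bool IsComplete() const
    {
        return m_future.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    }

    // Blocks until the producer fulfils the promise.
    void Wait() const { m_future.wait(); }

private:
    std::shared_future<void> m_future;
};

template <typename T>
class TypedJobHandleSketch
{
public:
    explicit TypedJobHandleSketch(std::shared_future<T> future)
        : m_future(std::move(future))
    {
        assert(m_future.valid());
    }

    bool IsComplete() const
    {
        return m_future.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    }

    // Blocks until the value is available, then returns a reference to it.
    const T& Get() const { return m_future.get(); }

private:
    std::shared_future<T> m_future;
};
```

Because std::shared_future is copyable, several subsystems can hold copies of the same handle and wait on it independently.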

Specialized Commands

Layout Commands

Location: Code/ApertureUISource/APHTML/CommandExecutor/LayoutCommands.h

Concrete command implementations for layout operations:

  • LayoutStyleCommand: Computes used CSS property values (Priority: High)
  • LayoutIntrinsicSizeCommand: Computes min/max content sizes (Priority: High)
  • LayoutBlockCommand: Block formatting context layout (Priority: Normal)
  • LayoutInlineCommand: Inline formatting context layout (Priority: Normal)
  • LayoutFlexCommand: Flexbox layout algorithm (Priority: Normal)
  • LayoutGridCommand: CSS Grid layout algorithm (Priority: Normal)
  • LayoutPositioningCommand: Applies positioning offsets (Priority: Low)

Common Pattern:

class LayoutBlockCommand : public LayoutCommandBase
{
public:
    LayoutBlockCommand(
        layout::LayoutEngine* pEngine,
        Element* pElement,
        const layout::AvailableSpace& availableSpace)
        : LayoutCommandBase(pEngine, pElement, availableSpace, CommandPriority::Normal)
    {
    }

    void Execute() override;
};

All layout commands write results directly to the element's LayoutBox structure.

Execution Flow

Typical Frame Lifecycle

// 1. Frame Start
CommandArena::Swap();  // Free previous frame's commands

// 2. Command Submission Phase
auto& submitter = APCCommandSubmitter::GetInstance();

for (auto* element : dirtyElements) {
    auto* cmd = CommandArena::GetInstance().Allocate<LayoutBlockCommand>(
        layoutEngine, element, availableSpace
    );
    submitter.Submit(cmd);
}

// 3. Execution Phase
submitter.Flush();      // Executes Immediate priority inline
submitter.WaitForAll(); // Blocks until all commands complete

// 4. Continue with rendering, etc.

Dependency Execution

Commands with dependencies are automatically deferred until dependencies complete:

auto* parentCmd = CommandArena::GetInstance().Allocate<IAPCCommand>();
auto* childCmd = CommandArena::GetInstance().Allocate<IAPCCommand>();

// Child depends on parent
childCmd->AddDependency(parentCmd);

auto& submitter = APCCommandSubmitter::GetInstance();
submitter.Submit(parentCmd);
submitter.Submit(childCmd);
submitter.Flush();  // Executes parent first, then child

The backend tracks m_pendingDependencies atomically and decrements via OnDependencyFinished() callbacks.
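
The counter mechanism can be illustrated with a small standalone sketch. DependencyNodeSketch is hypothetical and only borrows the names m_pendingDependencies, AddDependency(), and OnDependencyFinished() from the text; how the real classes store dependents and notify them is an assumption here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

class DependencyNodeSketch
{
public:
    // Registers 'pDependency' as something that must finish before this node.
    // Graph construction is assumed to happen on one thread.
    void AddDependency(DependencyNodeSketch* pDependency)
    {
        assert(pDependency != nullptr && pDependency != this);
        m_pendingDependencies.fetch_add(1, std::memory_order_relaxed);
        pDependency->m_dependents.push_back(this);
    }

    bool IsReady() const
    {
        return m_pendingDependencies.load(std::memory_order_acquire) == 0;
    }

    // Atomic decrement. Returns true only for the call that released the
    // last outstanding dependency, so exactly one caller schedules the node.
    bool OnDependencyFinished()
    {
        return m_pendingDependencies.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    // Called after this node has executed. Notifies dependents and returns
    // those that became ready as a result.
    std::vector<DependencyNodeSketch*> Finish()
    {
        std::vector<DependencyNodeSketch*> nowReady;
        for (DependencyNodeSketch* pDependent : m_dependents)
        {
            if (pDependent->OnDependencyFinished())
                nowReady.push_back(pDependent);
        }
        return nowReady;
    }

private:
    std::atomic<std::uint32_t> m_pendingDependencies{0};
    std::vector<DependencyNodeSketch*> m_dependents;
};
```

Using the return value of fetch_sub, rather than a separate load, avoids two finishing dependencies both concluding that they released the node.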

Integration Points

Layout Engine

File: Code/ApertureUISource/APHTML/Layout/LayoutEngine.cpp

Layout operations submit commands for tree traversal:

void LayoutEngine::PerformLayout(Element* pElement) {
    auto& submitter = APCCommandSubmitter::GetInstance();
    
    // Submit style computation (high priority)
    auto* styleCmd = CommandArena::GetInstance().Allocate<LayoutStyleCommand>(
        this, pElement, availableSpace
    );
    submitter.Submit(styleCmd);
    
    // Submit layout command (normal priority, depends on style)
    auto* layoutCmd = CommandArena::GetInstance().Allocate<LayoutBlockCommand>(
        this, pElement, availableSpace
    );
    layoutCmd->AddDependency(styleCmd);
    submitter.Submit(layoutCmd);
    
    submitter.Flush();
}

Scripting/V8 Integration

File: Code/ApertureUISource/APHTML/V8Engine/Core/JSExecCommand.h

JavaScript execution commands integrate with the command system for async operations and event handling.

Standard Command Executor

File: Code/ApertureUISource/APHTML/Backend/IAPCStandardCommandExecutor.h

Higher-level executor interface for view-based command queue management. Used by UIView to coordinate command queues across multiple subsystems.

Performance Considerations

Memory Efficiency

  • Frame Allocation: All commands allocated from CommandArena are bulk-freed at frame boundaries
  • Zero Fragmentation: Linear allocator eliminates heap fragmentation
  • Cache Locality: Sequential allocation places commands next to each other in memory

Submission Performance

  • Lock-Free Path: Ring buffer enables low-contention submission from any thread
  • Overflow Tracking: Monitor GetOverflowCount() to detect capacity issues
  • Capacity Tuning: kRingBufferCapacity = 1024 (must be power of two)

Execution Performance

  • Priority Scheduling: High-priority commands execute before low-priority
  • Immediate Execution: Bypass task system overhead for latency-sensitive work
  • Dependency Graph: Automatic topological sorting ensures correct order

Custom Backend Example

class MyParallelBackend : public IAPCCommandQueueBackend
{
public:
    void Submit(IAPCCommand* pCommand) override {
        m_commands.PushBack(pCommand);
    }

    void Flush() override {
        // Execute Immediate priority synchronously
        for (auto* cmd : m_commands) {
            if (cmd->GetPriority() == CommandPriority::Immediate) {
                cmd->Execute();
            }
        }

        // Submit others to nsTaskSystem in parallel
        for (auto* cmd : m_commands) {
            if (cmd->GetPriority() != CommandPriority::Immediate) {
                auto* task = CommandArena::GetInstance().Allocate<APCCommandTask>(cmd);
                nsTaskSystem::StartSingleTask(task, 
                    CommandPriorityToTaskPriority(cmd->GetPriority()));
                m_tasks.PushBack(nsTaskGroupID(task->GetTaskID()));
            }
        }
    }

    void WaitForAll() override {
        // Wait on each task group in turn
        for (const nsTaskGroupID& group : m_tasks) {
            nsTaskSystem::WaitForGroup(group);
        }
        m_tasks.Clear();
    }

    bool Cancel(IAPCCommand* pCommand) override {
        pCommand->Cancel();
        return m_commands.RemoveAndCopy(pCommand);
    }

    void Reset() override {
        m_commands.Clear();
        m_tasks.Clear();
    }

private:
    nsDynamicArray<IAPCCommand*> m_commands;
    nsDynamicArray<nsTaskGroupID> m_tasks;
};

// Register at initialization
APCCommandSubmitter::SetBackendFactory([]() {
    return nsUniquePtr<IAPCCommandQueueBackend>(NS_DEFAULT_NEW(MyParallelBackend));
});

Command Type Reference

CommandType Enum

enum class CommandType {
    Unknown,        // Unspecified
    CSS,            // Style computation
    Composition,    // Compositor operations
    Layout,         // Layout tree operations
    Rendering,      // Render data generation
    Scripting,      // JavaScript execution
    Presentation,   // Display/present operations
    Custom          // User-defined
};

Runtype Enum

enum class Runtype {
    AnyThread,                  // No thread affinity
    FreeThread_CSS,             // CSS subsystem thread
    FreeThread_Composition,     // Compositor thread
    FreeThread_Layout,          // Layout thread
    FreeThread_Rendering,       // Render thread
    FreeThread_Scripting,       // Script thread
    FreeThread_Presentation,    // Present thread
    FreeThread_Custom           // Custom thread pool
};

Note: Current implementation treats all runtypes as AnyThread. Future extensions may enforce thread affinity based on Runtype.

Thread Safety

Thread-Safe Operations

  • APCCommandSubmitter::Submit(): Lock-free ring buffer (or mutex-protected overflow)
  • APCCommandSubmitter::Flush(): Thread-safe
  • IAPCCommandQueue::RequestLock(): Mutex-protected queue operations
  • IAPCCommand::OnDependencyFinished(): Atomic decrement

NOT Thread-Safe

  • CommandArena::Allocate(): Must be called from allocation thread (typically main)
  • Individual command execution: Commands are responsible for their own thread safety

Best Practices

  1. Always Allocate from Arena: Use CommandArena::GetInstance().Allocate<T>() for automatic memory management
  2. Respect Dependencies: Use AddDependency() for parent-child relationships in layout
  3. Choose Correct Priority: Immediate for latency-critical, Normal for standard work
  4. Monitor Overflow: Check GetOverflowCount() periodically; increase ring buffer if needed
  5. Call Reset() at Frame Boundaries: Paired with CommandArena::Swap() after frame completion
  6. Avoid Long-Running Commands: Commands block the backend; split large work into smaller chunks
  7. Use Cancellation: Check HasBeenCanceled() in long-running commands for early exit
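
Practices 6 and 7 combine naturally: a command that works in chunks can poll its cancellation flag between chunks. The sketch below is a standalone illustration; ChunkedCommandSketch is hypothetical and only mirrors the Cancel() and HasBeenCanceled() names used in these notes.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>

class ChunkedCommandSketch
{
public:
    explicit ChunkedCommandSketch(std::size_t chunkCount)
        : m_chunkCount(chunkCount)
    {
    }

    // May be called from another thread while Execute() is running.
    void Cancel() { m_canceled.store(true, std::memory_order_release); }

    bool HasBeenCanceled() const { return m_canceled.load(std::memory_order_acquire); }

    // Runs processChunk(i) for each chunk and checks the flag between chunks,
    // so cancellation takes effect after at most one chunk of extra work.
    void Execute(const std::function<void(std::size_t)>& processChunk)
    {
        assert(static_cast<bool>(processChunk));
        for (std::size_t i = 0; i < m_chunkCount; ++i)
        {
            if (HasBeenCanceled())
                return; // early exit
            processChunk(i);
            ++m_processedChunks;
        }
    }

    std::size_t GetProcessedChunks() const { return m_processedChunks; }

private:
    std::size_t m_chunkCount;
    std::size_t m_processedChunks = 0;
    std::atomic<bool> m_canceled{false};
};
```

Smaller chunks make cancellation more responsive at the cost of more frequent flag checks.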

Debugging

Profiling

NS_PROFILE_SCOPE("APCCommandSubmitter.Flush");  // Already instrumented

Logging Command Execution

void MyCommand::Execute() {  // 'override' belongs on the in-class declaration only
    nsLog::Info("Executing {} command on element {}", 
        CommandTypeToString(GetCommandType()), 
        m_pElement->GetDebugName());
    
    // Command logic...
}

Dependency Debugging

const auto& deps = myCommand->GetDependencies();
nsLog::Debug("Command has {} dependencies", deps.GetCount());
for (auto* dep : deps) {
    nsLog::Debug("  Depends on: {}", 
        CommandTypeToString(dep->GetCommandType()));
}

Future Enhancements

Potential Improvements

  1. True Parallel Execution: Implement a backend that leverages nsTaskSystem for independent commands
  2. Thread Affinity: Enforce Runtype to dedicate threads for CSS, Layout, Scripting subsystems
  3. Command Recording: Capture command streams for replay/debugging
  4. GPU Command Integration: Extend pattern to graphics API command buffers
  5. Command Merging: Coalesce redundant commands before execution

Performance Optimization Opportunities

  • Increase ring buffer capacity for high-command-rate scenarios
  • SIMD-optimized dependency resolution
  • Lock-free overflow array using atomic linked list
  • Per-thread submission buffers to reduce contention

Related Documentation

  • Layout System: README_HTML.md - How layout commands integrate with DOM tree
  • Render Path: README_RenderPath.md - Command execution for render data generation
  • Task System: Foundation/Threading/Implementation/TaskSystemDeclarations.h - nsTaskSystem integration

Summary

The Command Executor System provides a robust, high-performance foundation for APHTML's multi-threaded architecture. Its frame-based allocation, lock-free submission, and pluggable backend design enable efficient layout, rendering, and scripting operations with minimal overhead and deterministic execution semantics.

Key Takeaways:

  • Fast Submission: Lock-free ring buffer for low-latency command enqueueing
  • Deterministic Execution: Default backend ensures parent-before-child processing for layout
  • Frame-Based Memory: Linear allocation and bulk deallocation via CommandArena, with no per-object free
  • Dependency Tracking: Automatic topological sorting for complex command graphs
  • Extensible: Custom backends for specialized threading models

Use APCCommandSubmitter::GetInstance().Submit() for all command submission and CommandArena::GetInstance().Allocate() for all command allocation to ensure optimal performance and correctness.

Copyright © 2026