Execution Management & State Management


2026-03-25

Overview

Two Functional Clusters jointly govern the lifecycle of all software on an AUTOSAR Adaptive machine:

Cluster	Short Name	Responsibility
Execution Management	EM	Start, monitor, and terminate individual Adaptive Applications (POSIX processes)
State Management	SM	Orchestrate system-level state transitions that determine which applications are running

They work in a hierarchical relationship: SM decides what should run → EM makes it happen.

Execution Management (EM) in Depth

Core Responsibility

EM is the first AUTOSAR Adaptive service to start after the OS boots. It reads the Execution Manifest of every deployed Adaptive Application and:

Determines which processes to launch during startup
Forks and executes each process (fork() + exec())
Monitors process health
Applies recovery actions on process failure
Terminates processes gracefully during shutdown
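
The fork() + exec() step in the list above can be sketched with plain POSIX calls. This is a minimal illustration, not how any particular EM implementation is written; the helper names SpawnProcess and WaitForExit are hypothetical, and a real EM would also apply scheduling, credentials, and resource limits in the child before exec.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>
#include <vector>

// Hypothetical helper: starts `path` with `args` as a child process and
// returns its PID, or -1 if fork() failed.
pid_t SpawnProcess(const std::string& path, const std::vector<std::string>& args) {
    // Build the NULL-terminated argv before forking; argv[0] is the program
    // path by convention.
    std::vector<char*> argv;
    argv.push_back(const_cast<char*>(path.c_str()));
    for (const auto& arg : args) {
        argv.push_back(const_cast<char*>(arg.c_str()));
    }
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid == 0) {
        // Child: replace the process image. execv() only returns on failure.
        execv(path.c_str(), argv.data());
        _exit(127);  // conventional status for "could not execute"
    }
    return pid;  // Parent: the child's PID, or -1 if fork() failed
}

// Hypothetical helper: blocks until `pid` terminates. Returns its exit code,
// or -1 if it was killed by a signal or waitpid() failed.
int WaitForExit(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling SpawnProcess("/bin/sh", {"-c", "exit 3"}) followed by WaitForExit() on the returned PID yields the child's exit code, which is the information EM needs to tell a clean exit from a failure.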

Process Model

Each Adaptive Application (AA) runs as a separate POSIX process. This is a fundamental architectural principle:

Machine
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Execution Management Process (PID 2, elevated privilege)
    │
    ├── AA: RadarProcessing   (PID 101, /opt/radar/bin/radar_proc)
    ├── AA: PathPlanning      (PID 102, /opt/path/bin/path_plan)
    ├── AA: DiagnosticsApp    (PID 103, /opt/diag/bin/diag_app)
    └── AA: OTAManager        (PID 104, /opt/ota/bin/ota_mgr)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  AUTOSAR Adaptive Platform (Middleware)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  POSIX OS (Linux / QNX)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Process isolation means:

Memory faults in one AA cannot corrupt another AA's memory
EM can kill and restart a misbehaving AA without affecting the platform
Each AA runs with minimum required OS permissions (least privilege principle)

Application Process Lifecycle State Machine

               ┌──────────────┐
    Start()    │              │
  ─────────────► Initializing │
               │              │
               └──────┬───────┘
                      │ ReportApplicationState(kRunning)
                      ▼
               ┌─────────────┐
               │             │
               │   Running   │◄─────────────────────────┐
               │             │                           │
               └──────┬──────┘                   Restart (if recovery
                      │                                policy = restart)
          SIGTERM      │  ReportApplicationState(kTerminating)
          received     │  or EM requests shutdown
                      ▼
               ┌─────────────┐
               │             │
               │ Terminating │
               │             │
               └──────┬──────┘
                      │ process exits (exit(0))
                      ▼
               ┌─────────────┐
               │  Terminated │  EM records exit, decides next action
               └─────────────┘
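
The lifecycle in the diagram can be expressed as a small transition function, which is roughly the bookkeeping EM has to do per process. The sketch below is illustrative only: ProcessState, ProcessEvent, and ApplyEvent are hypothetical names, an extra kIdle state stands for "not yet started", and a restart is modelled as re-entering Initializing on the assumption that a restarted process must report kRunning again.

```cpp
#include <cstdint>

enum class ProcessState : std::uint8_t {
    kIdle, kInitializing, kRunning, kTerminating, kTerminated
};

enum class ProcessEvent : std::uint8_t {
    kStart,             // EM forks and executes the process
    kReportedRunning,   // application reported kRunning
    kTerminationBegan,  // SIGTERM delivered or kTerminating reported
    kExited,            // process exited (cleanly or by crashing)
    kRestart            // recovery policy asks for a restart
};

// Applies `event` to `state`. Returns false and leaves `state` untouched
// if the event is not allowed in the current state.
bool ApplyEvent(ProcessState& state, ProcessEvent event) {
    switch (event) {
        case ProcessEvent::kStart:
            if (state != ProcessState::kIdle) return false;
            state = ProcessState::kInitializing;
            return true;
        case ProcessEvent::kReportedRunning:
            if (state != ProcessState::kInitializing) return false;
            state = ProcessState::kRunning;
            return true;
        case ProcessEvent::kTerminationBegan:
            if (state != ProcessState::kInitializing &&
                state != ProcessState::kRunning) return false;
            state = ProcessState::kTerminating;
            return true;
        case ProcessEvent::kExited:
            // A process may exit from any live state, e.g. by crashing.
            if (state == ProcessState::kIdle ||
                state == ProcessState::kTerminated) return false;
            state = ProcessState::kTerminated;
            return true;
        case ProcessEvent::kRestart:
            if (state != ProcessState::kTerminated) return false;
            state = ProcessState::kInitializing;
            return true;
    }
    return false;
}
```

Rejecting invalid events instead of silently accepting them makes protocol violations, such as a kRunning report from a process that was never started, visible to the caller.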

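Application Responsibility: Applications must call ara::exec::ApplicationClient::ReportApplicationState() to inform EM of their state transitions, and EM waits for each report within a configured timeout. These names follow early Adaptive Platform releases; later releases renamed the interface to ara::exec::ExecutionClient::ReportExecutionState(), so check the release your stack targets.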

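#include <ara/exec/application_client.h>

#include <atomic>
#include <csignal>

// Application-specific hooks, defined elsewhere in the application.
void InitializeHardware();
void LoadCalibrationData();
void ConnectToServices();
void DoWork();
void CleanupHardware();
void FlushLogs();

namespace {
// Set from the SIGTERM handler; a lock-free atomic is safe to write there.
std::atomic<bool> shutdown_requested{false};

void OnSigterm(int) { shutdown_requested.store(true); }
}  // namespace

int main() {
    // EM requests termination with SIGTERM, so install the handler first
    std::signal(SIGTERM, OnSigterm);

    ara::exec::ApplicationClient app_client;

    // --- Initialization Phase ---
    // Set up threads, connect to services, load config
    InitializeHardware();
    LoadCalibrationData();
    ConnectToServices();

    // Signal to EM that we're ready
    app_client.ReportApplicationState(ara::exec::ApplicationState::kRunning);

    // --- Running Phase ---
    // Loop until the SIGTERM handler sets the flag
    while (!shutdown_requested.load()) {
        DoWork();
    }

    // --- Termination Phase ---
    app_client.ReportApplicationState(ara::exec::ApplicationState::kTerminating);

    // Clean up resources; must finish before the shutdown timeout expires
    // and EM sends SIGKILL
    CleanupHardware();
    FlushLogs();

    return 0;  // EM observes clean exit
}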

EM Timeout Handling

EM enforces timeouts on each state transition:

  EM action                    What happens on timeout
  ─────────────────────────────────────────────────────────────────────
  Wait for kRunning report     EM marks AA as failed; applies recovery
  Wait for kTerminating report After SIGTERM + timeout → sends SIGKILL
  Wait for process exit        After SIGKILL + timeout → marks as failed
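
The SIGTERM-then-SIGKILL escalation in the table can be sketched with kill() and a non-blocking waitpid() loop. This is an assumption-laden illustration: StopProcess, WaitWithTimeout, and StopResult are hypothetical names, the polling interval is arbitrary, and a production EM would more likely react to SIGCHLD or use a pidfd than poll.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <chrono>
#include <thread>

enum class StopResult { kExitedAfterSigterm, kKilled, kFailed };

// Polls until child `pid` has been reaped or `timeout` has elapsed.
// Returns true only if the child terminated and was reaped.
bool WaitWithTimeout(pid_t pid, std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        int status = 0;
        const pid_t reaped = waitpid(pid, &status, WNOHANG);
        if (reaped == pid) return true;   // child is gone
        if (reaped < 0) return false;     // not our child, or other error
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

// Asks the process to stop with SIGTERM, then escalates to SIGKILL if it
// is still alive after `shutdown_timeout`.
StopResult StopProcess(pid_t pid,
                       std::chrono::milliseconds shutdown_timeout,
                       std::chrono::milliseconds kill_timeout) {
    if (kill(pid, SIGTERM) != 0) return StopResult::kFailed;
    if (WaitWithTimeout(pid, shutdown_timeout)) {
        return StopResult::kExitedAfterSigterm;
    }
    // The process ignored or did not finish handling SIGTERM in time.
    if (kill(pid, SIGKILL) != 0) return StopResult::kFailed;
    if (WaitWithTimeout(pid, kill_timeout)) return StopResult::kKilled;
    return StopResult::kFailed;  // still not reaped: mark as failed
}
```

With the manifest values used in this section, the first timeout would be the shutdown timeout of 3000 ms; the second timeout has no counterpart in the sample manifest and is an extra parameter of this sketch.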

Timeouts are configured in the Execution Manifest:

<PROCESS>
  <SHORT-NAME>RadarProcessingMain</SHORT-NAME>
  <STARTUP-PROCESS-TIMEOUT>5000</STARTUP-PROCESS-TIMEOUT>  <!-- ms -->
  <SHUTDOWN-PROCESS-TIMEOUT>3000</SHUTDOWN-PROCESS-TIMEOUT>
</PROCESS>

Execution Manifest — Full Structure

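The Execution Manifest is an ARXML file that EM reads to configure each process. The example below is simplified for readability, and its element names do not follow the ARXML schema exactly: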

<ADAPTIVE-APPLICATION>
  <SHORT-NAME>RadarProcessingApp</SHORT-NAME>
  <CATEGORY>APPLICATION</CATEGORY>
  
  <PROCESS>
    <SHORT-NAME>RadarProcessingMain</SHORT-NAME>
    
    <!-- Binary to execute -->
    <EXECUTABLE>
      <SHORT-NAME>radar_proc</SHORT-NAME>
      <CODE-DESCRIPTOR>/opt/radar/bin/radar_proc</CODE-DESCRIPTOR>
    </EXECUTABLE>
    
    <!-- Command-line arguments -->
    <START-UP-OPTION>
      <SHORT-NAME>ConfigFile</SHORT-NAME>
      <OPTION-VALUE>--config=/etc/radar/radar.json</OPTION-VALUE>
    </START-UP-OPTION>
    
    <!-- Environment variables -->
    <ENV-VAR>
      <SHORT-NAME>LOG_LEVEL</SHORT-NAME>
      <VALUE>DEBUG</VALUE>
    </ENV-VAR>
    
    <!-- Scheduling configuration -->
    <SCHEDULING-POLICY>SCHED_FIFO</SCHEDULING-POLICY>
    <SCHEDULING-PRIORITY>60</SCHEDULING-PRIORITY>  <!-- 1=lowest, 99=highest for SCHED_FIFO on Linux -->
    <CPU-AFFINITY>
      <CPU-CORE-ID>2</CPU-CORE-ID>
      <CPU-CORE-ID>3</CPU-CORE-ID>
    </CPU-AFFINITY>
    
    <!-- Resource limits -->
    <RESOURCE-GROUP>
      <SHORT-NAME>RadarResourceGroup</SHORT-NAME>
      <CPU-BUDGET>30</CPU-BUDGET>          <!-- 30% CPU budget -->
      <MEMORY-LIMIT>512000000</MEMORY-LIMIT>  <!-- 512 MB -->
      <MAX-FD-LIMIT>256</MAX-FD-LIMIT>
    </RESOURCE-GROUP>
    
    <!-- Startup and shutdown timeouts (ms) -->
    <STARTUP-PROCESS-TIMEOUT>5000</STARTUP-PROCESS-TIMEOUT>
    <SHUTDOWN-PROCESS-TIMEOUT>3000</SHUTDOWN-PROCESS-TIMEOUT>
    
    <!-- Function Group association -->
    <FUNCTION-GROUP-REF>/FunctionGroups/MachineFG</FUNCTION-GROUP-REF>
    <FUNCTION-GROUP-STATE-REF>/FunctionGroups/MachineFG/Running</FUNCTION-GROUP-STATE-REF>
  </PROCESS>
</ADAPTIVE-APPLICATION>
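
To show what EM might do with the scheduling entries of such a manifest, here is a minimal Linux-specific sketch using sched_setaffinity() and sched_setscheduler(). The helper names MakeCpuSet, IsValidPriority, and ApplyScheduling are hypothetical, and the real calls succeed only when the caller holds the necessary privileges (typically CAP_SYS_NICE for SCHED_FIFO).

```cpp
#include <sched.h>
#include <sys/types.h>

#include <vector>

// Builds a CPU mask from core IDs such as the CPU-CORE-ID values above.
// IDs outside the range supported by cpu_set_t are ignored.
cpu_set_t MakeCpuSet(const std::vector<int>& core_ids) {
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int id : core_ids) {
        if (id >= 0 && id < CPU_SETSIZE) {
            CPU_SET(id, &set);
        }
    }
    return set;
}

// Checks `priority` against the range the OS reports for `policy`.
bool IsValidPriority(int policy, int priority) {
    return priority >= sched_get_priority_min(policy) &&
           priority <= sched_get_priority_max(policy);
}

// Applies affinity, policy, and priority to `pid` (0 = calling process).
// Returns false if validation or either system call fails, for example
// when the caller lacks CAP_SYS_NICE or a listed core does not exist.
bool ApplyScheduling(pid_t pid, int policy, int priority,
                     const std::vector<int>& core_ids) {
    if (!IsValidPriority(policy, priority)) return false;

    const cpu_set_t cpus = MakeCpuSet(core_ids);
    if (sched_setaffinity(pid, sizeof(cpus), &cpus) != 0) return false;

    sched_param param{};
    param.sched_priority = priority;
    return sched_setscheduler(pid, policy, &param) == 0;
}
```

For the manifest above, the call would be ApplyScheduling(child_pid, SCHED_FIFO, 60, {2, 3}), made in the child between fork() and exec() or by the parent on the child's PID. Validating the priority first gives a clear failure for out-of-range values before any state is changed.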

Recovery Actions

When a process fails (unexpected termination, crash, or timeout), EM applies a configured recovery action:

Recovery Policy         Behavior
──────────────────────────────────────────────────────────────────────────
NO_RESTART              Process is not restarted; EM notifies PHM and SM
RESTART_PROCESS         EM immediately restarts the process; limited retries
RESTART_WITH_BACKOFF    Restart with increasing delay (1s, 2s, 4s, 8s...)
ESCALATE_TO_SM          EM notifies SM, which may trigger a state change
              
The recovery policy is configured in the Execution Manifest:

<RECOVERY-ACTION>
  <RECOVERY-POLICY>RESTART_WITH_BACKOFF</RECOVERY-POLICY>
  <MAX-RESTART-ATTEMPTS>3</MAX-RESTART-ATTEMPTS>
  <RESTART-BACKOFF-MS>1000</RESTART-BACKOFF-MS>
</RECOVERY-ACTION>

After exceeding max restart attempts, EM escalates to SM or enters a machine-level error state.
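
The RESTART_WITH_BACKOFF policy amounts to a doubling delay with an attempt limit. The sketch below shows that arithmetic under stated assumptions: RestartPolicy, RestartDecision, and DecideRestart are hypothetical names, attempts are counted from 1, and the doubling matches the 1s, 2s, 4s sequence listed above.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

struct RestartPolicy {
    std::uint32_t max_attempts;              // MAX-RESTART-ATTEMPTS
    std::chrono::milliseconds base_backoff;  // RESTART-BACKOFF-MS
};

struct RestartDecision {
    bool restart;                     // false: stop retrying and escalate
    std::chrono::milliseconds delay;  // wait this long before restarting
};

// Decides what to do before restart attempt number `attempt` (1-based).
RestartDecision DecideRestart(const RestartPolicy& policy, std::uint32_t attempt) {
    if (attempt == 0 || attempt > policy.max_attempts) {
        return {false, std::chrono::milliseconds(0)};
    }
    // Delay doubles per attempt: base, 2*base, 4*base, ...
    // The shift is capped so the multiplier cannot overflow.
    const std::uint32_t shift = std::min<std::uint32_t>(attempt - 1, 20u);
    const auto factor =
        static_cast<std::chrono::milliseconds::rep>(std::int64_t{1} << shift);
    return {true, policy.base_backoff * factor};
}
```

With the manifest values above (3 attempts, 1000 ms base), attempts 1 to 3 wait 1000, 2000, and 4000 ms, and a fourth failure returns restart == false, which is the point at which EM would escalate.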

State Management (SM) in Depth

Role of SM

SM is the system-level orchestrator. It manages the high-level operational state of the entire Adaptive machine. SM:

Defines the Machine State (the overarching system mode)
Manages Function Groups (logical groups of related applications)
Requests EM to start/stop specific process groups based on state transitions
Handles recovery escalation from EM/PHM

Machine State

The Machine State represents the top-level operating mode of the vehicle ECU:

                          ┌─────────────┐
                          │   Startup   │  Entered immediately at OS boot
                          └──────┬──────┘
                                 │ (platform services initialized)
                                 ▼
                 ┌───────────────────────────────┐
                 │          Driving Mode         │  Normal vehicle operation
                 │  (Function Groups: All apps)  │
                 └─────────────┬─────────────────┘
              ┌────────────────┤
              │ OTA trigger    │ remote diagnostics trigger
              ▼                ▼
    ┌──────────────┐   ┌──────────────────────────────┐
    │ Update Mode  │   │    Diagnostic Mode           │
    │ (Only UCM,   │   │  (Full stack + Diag server)  │
    │  OTA apps)   │   │                              │
    └──────┬───────┘   └──────────────────────────────┘
           │ (update complete)
           ▼
    ┌──────────────┐
    │  Restart     │   Reboots the machine; returns to Startup on next boot
    └──────────────┘

    ┌──────────────┐
    │  Shutdown    │   Graceful system shutdown; EM terminates all processes
    └──────────────┘

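Typical Machine States. The AUTOSAR specification standardizes only Startup, Shutdown, and Restart (plus Off for the machine Function Group); the remaining states are project-specific examples: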

State	Description
`Startup`	Initial state; platform services start; no user-facing functionality
`Driving`	Full vehicle operation; all AAs active
`Parking`	Reduced set of AAs; lower power; park assist features active
`OTA_Update`	Only UCM and Update-related AAs active
`Vehicle_Service`	Workshop diagnostic mode
`Shutdown`	Graceful stop of all processes → machine powers down
`Restart`	Graceful stop → reboot
`Error`	Unrecoverable error; limited function safe state

Function Groups and Function Group States

A Function Group is a logical grouping of related Adaptive Applications that are controlled as a unit. Every Adaptive Application belongs to at least one Function Group.

Machine
├── Function Group: MachineFG           (mandatory — represents the machine itself)
│   ├── State: Off          → No processes running
│   ├── State: Startup      → Platform services only
│   └── State: Running      → All normal-mode processes
│
├── Function Group: ADASFunctionGroup
│   ├── State: Off          → No ADAS processes
│   ├── State: Passive      → Monitoring only, no actuation
│   └── State: Active       → Full ADAS pipeline running
│
├── Function Group: OTAFunctionGroup
│   ├── State: Off          → UCM idle
│   └── State: Updating     → UCM + download agent active
│
└── Function Group: DiagFunctionGroup
    ├── State: Off          → Diag server stopped
    └── State: Active       → UDS Diagnostic server active

The mapping of processes to Function Group States is expressed in ARXML:

<FUNCTION-GROUP>
  <SHORT-NAME>ADASFunctionGroup</SHORT-NAME>
  
  <FUNCTION-GROUP-STATE>
    <SHORT-NAME>Off</SHORT-NAME>
    <!-- No processes in Off state -->
  </FUNCTION-GROUP-STATE>
  
  <FUNCTION-GROUP-STATE>
    <SHORT-NAME>Passive</SHORT-NAME>
    <PROCESS-IN-MACHINE-STATE-IREF>
      <BASE-REF>/Apps/SensorFusion</BASE-REF>
    </PROCESS-IN-MACHINE-STATE-IREF>
  </FUNCTION-GROUP-STATE>
  
  <FUNCTION-GROUP-STATE>
    <SHORT-NAME>Active</SHORT-NAME>
    <PROCESS-IN-MACHINE-STATE-IREF>
      <BASE-REF>/Apps/SensorFusion</BASE-REF>
    </PROCESS-IN-MACHINE-STATE-IREF>
    <PROCESS-IN-MACHINE-STATE-IREF>
      <BASE-REF>/Apps/PathPlanning</BASE-REF>
    </PROCESS-IN-MACHINE-STATE-IREF>
    <PROCESS-IN-MACHINE-STATE-IREF>
      <BASE-REF>/Apps/ActuatorControl</BASE-REF>
    </PROCESS-IN-MACHINE-STATE-IREF>
  </FUNCTION-GROUP-STATE>
</FUNCTION-GROUP>
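
Given such a mapping, a state change boils down to comparing the process sets of the old and new state. The following sketch computes which processes to stop and which to start; the types and the PlanTransition function are hypothetical, and it assumes a process present in both states simply keeps running, which ignores details such as differing startup configurations.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>
#include <string>

using ProcessSet = std::set<std::string>;
using StateTable = std::map<std::string, ProcessSet>;  // state name -> processes

struct TransitionPlan {
    ProcessSet to_stop;   // running in `from` but absent from `to`
    ProcessSet to_start;  // required by `to` but not running in `from`
};

// Computes the plan for moving a Function Group from state `from` to `to`.
// Throws std::out_of_range if either state is unknown.
TransitionPlan PlanTransition(const StateTable& states,
                              const std::string& from,
                              const std::string& to) {
    const ProcessSet& current = states.at(from);
    const ProcessSet& target = states.at(to);

    TransitionPlan plan;
    std::set_difference(current.begin(), current.end(),
                        target.begin(), target.end(),
                        std::inserter(plan.to_stop, plan.to_stop.end()));
    std::set_difference(target.begin(), target.end(),
                        current.begin(), current.end(),
                        std::inserter(plan.to_start, plan.to_start.end()));
    return plan;
}

// The ADASFunctionGroup mapping from the ARXML above, as a lookup table.
StateTable AdasStates() {
    return {
        {"Off", {}},
        {"Passive", {"SensorFusion"}},
        {"Active", {"SensorFusion", "PathPlanning", "ActuatorControl"}},
    };
}
```

Moving from Passive to Active, for instance, leaves SensorFusion untouched and starts only PathPlanning and ActuatorControl.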

ara::sm Client API

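Applications interact with SM via ara::sm. The StateClient class and method names below are an illustrative wrapper rather than a standardized interface: the AUTOSAR specification exposes SM through ara::com service interfaces (for example TriggerIn and TriggerOut), while SM itself uses ara::exec::StateClient to request Function Group state changes from EM: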

#include <ara/sm/state_client.h>

// Request a state change (application can request SM to change state)
ara::sm::StateClient state_client;

// Request that ADASFunctionGroup transitions to Active
auto future = state_client.RequestStateTransition(
    "ADASFunctionGroup",
    "Active"
);

auto result = future.get();
if (result.HasValue()) {
    logger.LogInfo() << "ADAS state transition accepted";
} else {
    logger.LogError() << "State transition rejected: " << result.Error().Message();
}

// Subscribe to state change notifications
state_client.SubscribeToStateChange(
    "ADASFunctionGroup",
    [](const std::string& fg_name, const std::string& new_state) {
        logger.LogInfo() << fg_name << " is now in state: " << new_state;
    }
);

SM ↔ EM ↔ PHM Interaction

  Application crash
       │
       ▼
  EM detects process died unexpectedly
       │
       ├───► Attempt restart (if restart policy configured)
       │         │
       │         └─► Max retries exceeded → Escalate to SM
       │
       └───► Notify PHM of supervised entity failure
                   │
                   ▼
             PHM evaluates recovery action:
             ─ RecoveryToDefaultState → SM transitions FunctionGroup to Off → SM transitions to Error machine state
             ─ ResetMachine → SM triggers Restart
             ─ NotifyApplication → SM notifies a designated watchdog AA

EM Privilege Model

EM runs with elevated privileges to:

Set SCHED_FIFO priorities (requires CAP_SYS_NICE)
Set CPU affinity (CAP_SYS_NICE)
Create cgroups for resource isolation (cgroup filesystem access)
Start processes with specific UIDs/GIDs from the Execution Manifest

All AAs run with reduced privileges (non-root, with only the capabilities they require). This is enforced via:

Linux capabilities (fine-grained privilege control)
Dedicated UIDs/GIDs per process, optionally combined with user namespaces
cgroup v2 for CPU and memory budget enforcement
seccomp filters (optional, restrict available syscalls per process)

Execution Manifest vs Application Manifest vs Service Instance Manifest

Manifest	Owner	Contains
Application Manifest	AA developer	Software component topology, version, categories
Execution Manifest	System integrator	Process→FunctionGroup binding, scheduling, resources
Service Instance Manifest	System integrator	Transport binding config (SOME/IP service IDs, DDS topic mapping)
