Execution Management & State Management
About 1421 wordsAbout 5 min
2026-03-25
Overview
Two Functional Clusters jointly govern the lifecycle of all software on an AUTOSAR Adaptive machine:
| Cluster | Short Name | Responsibility |
|---|---|---|
| Execution Management | EM | Start, monitor, and terminate individual Adaptive Applications (POSIX processes) |
| State Management | SM | Orchestrate system-level state transitions that determine which applications are running |
They work in a hierarchical relationship: SM decides what should run → EM makes it happen.
Execution Management (EM) in Depth
Core Responsibility
EM is the first AUTOSAR Adaptive service to start after the OS boots. It reads the Execution Manifest of every deployed Adaptive Application and:
- Determines which processes to launch during startup
- Forks and executes each process (
fork()+exec()) - Monitors process health
- Applies recovery actions on process failure
- Terminates processes gracefully during shutdown
Process Model
Each Adaptive Application (AA) runs as a separate POSIX process. This is a fundamental architectural principle:
Machine
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Execution Management Process (PID 2, elevated privilege)
│
├── AA: RadarProcessing (PID 101, /opt/radar/bin/radar_proc)
├── AA: PathPlanning (PID 102, /opt/path/bin/path_plan)
├── AA: DiagnosticsApp (PID 103, /opt/diag/bin/diag_app)
└── AA: OTAManager (PID 104, /opt/ota/bin/ota_mgr)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
AUTOSAR Adaptive Platform (Middleware)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
POSIX OS (Linux / QNX)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━Process isolation means:
- Memory faults in one AA cannot corrupt another AA's memory
- EM can kill and restart a misbehaving AA without affecting the platform
- Each AA runs with minimum required OS permissions (least privilege principle)
Application Process Lifecycle State Machine
┌─────────────┐
Start() │ │
─────────────► Initializing│
│ │
└──────┬──────┘
│ ReportApplicationState(kRunning)
▼
┌─────────────┐
│ │
│ Running │◄─────────────────────────┐
│ │ │
└──────┬──────┘ Restart (if recovery
│ policy = restart)
SIGTERM │ ReportApplicationState(kTerminating)
received │ or EM requests shutdown
▼
┌─────────────┐
│ │
│ Terminating │
│ │
└──────┬──────┘
│ process exits (exit(0))
▼
┌─────────────┐
│ Terminated │ EM records exit, decides next action
└─────────────┘Application Responsibility: Applications must call ara::exec::ApplicationClient::ReportApplicationState() to inform EM of their state transitions. EM waits for each state report within a configured timeout.
#include <ara/exec/application_client.h>
int main() {
ara::exec::ApplicationClient app_client;
// --- Initialization Phase ---
// Set up threads, connect to services, load config
InitializeHardware();
LoadCalibrationData();
ConnectToServices();
// Signal to EM that we're ready
app_client.ReportApplicationState(ara::exec::ApplicationState::kRunning);
// --- Running Phase ---
while (!shutdown_requested) {
DoWork();
// Check for shutdown signal
if (ShouldShutdown()) {
break;
}
}
// --- Termination Phase ---
app_client.ReportApplicationState(ara::exec::ApplicationState::kTerminating);
// Clean up resources — must complete before OS terminates the process
CleanupHardware();
FlushLogs();
return 0; // EM observes clean exit
}EM Timeout Handling
EM enforces timeouts on each state transition:
EM action What happens on timeout
─────────────────────────────────────────────────────────────────────
Wait for kRunning report EM marks AA as failed; applies recovery
Wait for kTerminating report After SIGTERM + timeout → sends SIGKILL
Wait for process exit After SIGKILL + timeout → marks as failedTimeouts are configured in the Execution Manifest:
<PROCESS>
<SHORT-NAME>RadarProcessingMain</SHORT-NAME>
<STARTUP-PROCESS-TIMEOUT>5000</STARTUP-PROCESS-TIMEOUT> <!-- ms -->
<SHUTDOWN-PROCESS-TIMEOUT>3000</SHUTDOWN-PROCESS-TIMEOUT>
</PROCESS>Execution Manifest — Full Structure
The Execution Manifest is an ARXML file that EM reads to configure each process:
<ADAPTIVE-APPLICATION>
<SHORT-NAME>RadarProcessingApp</SHORT-NAME>
<CATEGORY>APPLICATION</CATEGORY>
<PROCESS>
<SHORT-NAME>RadarProcessingMain</SHORT-NAME>
<!-- Binary to execute -->
<EXECUTABLE>
<SHORT-NAME>radar_proc</SHORT-NAME>
<CODE-DESCRIPTOR>/opt/radar/bin/radar_proc</CODE-DESCRIPTOR>
</EXECUTABLE>
<!-- Command-line arguments -->
<START-UP-OPTION>
<SHORT-NAME>ConfigFile</SHORT-NAME>
<OPTION-VALUE>--config=/etc/radar/radar.json</OPTION-VALUE>
</START-UP-OPTION>
<!-- Environment variables -->
<ENV-VAR>
<SHORT-NAME>LOG_LEVEL</SHORT-NAME>
<VALUE>DEBUG</VALUE>
</ENV-VAR>
<!-- Scheduling configuration -->
<SCHEDULING-POLICY>SCHED_FIFO</SCHEDULING-POLICY>
<SCHEDULING-PRIORITY>60</SCHEDULING-PRIORITY> <!-- 0=lowest, 99=highest for FIFO -->
<CPU-AFFINITY>
<CPU-CORE-ID>2</CPU-CORE-ID>
<CPU-CORE-ID>3</CPU-CORE-ID>
</CPU-AFFINITY>
<!-- Resource limits -->
<RESOURCE-GROUP>
<SHORT-NAME>RadarResourceGroup</SHORT-NAME>
<CPU-BUDGET>30</CPU-BUDGET> <!-- 30% CPU budget -->
<MEMORY-LIMIT>512000000</MEMORY-LIMIT> <!-- 512 MB -->
<MAX-FD-LIMIT>256</MAX-FD-LIMIT>
</RESOURCE-GROUP>
<!-- Startup timeout -->
<STARTUP-PROCESS-TIMEOUT>5000</STARTUP-PROCESS-TIMEOUT>
<SHUTDOWN-PROCESS-TIMEOUT>3000</SHUTDOWN-PROCESS-TIMEOUT>
<!-- Function Group association -->
<FUNCTION-GROUP-REF>/FunctionGroups/MachineFG</FUNCTION-GROUP-REF>
<FUNCTION-GROUP-STATE-REF>/FunctionGroups/MachineFG/Running</FUNCTION-GROUP-STATE-REF>
</PROCESS>
</ADAPTIVE-APPLICATION>Recovery Actions
When a process fails (unexpected termination, crash, or timeout), EM applies a configured recovery action:
Recovery Policy Behavior
──────────────────────────────────────────────────────────────────────────
NO_RESTART Process is not restarted; EM notifies PHM and SM
RESTART_PROCESS EM immediately restarts the process; limited retries
RESTART_WITH_BACKOFF Restart with increasing delay (1s, 2s, 4s, 8s...)
ESCALATE_TO_SM EM notifies SM, which may trigger a state change
Configured in Execution Manifest:
<RECOVERY-ACTION>
<RECOVERY-POLICY>RESTART_WITH_BACKOFF</RECOVERY-POLICY>
<MAX-RESTART-ATTEMPTS>3</MAX-RESTART-ATTEMPTS>
<RESTART-BACKOFF-MS>1000</RESTART-BACKOFF-MS>
</RECOVERY-ACTION>After exceeding max restart attempts, EM escalates to SM or enters a machine-level error state.
State Management (SM) in Depth
Role of SM
SM is the system-level orchestrator. It manages the high-level operational state of the entire Adaptive machine. SM:
- Defines the Machine State (the overarching system mode)
- Manages Function Groups (logical groups of related applications)
- Requests EM to start/stop specific process groups based on state transitions
- Handles recovery escalation from EM/PHM
Machine State
The Machine State represents the top-level operating mode of the vehicle ECU:
┌─────────────┐
│ Startup │ Entered immediately at OS boot
└──────┬──────┘
│ (platform services initialized)
▼
┌───────────────────────────────┐
│ Driving Mode │ Normal vehicle operation
│ (Function Groups: All apps) │
└─────────────┬─────────────────┘
┌────────────────┤
│ OTA trigger │ remote diagnostics trigger
▼ ▼
┌──────────────┐ ┌──────────────────────────────┐
│ Update Mode │ │ Diagnostic Mode │
│ (Only UCM, │ │ (Full stack + Diag server) │
│ OTA apps) │ │ │
└──────┬───────┘ └──────────────────────────────┘
│ (update complete)
▼
┌──────────────┐
│ Restart │ Reboots the machine; returns to Startup on next boot
└──────────────┘
┌──────────────┐
│ Shutdown │ Graceful system shutdown; EM terminates all processes
└──────────────┘SM standard states (from AUTOSAR specification):
| State | Description |
|---|---|
Startup | Initial state; platform services start; no user-facing functionality |
Driving | Full vehicle operation; all AAs active |
Parking | Reduced set of AAs; lower power; park assist features active |
OTA_Update | Only UCM and Update-related AAs active |
Vehicle_Service | Workshop diagnostic mode |
Shutdown | Graceful stop of all processes → machine powers down |
Restart | Graceful stop → reboot |
Error | Unrecoverable error; limited function safe state |
Function Groups and Function Group States
A Function Group is a logical grouping of related Adaptive Applications that are controlled as a unit. Every Adaptive Application belongs to at least one Function Group.
Machine
├── Function Group: MachineFG (mandatory — represents the machine itself)
│ ├── State: Off → No processes running
│ ├── State: Startup → Platform services only
│ └── State: Running → All normal-mode processes
│
├── Function Group: ADASFunctionGroup
│ ├── State: Off → No ADAS processes
│ ├── State: Passive → Monitoring only, no actuation
│ └── State: Active → Full ADAS pipeline running
│
├── Function Group: OTAFunctionGroup
│ ├── State: Off → UCM idle
│ └── State: Updating → UCM + download agent active
│
└── Function Group: DiagFunctionGroup
├── State: Off → Diag server stopped
└── State: Active → UDS Diagnostic server activeState transitions are expressed in ARXML:
<FUNCTION-GROUP>
<SHORT-NAME>ADASFunctionGroup</SHORT-NAME>
<FUNCTION-GROUP-STATE>
<SHORT-NAME>Off</SHORT-NAME>
<!-- No processes in Off state -->
</FUNCTION-GROUP-STATE>
<FUNCTION-GROUP-STATE>
<SHORT-NAME>Passive</SHORT-NAME>
<PROCESS-IN-MACHINE-STATE-IREF>
<BASE-REF>/Apps/SensorFusion</BASE-REF>
</PROCESS-IN-MACHINE-STATE-IREF>
</FUNCTION-GROUP-STATE>
<FUNCTION-GROUP-STATE>
<SHORT-NAME>Active</SHORT-NAME>
<PROCESS-IN-MACHINE-STATE-IREF>
<BASE-REF>/Apps/SensorFusion</BASE-REF>
</PROCESS-IN-MACHINE-STATE-IREF>
<PROCESS-IN-MACHINE-STATE-IREF>
<BASE-REF>/Apps/PathPlanning</BASE-REF>
</PROCESS-IN-MACHINE-STATE-IREF>
<PROCESS-IN-MACHINE-STATE-IREF>
<BASE-REF>/Apps/ActuatorControl</BASE-REF>
</PROCESS-IN-MACHINE-STATE-IREF>
</FUNCTION-GROUP-STATE>
</FUNCTION-GROUP>ara::sm Client API
Applications interact with SM via ara::sm:
#include <ara/sm/state_client.h>
// Request a state change (application can request SM to change state)
ara::sm::StateClient state_client;
// Request that ADASFunctionGroup transitions to Active
auto future = state_client.RequestStateTransition(
"ADASFunctionGroup",
"Active"
);
auto result = future.get();
if (result.HasValue()) {
logger.LogInfo() << "ADAS state transition accepted";
} else {
logger.LogError() << "State transition rejected: " << result.Error().Message();
}
// Subscribe to state change notifications
state_client.SubscribeToStateChange(
"ADASFunctionGroup",
[](const std::string& fg_name, const std::string& new_state) {
logger.LogInfo() << fg_name << " is now in state: " << new_state;
}
);SM ↔ EM ↔ PHM Interaction
Application crash
│
▼
EM detects process died unexpectedly
│
├───► Attempt restart (if restart policy configured)
│ │
│ └─► Max retries exceeded → Escalate to SM
│
└───► Notify PHM of supervised entity failure
│
▼
PHM evaluates recovery action:
─ RecoveryToDefaultState → SM transitions FunctionGroup to Off → SM transitions to Error machine state
─ ResetMachine → SM triggers Restart
─ NotifyApplication → SM notifies a designated watchdog AAEM Privilege Model
EM runs with elevated privileges to:
- Set SCHED_FIFO priorities (requires
CAP_SYS_NICE) - Set CPU affinity (
CAP_SYS_NICE) - Create cgroups for resource isolation (
cgroupfilesystem access) - Start processes with specific UIDs/GIDs from the Execution Manifest
All AAs run with reduced privileges (non-root, applied capabilities only). This is enforced via:
- Linux capabilities (fine-grained privilege control)
- User/group namespaces
- cgroup v2 for CPU and memory budget enforcement
- seccomp filters (optional, restrict available syscalls per process)
Execution Manifest vs Application Manifest vs Service Instance Manifest
| Manifest | Owner | Contains |
|---|---|---|
| Application Manifest | AA developer | Software component topology, version, categories |
| Execution Manifest | System integrator | Process→FunctionGroup binding, scheduling, resources |
| Service Instance Manifest | System integrator | Transport binding config (SOME/IP service IDs, DDS topic mapping) |