# BloxOS Software Architecture

## 1. Operating Principle
BloxOS must treat hardware topology as dynamic, but only within a controlled and testable policy envelope. The OS is responsible for:

- discovering modules
- validating compatibility
- sequencing power and driver activation
- exposing capability changes to applications
- degrading safely when modules disappear or fault

## 2. Delivery Strategy
BloxOS should evolve in stages instead of starting as a ground-up mobile OS.

### Stage 1
Linux-based platform layer with:

- management daemon
- module registry
- userspace services for discovery and policy
- minimal UI for diagnostics and developer workflows

### Stage 2
Mobile UX shell and stable application-facing SDK.

### Stage 3
Optional deeper kernel specialization after real hardware behavior is understood.

Staging the work this way carries less risk than committing early to a microkernel rewrite.

## 3. Core Runtime Components

### 3.1 Management Controller Interface
A privileged system service communicates with the always-on MCU to receive:

- seat and lock events
- electrical readiness
- module identity
- fault and thermal telemetry

The MCU is the first trust boundary. The OS should not assume a newly inserted module is safe until the controller says so.
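As a rough illustration of this gating, the sketch below tracks what the controller has reported for each slot and treats a slot as cleared only once every precondition has arrived. The event names, fields, and the `ControllerGate` class are hypothetical; the actual MCU protocol is not defined in this document.

```ts
// Hypothetical event shapes; the real MCU wire protocol is not specified here.
type ControllerEvent =
  | { kind: "seated"; slot: number; locked: boolean }
  | { kind: "electricalReady"; slot: number }
  | { kind: "identity"; slot: number; moduleId: string }
  | { kind: "fault"; slot: number; code: string };

type SlotStatus = {
  locked: boolean;
  electricalReady: boolean;
  moduleId?: string;
  faulted: boolean;
};

class ControllerGate {
  private slots = new Map<number, SlotStatus>();

  // Fold one controller event into the per-slot status.
  handle(ev: ControllerEvent): void {
    const s: SlotStatus =
      this.slots.get(ev.slot) ?? { locked: false, electricalReady: false, faulted: false };
    switch (ev.kind) {
      case "seated": s.locked = ev.locked; break;
      case "electricalReady": s.electricalReady = true; break;
      case "identity": s.moduleId = ev.moduleId; break;
      case "fault": s.faulted = true; break;
    }
    this.slots.set(ev.slot, s);
  }

  // A slot is cleared only when the controller has reported every precondition
  // and no fault. Unknown slots are never cleared.
  isCleared(slot: number): boolean {
    const s = this.slots.get(slot);
    return !!s && s.locked && s.electricalReady && s.moduleId !== undefined && !s.faulted;
  }
}
```

The default is deliberately "not cleared", so a missed or reordered controller message leaves the module untrusted.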

### 3.2 Module Registry
The registry stores the active device graph and the descriptor for each module:

- identity and class
- power budget
- protocol version
- dependency graph
- health state

The registry is the single source of truth for the rest of the system.
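One possible shape for a registry entry is sketched below. The field names, units, and the `ModuleRegistry` class are illustrative assumptions that mirror the list above; only the module class values are taken from the SDK types later in this document.

```ts
// Illustrative descriptor; field names and units are assumptions.
type ModuleClass = "power" | "compute" | "camera" | "connectivity" | "io" | "sensor";
type Health = "ok" | "degraded" | "faulted";

type ModuleDescriptor = {
  id: string;
  class: ModuleClass;
  powerBudgetMilliwatts: number;
  protocolVersion: string;
  dependsOn: string[]; // ids of modules this one requires
  health: Health;
};

class ModuleRegistry {
  private modules = new Map<string, ModuleDescriptor>();

  upsert(d: ModuleDescriptor): void {
    this.modules.set(d.id, d);
  }

  remove(id: string): void {
    this.modules.delete(id);
  }

  get(id: string): ModuleDescriptor | undefined {
    return this.modules.get(id);
  }

  // Dependencies of a module that are not currently registered.
  missingDependencies(id: string): string[] {
    const d = this.modules.get(id);
    if (!d) return [];
    return d.dependsOn.filter((dep) => !this.modules.has(dep));
  }
}
```

Keeping dependency resolution inside the registry means the policy engine and driver host can both ask the same question and get the same answer.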

### 3.3 Policy Engine
The policy engine decides whether a module is:

1. admitted and activated
2. admitted with restrictions
3. rejected and isolated

Example policy checks:

- power budget available
- protocol compatibility
- required dependencies present
- security verification passed
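The checks above could be combined into a single decision function along the following lines. This is a sketch under stated assumptions: the input fields are hypothetical, and the split between hard failures (reject) and soft failures (admit with restrictions) is one plausible policy, not one this document prescribes.

```ts
// Hypothetical inputs, one per example policy check.
type PolicyInput = {
  requestedMilliwatts: number;
  availableMilliwatts: number;
  protocolCompatible: boolean;
  dependenciesPresent: boolean;
  securityVerified: boolean;
};

type PolicyDecision =
  | { outcome: "admit" }
  | { outcome: "admitRestricted"; reasons: string[] }
  | { outcome: "reject"; reasons: string[] };

function decide(input: PolicyInput): PolicyDecision {
  // Assumption: security and protocol failures are never recoverable by restriction.
  const hard: string[] = [];
  if (!input.securityVerified) hard.push("security verification failed");
  if (!input.protocolCompatible) hard.push("incompatible protocol");
  if (hard.length > 0) return { outcome: "reject", reasons: hard };

  // Assumption: power and dependency shortfalls allow restricted admission.
  const soft: string[] = [];
  if (input.requestedMilliwatts > input.availableMilliwatts) soft.push("power budget exceeded");
  if (!input.dependenciesPresent) soft.push("missing dependencies");
  return soft.length > 0
    ? { outcome: "admitRestricted", reasons: soft }
    : { outcome: "admit" };
}
```

Returning the reasons alongside the outcome lets the same value feed the policy decision log.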

### 3.4 Driver Host Layer
Prefer userspace driver processes where practical. Required properties:

- crash isolation
- restartable services
- strict timeout handling on insertion and removal
- capability publication only after successful initialization
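The last two properties can be sketched together: initialization runs under a deadline, and capabilities are published only if it completes in time. The helper names `withTimeout` and `bindDriver`, and the idea that a driver's init resolves with a capability list, are assumptions made for illustration.

```ts
// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Hypothetical bind step: `init` resolves with the capabilities the driver provides.
async function bindDriver(
  init: () => Promise<string[]>,
  publish: (capabilities: string[]) => void,
  timeoutMs: number,
): Promise<boolean> {
  try {
    const capabilities = await withTimeout(init(), timeoutMs);
    publish(capabilities); // reached only after successful initialization
    return true;
  } catch {
    return false; // nothing is published on failure or timeout
  }
}
```

A timed-out `init` may still be running in this sketch; a real driver host would also need to terminate or restart the driver process, which is where crash isolation comes in.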

### 3.5 Capability Bus
Applications should subscribe to high-level capabilities, not raw hardware churn. For example:

- `camera.available`
- `battery.swap_state`
- `io.gamepad.connected`

This prevents applications from depending on fragile device-specific logic.
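A minimal in-process sketch of such a bus follows. It models each capability as a boolean for simplicity, which is an assumption: a capability like `battery.swap_state` would likely carry a richer value. The `CapabilityBus` class and its method names are hypothetical.

```ts
type CapabilityListener = (available: boolean) => void;

class CapabilityBus {
  private listeners = new Map<string, Set<CapabilityListener>>();
  private state = new Map<string, boolean>();

  // Returns an unsubscribe function.
  subscribe(capability: string, cb: CapabilityListener): () => void {
    const set = this.listeners.get(capability) ?? new Set<CapabilityListener>();
    set.add(cb);
    this.listeners.set(capability, set);
    return () => { set.delete(cb); };
  }

  // Notify subscribers only when the value actually changes, so repeated
  // hardware events do not reach applications as churn.
  publish(capability: string, available: boolean): void {
    if (this.state.get(capability) === available) return;
    this.state.set(capability, available);
    this.listeners.get(capability)?.forEach((cb) => cb(available));
  }
}

// Usage: an app reacts to the camera appearing or disappearing.
const bus = new CapabilityBus();
const unsubscribe = bus.subscribe("camera.available", (available) => {
  console.log(available ? "camera ready" : "camera gone");
});
bus.publish("camera.available", true);
unsubscribe();
```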

## 4. Module State Machine
Every module must move through a strict lifecycle:

1. `Detected`
2. `Seated`
3. `ElectricallyQualified`
4. `Authenticated`
5. `Enumerated`
6. `DriverBound`
7. `Active`
8. `Degraded` or `Faulted`
9. `Removed`

Skipping states is not allowed: a module may reach `Active` only by passing through each preceding state in order. This is essential for reproducibility and debugging.
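The lifecycle can be enforced with an explicit transition check, sketched below. The ordered bring-up path comes from the list above; the remaining rules are assumptions this document does not spell out: `Faulted` and `Removed` are treated as reachable from any live state, `Degraded` is entered only from `Active` and may recover to it, and `Removed` is terminal.

```ts
type ModuleState =
  | "Detected" | "Seated" | "ElectricallyQualified" | "Authenticated"
  | "Enumerated" | "DriverBound" | "Active"
  | "Degraded" | "Faulted" | "Removed";

// Bring-up path; each state may only advance to the next entry.
const BRING_UP: ModuleState[] = [
  "Detected", "Seated", "ElectricallyQualified", "Authenticated",
  "Enumerated", "DriverBound", "Active",
];

function canTransition(from: ModuleState, to: ModuleState): boolean {
  if (from === to) return false;
  if (from === "Removed") return false;                  // assumed terminal
  if (to === "Removed" || to === "Faulted") return true; // assumed reachable from any live state
  if (to === "Degraded") return from === "Active";       // assumed entry point
  if (from === "Degraded") return to === "Active";       // assumed recovery path
  const i = BRING_UP.indexOf(from);
  return i >= 0 && BRING_UP[i + 1] === to;               // no skipping
}
```

Rejecting every transition that is not explicitly allowed keeps the hardware event timeline easy to audit: any state seen in a log implies the full path that led to it.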

## 5. Hot-Swap Behavior

### 5.1 Battery Modules
Supported in MVP with:

- power budget recalculation
- non-critical service throttling during swap
- explicit degraded-power UI state
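To make the first two items concrete, the sketch below recalculates which services fit within a reduced budget during a swap. The `Consumer` type, the milliwatt figures, and the greedy admit-in-order strategy are all assumptions for illustration.

```ts
type Consumer = { name: string; milliwatts: number; critical: boolean };

type ThrottlePlan = { keep: string[]; throttle: string[]; degradedPower: boolean };

// Critical services are always kept; non-critical services are kept in list
// order while the remaining budget allows, and throttled otherwise.
function planThrottle(consumers: Consumer[], budgetMilliwatts: number): ThrottlePlan {
  const keep: string[] = [];
  const throttle: string[] = [];
  let used = 0;

  for (const c of consumers.filter((x) => x.critical)) {
    keep.push(c.name);
    used += c.milliwatts;
  }
  for (const c of consumers.filter((x) => !x.critical)) {
    if (used + c.milliwatts <= budgetMilliwatts) {
      keep.push(c.name);
      used += c.milliwatts;
    } else {
      throttle.push(c.name);
    }
  }
  // The UI can surface the degraded-power state whenever anything was throttled.
  return { keep, throttle, degradedPower: throttle.length > 0 };
}

// Usage with made-up numbers: a 1500 mW budget while one battery is out.
const plan = planThrottle(
  [
    { name: "modem", milliwatts: 800, critical: true },
    { name: "display", milliwatts: 600, critical: false },
    { name: "background-sync", milliwatts: 500, critical: false },
  ],
  1500,
);
console.log(plan.throttle); // ["background-sync"]
```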

### 5.2 Camera and Accessory Modules
Supported if:

- active sessions can be stopped cleanly
- app capability changes are broadcast deterministically
- removal never crashes the system compositor or media stack

### 5.3 Compute Modules
Not supported as live swap in MVP. The software path should support replacement only through a reboot or a service workflow.

## 6. Security Model
- root of trust anchored in the core controller and secure element
- signed firmware for privileged modules
- attestation or cryptographic challenge for trusted module classes
- least-privilege driver processes
- hard isolation for unknown or electrically suspicious modules

## 7. SDK Direction
Expose a narrow stable API:

```ts
type ModuleInfo = {
  id: string;
  class: "power" | "compute" | "camera" | "connectivity" | "io" | "sensor";
  state: "active" | "degraded" | "faulted";
  capabilities: string[];
};

interface Blox {
  getModules(): Promise<ModuleInfo[]>;
  onCapabilityChange(cb: (capability: string) => void): () => void;
  requestModuleLock(moduleId: string, lock: boolean): Promise<boolean>;
}
```

Applications should not directly control power sequencing or unsafe module actions.
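An application written against this API might look like the sketch below. The types are repeated from the block above so the example is self-contained. `createFakeBlox` is a hypothetical in-memory stand-in for tests, and its `requestModuleLock` behavior (resolving `true` when the module exists) is an assumption, since the real semantics are not specified here.

```ts
type ModuleInfo = {
  id: string;
  class: "power" | "compute" | "camera" | "connectivity" | "io" | "sensor";
  state: "active" | "degraded" | "faulted";
  capabilities: string[];
};

interface Blox {
  getModules(): Promise<ModuleInfo[]>;
  onCapabilityChange(cb: (capability: string) => void): () => void;
  requestModuleLock(moduleId: string, lock: boolean): Promise<boolean>;
}

// Hypothetical fake for exercising app code without hardware.
function createFakeBlox(modules: ModuleInfo[]): Blox & { emit(capability: string): void } {
  const listeners = new Set<(capability: string) => void>();
  return {
    getModules: async (): Promise<ModuleInfo[]> => modules,
    onCapabilityChange(cb: (capability: string) => void): () => void {
      listeners.add(cb);
      return () => { listeners.delete(cb); };
    },
    // Assumption: the request succeeds whenever the module is known.
    requestModuleLock: async (moduleId: string, _lock: boolean): Promise<boolean> =>
      modules.some((m) => m.id === moduleId),
    emit(capability: string): void {
      listeners.forEach((cb) => cb(capability));
    },
  };
}

// App-side helper: list the ids of usable camera modules.
async function activeCameraIds(blox: Blox): Promise<string[]> {
  const modules = await blox.getModules();
  return modules
    .filter((m) => m.class === "camera" && m.state === "active")
    .map((m) => m.id);
}
```

The application code depends only on `Blox`, so the same helper runs unchanged against the fake and against a real platform implementation.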

## 8. Observability
The platform requires deep diagnostics from day one:

- hardware event timeline
- per-module fault counters
- insertion and removal reason codes
- brownout and thermal incidents
- policy decision logs

Without this, bring-up and field debugging will stall.
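As a starting point, the first three items could share one record type, sketched below. The `TimelineEvent` fields, the `kind` values, and the `EventTimeline` class are assumptions for illustration; a real implementation would also need persistence and bounded storage.

```ts
// Hypothetical record shape covering the diagnostics listed above.
type TimelineEvent = {
  timestampMs: number;
  moduleId: string;
  kind: "insertion" | "removal" | "fault" | "brownout" | "thermal" | "policy";
  reasonCode: string;
};

class EventTimeline {
  private events: TimelineEvent[] = [];
  private faultCounts = new Map<string, number>();

  record(ev: TimelineEvent): void {
    this.events.push(ev);
    if (ev.kind === "fault") {
      this.faultCounts.set(ev.moduleId, (this.faultCounts.get(ev.moduleId) ?? 0) + 1);
    }
  }

  // Per-module fault counter.
  faultsFor(moduleId: string): number {
    return this.faultCounts.get(moduleId) ?? 0;
  }

  // Events for one module in timestamp order; the stored list is not mutated.
  history(moduleId: string): TimelineEvent[] {
    return this.events
      .filter((e) => e.moduleId === moduleId)
      .sort((a, b) => a.timestampMs - b.timestampMs);
  }
}
```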
