API Reference

Endpoint

POST /vitallens-v3/file

Base URL: https://api.rouast.com
Content-Type: application/json

Request

Headers

Header	Value	Description
`x-api-key`	string	Required. Your unique API key.
`Content-Type`	`application/json`	Required.
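
As an illustration of the headers above, here is a minimal sketch that builds (but does not send) the request with Python's standard library. The helper name `make_request` and the `"YOUR_API_KEY"` placeholder are illustrative assumptions, not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://api.rouast.com"
ENDPOINT = "/vitallens-v3/file"


def make_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request carrying the two required headers (hypothetical helper)."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder body; a real call needs a Base64-encoded video string.
req = make_request("YOUR_API_KEY", {"video": "..."})
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a key.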

Body Parameters

The body must be a JSON object.

Parameter	Type	Required	Description
`video`	string	Yes	Base64-encoded string of raw RGB24 video bytes. • Resolution: 40x40 • Shape: `(frames, 40, 40, 3)`
`fps`	float	Conditional	Frames per second (e.g., `30.0`). Required if `process_signals` is `true`. Used to detrend signals and calculate rates.
`process_signals`	boolean	No	Default: `false`. • `true`: Calculates global vitals (HR, RR, HRV) based on video duration. • `false`: Returns raw waveforms only.
`state`	string	No	Base64-encoded float32 string representing the LSTM internal state. Used for streaming long videos in chunks (see below).
`model`	string	No	Specific model version to use. If omitted, the API selects the best model available for your plan.
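
The parameters above could be assembled roughly as follows. This is a sketch under assumptions: `build_body` is a hypothetical helper, and the frames are all-zero stand-in bytes rather than real video. It checks client-side the two conditions the table states (whole 40x40x3 frames, and `fps` present when `process_signals` is `true`).

```python
import base64

FRAME_BYTES = 40 * 40 * 3  # bytes in one 40x40 RGB24 frame


def build_body(raw_rgb: bytes, fps=None, process_signals=False, state=None, model=None) -> dict:
    """Assemble the JSON body from raw RGB24 bytes (hypothetical helper)."""
    if len(raw_rgb) == 0 or len(raw_rgb) % FRAME_BYTES != 0:
        raise ValueError("video bytes must be a whole number of 40x40x3 frames")
    if process_signals and fps is None:
        raise ValueError("fps is required when process_signals is true")
    body = {
        "video": base64.b64encode(raw_rgb).decode("ascii"),
        "process_signals": process_signals,
    }
    # Optional parameters are only included when the caller supplies them.
    if fps is not None:
        body["fps"] = float(fps)
    if state is not None:
        body["state"] = state
    if model is not None:
        body["model"] = model
    return body


frames = bytes(16 * FRAME_BYTES)  # 16 black frames as stand-in data
body = build_body(frames, fps=30.0, process_signals=True)
```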

Available Models

Capabilities depend on the model version, which is selected with the `model` parameter.

Model	Capabilities	Notes
`vitallens-2.0`	HR, RR, HRV (SDNN, RMSSD, LF/HF)	Recommended. Best accuracy.
`vitallens-1.1`	HR, RR	Legacy support.
`vitallens-1.0`	HR, RR	Legacy support.

HTTP Status Codes

The API uses standard HTTP status codes to indicate whether the request was parsed and processed successfully.

Code	Meaning	Common Causes
`200`	OK	Processing successful. Note: This does not guarantee that a valid face was found or that the vital signs are reliable. You must check the `processing_status` object in the response body to validate the result.
`400`	Bad Request	Missing `fps` when `process_signals=true`, or `video` is missing, too short, or too long.
`403`	Forbidden	Invalid API Key, or plan does not support the requested `model`.
`422`	Unprocessable	Video Formatting Error. The byte string length does not match `frames * 40 * 40 * 3`.
`429`	Quota Exceeded	You have run out of frame credits for your billing cycle.
`500`	Server Error	Unexpected internal error.
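
One way a client might turn the codes above into actionable errors is sketched below. `ApiError` and `check_status` are hypothetical names, and the hint strings simply paraphrase the table.

```python
# Hints paraphrased from the status code table; not returned by the API itself.
CAUSES = {
    400: "bad request: check fps, and that video is present and of valid length",
    403: "forbidden: check the API key and whether the plan supports the model",
    422: "unprocessable: video byte length must equal frames * 40 * 40 * 3",
    429: "quota exceeded: no frame credits left in this billing cycle",
    500: "server error: unexpected internal error",
}


class ApiError(Exception):
    """Raised for any non-200 status (hypothetical client-side type)."""


def check_status(code: int) -> None:
    """Return silently on 200, otherwise raise ApiError(code, hint)."""
    if code == 200:
        return
    raise ApiError(code, CAUSES.get(code, "unexpected status code"))
```

A `200` only means the request was processed, so the response body still needs the `processing_status` check described in the table.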

Response Structure

The API returns a JSON object containing estimates, confidence scores, and processing metadata.

Example Response

Actual values from the quickstart example:

{
  "vital_signs": {
    "ppg_waveform": {
      "data": [0.2094, 0.2895, 0.3912, "..."],
      "unit": "unitless",
      "confidence": [1.0, 1.0, 1.0, "..."],
      "note": "Processed estimate of the PPG waveform..."
    },
    "respiratory_waveform": {
      "data": [-0.2876, -0.2961, -0.2975, "..."],
      "unit": "unitless",
      "confidence": [0.9934, 0.9819, 0.9959, "..."],
      "note": "Processed estimate of the respiratory waveform..."
    },
    "heart_rate": {
      "value": 75.5167,
      "unit": "bpm",
      "confidence": 0.8885,
      "note": "Global estimate of heart rate..."
    },
    "respiratory_rate": {
      "value": 8.0018,
      "unit": "bpm",
      "confidence": 0.9681,
      "note": "Global estimate of respiratory rate..."
    },
    "hrv_sdnn": {
      "value": null,
      "unit": "ms",
      "confidence": null,
      "note": "Did not estimate HRV (SDNN) because provided video was too short..."
    },
    "hrv_rmssd": {
      "value": null,
      "unit": "ms",
      "confidence": null,
      "note": "Did not estimate HRV (RMSSD) because provided video was too short..."
    }
  },
  "processing_status": {
    "face_detected": true,
    "avg_face_confidence": 0.9773,
    "signal_quality": "optimal",
    "issues": []
  },
  "face": {
    "confidence": [0.8499, 0.9562, 0.9536, "..."],
    "note": "Confidence whether a live face is present in the provided video."
  },
  "state": {
    "data": "...",
    "note": "Provide in the next call if continuing with the same video."
  },
  "model_used": "vitallens-2.0",
  "message": "The provided values are estimates..."
}
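
To show how the structure above might be consumed, the sketch below separates waveform entries (which carry a `data` array) from global estimates (which carry a single `value`). The helper `split_vitals` is hypothetical, and `sample` is a trimmed copy of the example response, not a new result.

```python
def split_vitals(response: dict):
    """Split vital_signs into waveforms and global (value, unit) pairs."""
    waveforms, scalars = {}, {}
    for key, entry in response.get("vital_signs", {}).items():
        if "data" in entry:
            waveforms[key] = entry["data"]
        else:
            scalars[key] = (entry.get("value"), entry.get("unit"))
    return waveforms, scalars


# Trimmed copy of the example response above.
sample = {
    "vital_signs": {
        "ppg_waveform": {"data": [0.2094, 0.2895, 0.3912], "unit": "unitless"},
        "heart_rate": {"value": 75.5167, "unit": "bpm", "confidence": 0.8885},
        "hrv_sdnn": {"value": None, "unit": "ms", "confidence": None},
    }
}
waveforms, scalars = split_vitals(sample)
```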

Data Availability

The API returns different data points depending on the video length and whether `process_signals` is enabled.

Vital Sign	JSON Key	Type	Required Duration	Returned If
PPG Waveform	`ppg_waveform`	Continuous waveform	N/A (Always)	Always
Respiratory Waveform	`respiratory_waveform`	Continuous waveform	N/A (Always)	Always
Heart Rate	`heart_rate`	Global value	≥ 5s	`process_signals=true`
Respiratory Rate	`respiratory_rate`	Global value	≥ 10s	`process_signals=true`
HRV (SDNN)	`hrv_sdnn`	Global value	≥ 20s	`process_signals=true` & Model 2.0
HRV (RMSSD)	`hrv_rmssd`	Global value	≥ 20s	`process_signals=true` & Model 2.0
HRV (LF/HF)	`hrv_lfhf`	Global value	≥ 55s	`process_signals=true` & Model 2.0
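
The table can be read as a simple rule set. The sketch below encodes it to predict which keys have their duration and model conditions met for a given clip; `estimable_keys` is a hypothetical helper. As the example response shows, a key may still appear in the response with a null value when its condition is not met.

```python
# Minimum clip duration in seconds per global value, copied from the table.
REQUIRED_SECONDS = {
    "heart_rate": 5,
    "respiratory_rate": 10,
    "hrv_sdnn": 20,
    "hrv_rmssd": 20,
    "hrv_lfhf": 55,
}
HRV_KEYS = {"hrv_sdnn", "hrv_rmssd", "hrv_lfhf"}  # require vitallens-2.0


def estimable_keys(n_frames: int, fps: float, process_signals: bool,
                   model: str = "vitallens-2.0") -> list:
    """List keys whose duration and model conditions are met (hypothetical helper)."""
    keys = ["ppg_waveform", "respiratory_waveform"]  # always returned
    if not process_signals:
        return keys
    duration = n_frames / fps
    for key, seconds in REQUIRED_SECONDS.items():
        if key in HRV_KEYS and model != "vitallens-2.0":
            continue
        if duration >= seconds:
            keys.append(key)
    return keys
```

For example, 300 frames at 30 fps is a 10-second clip, which is long enough for heart rate and respiratory rate but not for any HRV metric.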

Null Values

If the video meets the Required Duration but the signal quality is too low (e.g., excessive motion), these values will be returned as `null` or `NaN`. Always check the `processing_status` object.
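
Because a value can arrive as either null or NaN, a client check should cover both. The sketch below assumes NaN appears as a bare `NaN` token in the body, which Python's `json.loads` accepts by default; the helper `usable` and the sample string are illustrative.

```python
import json
import math


def usable(value) -> bool:
    """True only for a real number: rejects None (JSON null) and NaN."""
    if value is None:
        return False
    return not (isinstance(value, float) and math.isnan(value))


# Illustrative body fragment, not actual API output.
raw = '{"heart_rate": {"value": NaN}, "hrv_sdnn": {"value": null}, "respiratory_rate": {"value": 8.0018}}'
parsed = json.loads(raw)
good = {key: entry["value"] for key, entry in parsed.items() if usable(entry["value"])}
```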

Processing Status & Quality

When `process_signals=true`, the response includes a `processing_status` object. Check it before displaying any vital signs to the user.

face_detected: true if the average face confidence exceeds 50% (0.5). If false, all vitals are invalid.
signal_quality:
- optimal: High confidence in both PPG and Respiratory signals.
- suboptimal: Face detected, but one signal is weak (e.g., good HR but noisy RR). Check issues.
- low: Face detected, but overall signal quality is too low to be reliable.
- unusable: No face detected or extreme noise. This is often an indication that preprocessing is faulty (see Preprocessing Guide).
issues: Array of specific warnings, e.g., ["no_face_detected"], ["low_ppg_quality"], ["low_respiratory_quality"].
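
A client-side gate based on these fields might look like the sketch below. The policy is an assumption, not something the API prescribes: it shows vitals for `optimal` and `suboptimal` quality (returning the `issues` list so weak signals can be flagged) and hides them otherwise. `review_status` is a hypothetical helper.

```python
SHOWABLE = {"optimal", "suboptimal"}  # assumed policy, adjust to your product


def review_status(status: dict):
    """Return (show, issues) for a processing_status object."""
    issues = list(status.get("issues", []))
    if not status.get("face_detected", False):
        return False, issues  # no face: all vitals are invalid
    return status.get("signal_quality") in SHOWABLE, issues


show, issues = review_status({
    "face_detected": True,
    "avg_face_confidence": 0.9773,
    "signal_quality": "optimal",
    "issues": [],
})
```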

State Management (Streaming)

The VitalLens model uses a stateful architecture that maintains temporal context across frames. To process a long video (e.g., 60 seconds) in small chunks (e.g., 1 second at a time) without losing this context:

Request 1: Send the first chunk of video.
Response 1: API returns a state string.
Request 2: Send the second chunk of video AND the state string from Response 1.
Repeat: Keep passing the state from each response into the next request. This mimics a continuous video stream.
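
The request/response loop above can be sketched as follows. `send` is a hypothetical callable that posts a body and returns the parsed JSON response; the state is read from `state.data`, as in the example response.

```python
import base64


def stream_chunks(chunks, send):
    """Send raw RGB24 chunks in order, threading the state between calls."""
    state = None
    responses = []
    for chunk in chunks:
        body = {"video": base64.b64encode(chunk).decode("ascii")}
        if state is not None:
            body["state"] = state  # state returned by the previous response
        response = send(body)
        state = response["state"]["data"]
        responses.append(response)
    return responses
```

Injecting `send` keeps the loop independent of any HTTP library and lets it be exercised with a stub before spending frame credits.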

Constraint

For stateless requests (no state provided), the video must contain at least 16 frames. For stateful requests (with state), the video chunk can be as small as 5 frames.
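
A chunk planner that respects these minimums could look like the sketch below. `plan_chunks` is a hypothetical helper. It widens the first chunk to 16 frames and merges a tail shorter than 5 frames into the preceding chunk. It does not enforce a maximum chunk length, since none is stated here.

```python
MIN_STATELESS_FRAMES = 16  # first request, no state
MIN_STATEFUL_FRAMES = 5    # later requests, with state


def plan_chunks(total_frames: int, chunk_frames: int) -> list:
    """Return (start, end) frame ranges, end exclusive, honoring both minimums."""
    if total_frames < MIN_STATELESS_FRAMES:
        raise ValueError("need at least 16 frames to start without a state")
    ranges = []
    start = 0
    size = max(chunk_frames, MIN_STATELESS_FRAMES)
    while start < total_frames:
        end = min(start + size, total_frames)
        # A leftover tail below the stateful minimum is merged into this chunk.
        if 0 < total_frames - end < MIN_STATEFUL_FRAMES:
            end = total_frames
        ranges.append((start, end))
        start = end
        size = max(chunk_frames, MIN_STATEFUL_FRAMES)
    return ranges
```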