1. Overview
VitaeFlow is an open standard for embedding structured resume data inside PDF files. It uses the PDF/A-3 associated files standard (ISO 19005-3) to attach a JSON file inside the PDF — the same standard used by Factur-X for electronic invoices, adapted for resume data.
A VitaeFlow PDF is a standard PDF that contains an embedded JSON file named vitaeflow.json. The PDF remains fully readable by any viewer. Tools that understand VitaeFlow can extract the structured data; others simply ignore it.
The .vf.pdf file extension is recommended for discoverability but not required. The embedded data is what makes a PDF a VitaeFlow document, not the filename.
2. Schema
The resume data follows a JSON Schema (draft 2020-12) defined in schema.json. The schema is the source of truth — refer to it for the exhaustive list of fields and constraints.
Required top-level fields
| Field | Type | Description |
|---|---|---|
| version | string | Schema version, e.g. "0.1". Format: major.minor |
| profile | string | Always "standard" |
| basics | object | Core identity. Requires givenName, familyName, email |
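The required fields in the table above can be checked by hand before running a full schema validator. The following is a minimal sketch, not a substitute for validating against schema.json: the function name check_required is hypothetical, and the error shape simply mirrors the path/message pairs this document uses for validation errors.

```python
import re

# Assumption: "major.minor" means two plain decimal integers joined by a dot.
VERSION_RE = re.compile(r"\d+\.\d+")
REQUIRED_TOP_LEVEL = ("version", "profile", "basics")
REQUIRED_BASICS = ("givenName", "familyName", "email")


def check_required(resume: dict) -> list[dict]:
    """Return {"path", "message"} errors for the required top-level fields."""
    errors = []
    for field in REQUIRED_TOP_LEVEL:
        if field not in resume:
            # "" is the JSON Pointer for the document root.
            errors.append({"path": "", "message": f"Missing required property: {field}"})

    version = resume.get("version")
    if isinstance(version, str) and not VERSION_RE.fullmatch(version):
        errors.append({"path": "/version", "message": "Expected format major.minor"})

    if "profile" in resume and resume["profile"] != "standard":
        errors.append({"path": "/profile", "message": 'Expected "standard"'})

    basics = resume.get("basics")
    if isinstance(basics, dict):
        for field in REQUIRED_BASICS:
            if field not in basics:
                errors.append(
                    {"path": "/basics", "message": f"Missing required property: {field}"}
                )
    return errors
```

The function collects every problem it finds instead of stopping at the first, which matches the "return all errors" expectation placed on implementations.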
Optional sections
All other sections are arrays of objects. Include what you have, omit the rest:
work, education, skills, languages, certifications, projects, publications, volunteer, references, interests, custom
Shared types
- date: YYYY, YYYY-MM, or YYYY-MM-DD. Pattern: ^\d{4}(-(?:0[1-9]|1[0-2])(-(?:0[1-9]|[12]\d|3[01]))?)?$
- countryCode: ISO 3166-1 alpha-2, e.g. "FR", "US". Pattern: ^[A-Z]{2}$
- fluency: A1, A2, B1, B2, C1, C2, native, bilingual
- level: beginner, intermediate, advanced, expert
Example
{
  "version": "0.1",
  "profile": "standard",
  "lang": "en",
  "basics": {
    "givenName": "Marie",
    "familyName": "Laurent",
    "email": "[email protected]",
    "headline": "Lead Developer"
  },
  "work": [
    {
      "organization": "TechCorp",
      "position": "Lead Developer",
      "startDate": "2021-03"
    }
  ]
}
3. PDF embedding
VitaeFlow uses the PDF/A-3 associated files mechanism (ISO 19005-3) to embed the JSON data. This is the same standard mechanism used by Factur-X for electronic invoices.
Constants
| Constant | Value |
|---|---|
| Filename | vitaeflow.json |
| MIME type | application/json |
| AFRelationship | /Alternative |
| Description | VitaeFlow structured resume data |
PDF structure
The JSON file is attached to the PDF using two complementary mechanisms: the EmbeddedFiles name tree (PDF 1.7, ISO 32000-1 §7.11.4) and the AF array (PDF/A-3, ISO 19005-3).
Catalog {
  Names: {
    EmbeddedFiles: {
      Names: [
        (vitaeflow.json) <FileSpec ref>
      ]
    }
  }
  AF: [ <FileSpec ref> ]
  Metadata: <XMP stream ref>
}
EmbeddedFiles name tree: stores the file reference by name. Name trees can be hierarchical (with Kids arrays) or flat (with a Names array). Implementations must handle both. The Names array contains alternating name/reference pairs: ["vitaeflow.json", <ref>].
AF array: PDF/A-3 requires the file reference to also appear in the catalog-level AF (Associated Files) array. This enables conforming readers to discover embedded files without traversing the name tree.
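Handling both flat and hierarchical name trees comes down to a short recursive walk. The sketch below assumes the PDF objects have already been parsed into plain Python dicts and lists with indirect references resolved; a real PDF library will expose its own object types. The function name find_in_name_tree is hypothetical.

```python
from typing import Any, Optional


def find_in_name_tree(node: dict, name: str) -> Optional[Any]:
    """Look up `name` in a name tree node modelled as a plain dict.

    A node holds a flat "Names" array of alternating name/value entries,
    a "Kids" array of child nodes, or (handled defensively) both.
    """
    names = node.get("Names", [])
    # Walk the flat array two entries at a time: [name0, ref0, name1, ref1, ...]
    for key, value in zip(names[0::2], names[1::2]):
        if key == name:  # exact, case-sensitive comparison
            return value
    # Descend into child nodes for hierarchical trees.
    for kid in node.get("Kids", []):
        found = find_in_name_tree(kid, name)
        if found is not None:
            return found
    return None
```

This version visits every node. It ignores the optional Limits entries that would let a reader skip subtrees, and it does not guard against malformed trees that contain cycles.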
FileSpec dictionary
The file is described by a FileSpec dictionary (PDF 2.0 §7.11.3). The JSON content is stored as a compressed stream
in the EF.F entry.
FileSpec Dictionary {
  Type: /Filespec
  F: (vitaeflow.json)
  Desc: (VitaeFlow structured resume data)
  AFRelationship: /Alternative
  EF: {
    F: <stream>   % JSON content, UTF-8 encoded
  }
}
AFRelationship is set to /Alternative, indicating that the embedded file is an alternative representation of the document content: the same resume in a different format.
Stream encoding — the JSON content must be valid UTF-8. The stream may be compressed using standard PDF filters (typically FlateDecode). Implementations must decompress the stream before parsing.
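The decode path described here (apply the stream filters, decode UTF-8, parse JSON) can be sketched with the Python standard library alone. The function name and the filters parameter are assumptions for illustration; a PDF library would normally supply the filter list from the stream dictionary, and only FlateDecode is handled.

```python
import json
import zlib
from typing import Any


def decode_embedded_json(raw: bytes, filters: list[str]) -> Any:
    """Decode the raw bytes of an EF.F stream into parsed JSON.

    `filters` lists the stream's PDF filter names in application order,
    e.g. ["FlateDecode"]. An empty list means the stream is uncompressed.
    """
    data = raw
    for name in filters:
        if name == "FlateDecode":
            data = zlib.decompress(data)  # FlateDecode data is zlib-wrapped deflate
        else:
            raise ValueError(f"Unsupported filter: {name}")
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    return json.loads(text)      # raises json.JSONDecodeError on invalid JSON
```

Letting the decoding and parsing exceptions propagate keeps the three failure modes (unsupported filter, bad encoding, bad JSON) distinguishable for the caller.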
Embedding steps
- 1 Validate the resume data against the schema in strict mode. Reject if invalid.
- 2 Remove any existing attachment named vitaeflow.json to prevent duplicates.
- 3 Create the FileSpec dictionary with the JSON content as a compressed UTF-8 stream.
- 4 Register the FileSpec in the EmbeddedFiles name tree and the catalog AF array.
- 5 Write XMP metadata to the document catalog (see next section).
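Steps 2 and 4 are the ones that touch the catalog, and they can be sketched on a catalog modelled as plain dicts and lists. This is an illustration under stated assumptions, not how any particular PDF library works: register_attachment is a hypothetical name, the EmbeddedFiles name tree is assumed to be flat, and object references are represented by the dicts themselves.

```python
FILENAME = "vitaeflow.json"


def register_attachment(catalog: dict, filespec: dict) -> None:
    """Replace any existing vitaeflow.json entry and register `filespec`.

    Assumes a flat EmbeddedFiles name tree (a single "Names" array).
    """
    tree = catalog.setdefault("Names", {}).setdefault("EmbeddedFiles", {})
    names = tree.get("Names", [])
    pairs = list(zip(names[0::2], names[1::2]))

    # Step 2: drop earlier attachments with the same name, and their AF entries.
    stale = [spec for key, spec in pairs if key == FILENAME]
    pairs = [(key, spec) for key, spec in pairs if key != FILENAME]
    af = [spec for spec in catalog.get("AF", []) if not any(spec is s for s in stale)]

    # Step 4: register the new FileSpec in the name tree and the AF array.
    pairs.append((FILENAME, filespec))
    pairs.sort(key=lambda pair: pair[0])  # name tree keys are kept in sorted order
    tree["Names"] = [item for pair in pairs for item in pair]
    catalog["AF"] = af + [filespec]
```

Removing the stale AF entry together with the name tree entry keeps the two registration points in agreement, so a reader that consults only one of them still finds exactly one attachment.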
4. XMP metadata
VitaeFlow writes custom XMP metadata into the PDF's metadata stream. This allows tools to identify VitaeFlow documents and read basic information without extracting and parsing the full JSON attachment.
XMP namespace
Namespace URI: urn:vitaeflow:pdfa:resume:1p0#
Prefix: vf
Properties
| Property | Value | Source |
|---|---|---|
| DocumentType | "RESUME" | Constant |
| Version | e.g. "0.1" | From resume.version |
| ConformanceLevel | "standard" | From resume.profile |
| Generator | e.g. "MyApp/1.0" | From resume.meta.generator or SDK default |
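The property table maps directly onto a small function that derives the four values from the resume data and serializes them as a new XMP packet. This is a minimal sketch for the case where the PDF has no existing XMP; merging into an existing RDF block is not shown. The name build_xmp_packet and the DEFAULT_GENERATOR value are hypothetical.

```python
from xml.sax.saxutils import escape

XMP_NAMESPACE = "urn:vitaeflow:pdfa:resume:1p0#"
DEFAULT_GENERATOR = "ExampleSDK/0.0"  # hypothetical SDK default


def build_xmp_packet(resume: dict) -> str:
    """Build a standalone XMP packet carrying the VitaeFlow properties."""
    props = {
        "DocumentType": "RESUME",
        "Version": resume["version"],
        "ConformanceLevel": resume["profile"],
        "Generator": resume.get("meta", {}).get("generator", DEFAULT_GENERATOR),
    }
    # escape() replaces &, < and > so values cannot break the XML.
    body = "\n".join(
        f"      <vf:{key}>{escape(str(value))}</vf:{key}>"
        for key, value in props.items()
    )
    return (
        '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
        '<x:xmpmeta xmlns:x="adobe:ns:meta/">\n'
        '  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n'
        f'    <rdf:Description rdf:about="" xmlns:vf="{XMP_NAMESPACE}">\n'
        f"{body}\n"
        "    </rdf:Description>\n"
        "  </rdf:RDF>\n"
        "</x:xmpmeta>\n"
        '<?xpacket end="w"?>'
    )
```

Escaping matters mostly for Generator, since it is the only property whose value comes from free-form input.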
XMP template
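If the PDF already contains XMP metadata, merge the VitaeFlow properties into the existing RDF block. If no XMP exists, create a new metadata stream with the structure below. In the template, \uFEFF stands for the literal U+FEFF character (byte order mark), not the six-character escape sequence.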
<?xpacket begin="\uFEFF" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:vf="urn:vitaeflow:pdfa:resume:1p0#">
      <vf:DocumentType>RESUME</vf:DocumentType>
      <vf:Version>0.1</vf:Version>
      <vf:ConformanceLevel>standard</vf:ConformanceLevel>
      <vf:Generator>MyApp/1.0</vf:Generator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
5. Extraction
Detection
To quickly check if a PDF contains VitaeFlow data without parsing the JSON:
- 1 Navigate to Catalog → Names → EmbeddedFiles
- 2 Traverse the name tree (handle both Names and Kids nodes)
- 3 Look for an entry named exactly "vitaeflow.json" (case-sensitive)
Full extraction
- 1 Locate the FileSpec for vitaeflow.json in the EmbeddedFiles name tree
- 2 Read the stream from FileSpec → EF → F
- 3 Decompress the stream (handle FlateDecode and other PDF filters)
- 4 Decode bytes as UTF-8 and parse as JSON
- 5 Validate the parsed data in tolerant mode
Error handling
If no vitaeflow.json attachment is found, return null. The PDF is not a VitaeFlow document.
6. Validation
VitaeFlow defines two validation modes to balance strictness during writing with flexibility during reading.
Strict mode
Used when writing (embedding). Rejects unknown fields at any level via
additionalProperties: false. Ensures the data conforms exactly to the schema.
An implementation must validate in strict mode before embedding.
Tolerant mode
Used when reading (extracting). Strips all additionalProperties: false
constraints, allowing unknown fields to pass through. Enables forward compatibility.
A v0.1 reader can read a v0.2 document without failing on new fields.
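Deriving the tolerant schema from the strict one is a recursive walk that drops every additionalProperties: false entry. The sketch below works on the schema as a parsed Python dict; to_tolerant is a hypothetical name. It is deliberately naive: it does not distinguish the schema keyword from a data property that happens to be called additionalProperties inside properties, const, enum, or default values.

```python
from typing import Any


def to_tolerant(schema: Any) -> Any:
    """Return a copy of `schema` with all additionalProperties: false removed."""
    if isinstance(schema, dict):
        return {
            key: to_tolerant(value)
            for key, value in schema.items()
            # Only the literal boolean false is stripped; an
            # additionalProperties sub-schema is kept and walked.
            if not (key == "additionalProperties" and value is False)
        }
    if isinstance(schema, list):
        return [to_tolerant(item) for item in schema]
    return schema
```

Because the function builds new containers instead of mutating its input, the same loaded schema can be compiled once as-is for strict mode and once through to_tolerant for tolerant mode.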
Error format
Validation errors use JSON Pointer paths (RFC 6901) and human-readable messages:
{
  "valid": false,
  "errors": [
    {
      "path": "/basics",
      "message": "Missing required property: email"
    },
    {
      "path": "/work/0",
      "message": "Missing required property: startDate"
    }
  ],
  "warnings": []
}
7. Versioning
The schema follows an additive-only evolution model:
- New optional fields and enum values may be added in minor versions
- Fields are never removed or renamed
- A reader supporting v1.x can read any v1.0+ document
When a reader encounters a document with a newer schema version than it supports, it should emit a warning but still attempt to extract and return the data. This is the purpose of tolerant mode.
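A minimal sketch of that check follows. It compares versions as pairs of integers, so that "0.10" is treated as newer than "0.9", which a plain string comparison would get wrong. The function names are hypothetical; the warning text is the one given in this section.

```python
from typing import Optional

SUPPORTED_VERSION = "0.1"  # matches SCHEMA_VERSION in the reference constants


def parse_version(version: str) -> tuple[int, int]:
    """Split "major.minor" into a pair of integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def version_warning(
    document_version: str, supported: str = SUPPORTED_VERSION
) -> Optional[str]:
    """Return a warning if the document is newer than the reader, else None."""
    if parse_version(document_version) > parse_version(supported):
        return (
            f"Resume uses schema version {document_version}, but this SDK "
            f"supports {supported}. Some fields may not be validated."
        )
    return None
```

The function returns the warning instead of raising, so extraction can continue and the message can be placed in the warnings list of the validation result.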
Compare version as major.minor integers, not as strings, so that "0.10" is newer than "0.9".
If the document's version is greater than the implementation's supported version, emit:
"Resume uses schema version X.Y, but this SDK supports A.B. Some fields may not be validated."
8. Implementation guide
To build a VitaeFlow implementation in a new language, your library needs to handle three concerns: PDF manipulation, schema validation, and XMP metadata.
Checklist
PDF operations
- ✓ Read and write PDF 1.7+ files
- ✓ Navigate EmbeddedFiles name trees (flat and hierarchical)
- ✓ Create FileSpec dictionaries with embedded streams
- ✓ Manage the catalog AF array
- ✓ Decompress streams (FlateDecode at minimum)
Schema validation
- ✓ Support JSON Schema 2020-12
- ✓ Compile in strict mode (additionalProperties: false enforced)
- ✓ Compile in tolerant mode (additionalProperties: false removed)
- ✓ Validate format keywords: email, uri
- ✓ Return all errors, not just the first
XMP metadata
- ✓ Create RDF/XML with the VitaeFlow namespace
- ✓ Merge into existing XMP if present
- ✓ Handle XML entity escaping (&, <, >)
Expected behavior
| Function | Behavior |
|---|---|
| embed | Validate strict → remove existing → attach JSON → write XMP → return PDF bytes |
| extract | Find attachment → decompress → parse JSON → validate tolerant → return data + validation |
| detect | Check EmbeddedFiles name tree for "vitaeflow.json" → return boolean |
Reference constants
| Constant | Value |
|---|---|
| FILENAME | vitaeflow.json |
| MIME_TYPE | application/json |
| AF_RELATIONSHIP | /Alternative |
| XMP_NAMESPACE | urn:vitaeflow:pdfa:resume:1p0# |
| XMP_PREFIX | vf |
| SCHEMA_VERSION | 0.1 |