This automation relies on two distinct processing branches depending on the file type (image or PDF), using AI actions adapted to each case.
→ In the image branch, only one step is required: use the Ask an AI action with either a public cloud AI or a private & secure AI source, and select a model that can read files using OCR (Optical Character Recognition). The AI analyzes the image (JPEG/PNG) directly and generates structured JSON, which is then mapped to the destination fields.
→ In the PDF branch, the process involves two steps. First, the Process a document with Mistral AI OCR action extracts the content of the PDF and converts it into structured text (usable JSON). Then, an Ask an AI action (public cloud or private & secure) interprets this JSON and maps it to the table fields.
In summary:
PDF flow → 2 steps
PDF → OCR action → text field → AI → JSON → mapping to destination fields
Image flow → 1 step
JPEG/PNG image → direct AI analysis → structured JSON → mapping to destination fields
Before configuring your prompts, review these rules: failing to follow them will prevent automatic mapping to your destination fields.
Simple, consistent keys
No accents, no spaces, no special characters.
Recommended format: camelCase or snake_case.
Exact correspondence between the keys in the prompt and the keys in the destination field mapping
Key names must be strictly identical between the keys defined in the Prompt field and the keys entered in the JSON Properties mapping configuration.
Any discrepancy will prevent automatic data assignment.
Empty value when in doubt
Explicitly add to your prompt:
"If in doubt, leave the value empty."
This rule reduces interpretation errors and improves mapping stability.
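The first two rules can be checked before a scenario is saved. The snippet below is a minimal sketch, not part of the platform: the function name check_keys and the regular expression are illustrative assumptions that encode "no accents, no spaces, no special characters" and "strictly identical keys".

```python
import re

# Assumption: a valid key starts with a lowercase ASCII letter and then uses
# only ASCII letters, digits, or underscores (covers snake_case and camelCase).
KEY_PATTERN = re.compile(r"[a-z][a-zA-Z0-9_]*")


def check_keys(prompt_keys, mapping_keys):
    """Return a list of problems; an empty list means the keys look consistent."""
    problems = []
    for key in prompt_keys:
        if not KEY_PATTERN.fullmatch(key):
            problems.append(f"invalid key name: {key!r}")
    # Keys must be strictly identical on both sides.
    for key in sorted(set(prompt_keys) - set(mapping_keys)):
        problems.append(f"in prompt but not in mapping: {key!r}")
    for key in sorted(set(mapping_keys) - set(prompt_keys)):
        problems.append(f"in mapping but not in prompt: {key!r}")
    return problems


# Example: a key written two different ways is reported on both sides.
print(check_keys(["date_time"], ["dateTime"]))
```

A key such as "Total TTC" is rejected because of the space and the leading capital, while total_incl_tax and totalInclTax both pass.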
Step 1 → PDF: extract text via OCR
Use the "Process a document with Mistral AI OCR" action to convert the PDF content into usable JSON text.
1. API key
Enter your Mistral API key to activate the OCR action.
2. Source file
Select the PDF stored in an attachment field in your notebook.
3. Destination field
Store the result in a text field. This JSON text will be used as input for the next step.
💡 Result: the PDF content is converted to JSON text and stored in your field, ready to be processed by the AI action in the next step.
Step 2 → PDF: structure OCR text with AI
Once the OCR text is available in a field, use the Ask an AI action to extract key information and write it to your destination fields.
1. AI source
Select the source suited to your context:
Private & Secure AI · Public Cloud AI · Dedicated Server
2. Model
Choose a model capable of analyzing long text and generating reliable structured JSON, for example GPT-OSS 20B.
3. System prompt
Frames the extraction, enforces a structured format, and avoids narrative responses.
You are an expert OCR system capable of extracting information from expense receipts.
4. Prompt
The prompt must include: context, the OCR field variable, the task, the expected JSON keys, and output rules.
Input:
The text below is the OCR content extracted from an expense receipt.
OCR text:
$Extract_JSON_text_Pdf
Task:
Analyze only this text and extract reliable expense information.
Output:
Return ONLY ONE JSON object on a single line with EXACTLY the following keys:
description, details, total_incl_tax, subtotal_excl_tax, date_time, category
Key rules:
- description: name the expense in two words
- details: summarize the context in five words
- total_incl_tax: total amount paid in euros (example: 45.50)
- subtotal_excl_tax: amount before VAT, always lower than the total
- date_time: billing date and time formatted as DD-MM-YYYY HH:mm
- category: select ONE category from ⛽ Fuel, 🅿️ Parking, 🚧 Toll, 🍽️ Lunch, 🍷 Dinner, 🏨 Hotel, ☕ Coffee / Drink, 🚕 Taxi, 🚆 Train, ✈️ Flight, 🚇 Metro / Bus / RER, 🎟️ Invitation, 🎁 Gift, 🚗 Mileage, ✏️ Supplies, 🎓 Training, 🏡💻 Remote work allowance, Other
General rules:
1. Analyze only the visible content of the receipt.
2. Extract only information explicitly shown on the receipt.
3. Do not infer, estimate, or calculate missing values.
4. If in doubt, leave the value empty ("").
5. Perform a second verification pass before producing the answer.
6. Output must be valid JSON returned on a single line.
7. Do not add comments, explanations, or additional keys.
Prompt structure: context → OCR variable ($text_field) → task → JSON keys with rules → general output rules (valid JSON, single line, no comments).
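To make the output rules concrete, here is a minimal sketch of what a conforming answer looks like and how its structure could be checked outside the platform. The sample values are invented for illustration, and parse_ai_response is a hypothetical helper, not a platform function. Whether amounts come back as numbers or as strings depends on the model, so the sketch does not check value types.

```python
import json

# The six keys requested in the prompt above.
EXPECTED_KEYS = {
    "description", "details", "total_incl_tax",
    "subtotal_excl_tax", "date_time", "category",
}


def parse_ai_response(raw):
    """Parse a model answer and enforce: one line, one JSON object, exact keys."""
    text = raw.strip()
    if "\n" in text:
        raise ValueError("response spans several lines")
    data = json.loads(text)  # raises ValueError if the text is not valid JSON
    if not isinstance(data, dict):
        raise ValueError("response is not a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"unexpected keys: {sorted(set(data) ^ EXPECTED_KEYS)}")
    return data


# Illustrative answer (made-up values); the subtotal is left empty ("in doubt").
sample = (
    '{"description": "Team lunch", "details": "Lunch with client in Paris", '
    '"total_incl_tax": 45.50, "subtotal_excl_tax": "", '
    '"date_time": "12-03-2025 12:45", "category": "Other"}'
)
print(parse_ai_response(sample)["description"])
```

A narrative answer such as "Here is the JSON: {...}" fails at json.loads, which is exactly the behavior the "no comments, no explanations" rule is meant to prevent.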
5. Response format
Select JSON with destination fields.
Each JSON property will be automatically injected into the corresponding field.
6. JSON property mapping
Map each JSON key to the corresponding destination field:
JSON property → Destination field
description → Description
details → Details
date_time → Date & Time
total_incl_tax → Total (incl. tax)
subtotal_excl_tax → Subtotal (excl. tax)
category → Category
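The mapping above behaves like a lookup table from JSON keys to field names. The sketch below illustrates the idea only; FIELD_MAPPING mirrors the table, and map_to_fields is a hypothetical helper showing why a key that does not match the mapping exactly is never assigned.

```python
# JSON property -> destination field, copied from the mapping table above.
FIELD_MAPPING = {
    "description": "Description",
    "details": "Details",
    "date_time": "Date & Time",
    "total_incl_tax": "Total (incl. tax)",
    "subtotal_excl_tax": "Subtotal (excl. tax)",
    "category": "Category",
}


def map_to_fields(data):
    """Split a parsed JSON object into mapped field values and unmapped keys."""
    row = {}
    unmapped = []
    for key, value in data.items():
        if key in FIELD_MAPPING:
            row[FIELD_MAPPING[key]] = value
        else:
            # A key that differs in any way (case, accent, space) lands here.
            unmapped.append(key)
    return row, unmapped


row, unmapped = map_to_fields({"description": "Team lunch", "Total": 45.5})
print(row)       # only the key that matches the mapping is assigned
print(unmapped)  # "Total" does not match "total_incl_tax"
```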
Step → Image (.png / .jpeg): direct AI analysis
For image files, no intermediate OCR step is required. The AI analyzes the attachment directly and generates a structured JSON output.
1. AI source
Select the source suited to your context:
Private & Secure AI · Public Cloud AI · Dedicated Server
2. Model
Choose a model that supports image analysis (built-in vision / OCR) with the attachment field enabled in the configuration, for example Mistral Small 3.2 or Qwen 2.5 VL 72B.
3. System prompt
You are an expert OCR system capable of extracting information from expense receipts.
4. Prompt
Input:
The attached file is an image of an expense receipt.
Task:
Analyze the image and extract ONLY the reliable expense information visible on the receipt.
Output:
Return ONLY ONE JSON object on a single line with EXACTLY the following keys:
description, details, total_incl_tax, subtotal_excl_tax, date_time, category
Key rules:
- description: name the expense in two words
- details: summarize the context in five words
- total_incl_tax: total amount paid in euros (example: 45.50)
- subtotal_excl_tax: amount before VAT, always lower than the total
- date_time: billing date and time formatted as DD-MM-YYYY HH:mm
- category: select ONE category from ⛽ Fuel, 🅿️ Parking, 🚧 Toll, 🍽️ Lunch, 🍷 Dinner, 🏨 Hotel, ☕ Coffee / Drink, 🚕 Taxi, 🚆 Train, ✈️ Flight, 🚇 Metro / Bus / RER, 🎟️ Invitation, 🎁 Gift, 🚗 Mileage, ✏️ Supplies, 🎓 Training, 🏡💻 Remote work allowance, Other
General rules:
1. Analyze only the visible content of the receipt.
2. Extract only information explicitly shown on the receipt.
3. Do not infer, estimate, or calculate missing values.
4. If in doubt, leave the value empty ("").
5. Perform a second verification pass before producing the answer.
6. Output must be valid JSON returned on a single line.
7. Do not add comments, explanations, or additional keys.
Prompt structure: context (image file) → task → JSON keys with rules → general output rules.
No OCR variable here: the model analyzes the attachment directly.
5. Attachment
Select the attachment field containing the image to analyze.
This field gives the model direct access to the file without any intermediate OCR step.
6. Response format
Select JSON with destination fields.
7. JSON property mapping
Same structure as the PDF flow:
JSON property → Destination field
description → Description
details → Details
date_time → Date & Time
total_incl_tax → Total (incl. tax)
subtotal_excl_tax → Subtotal (excl. tax)
category → Category
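Whichever branch produced the JSON, the per-key rules from the prompt (date format, category list, subtotal versus total) can be verified after the fact. The following is a minimal sketch under stated assumptions: check_values is a hypothetical helper, empty strings are accepted because the prompt asks the model to leave doubtful values empty, and the subtotal rule is relaxed to flag only a subtotal greater than the total.

```python
from datetime import datetime

# Category labels copied from the prompt.
CATEGORIES = [
    "⛽ Fuel", "🅿️ Parking", "🚧 Toll", "🍽️ Lunch", "🍷 Dinner", "🏨 Hotel",
    "☕ Coffee / Drink", "🚕 Taxi", "🚆 Train", "✈️ Flight",
    "🚇 Metro / Bus / RER", "🎟️ Invitation", "🎁 Gift", "🚗 Mileage",
    "✏️ Supplies", "🎓 Training", "🏡💻 Remote work allowance", "Other",
]


def check_values(data):
    """Return a list of rule violations; empty values are always accepted."""
    problems = []
    date_time = data.get("date_time", "")
    if date_time != "":
        try:
            # DD-MM-YYYY HH:mm, as requested in the prompt.
            datetime.strptime(date_time, "%d-%m-%Y %H:%M")
        except (ValueError, TypeError):
            problems.append(f"date_time not in DD-MM-YYYY HH:mm: {date_time!r}")
    category = data.get("category", "")
    if category != "" and category not in CATEGORIES:
        problems.append(f"unknown category: {category!r}")
    total = data.get("total_incl_tax", "")
    subtotal = data.get("subtotal_excl_tax", "")
    if total != "" and subtotal != "":
        try:
            if float(subtotal) > float(total):
                problems.append("subtotal_excl_tax is greater than total_incl_tax")
        except (ValueError, TypeError):
            problems.append("amounts are not numeric")
    return problems


print(check_values({"date_time": "2025-03-12", "category": "Other"}))
```

Records that return a non-empty list are good candidates for manual review rather than automatic correction, in line with the rule not to infer or calculate missing values.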
Methodology: one scenario with conditional PDF and image branches
For a robust scenario, use a condition that automatically routes processing based on the type of file received.
Trigger
An attachment field is added or updated (PDF or image).
Condition
If the filename contains .pdf → PDF branch.
Otherwise → image branch.
PDF branch
OCR action (PDF → JSON text) → Ask an AI action (text → JSON) → mapping to destination fields.
Image branch
Ask an AI action with direct attachment (Mistral Small 3.2 or Qwen 2.5 VL 72B) → JSON → mapping to destination fields.
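The routing condition can be summarized in a few lines. This is a sketch of the logic only, not of the scenario editor: choose_branch and BRANCH_STEPS are illustrative names, and the lowercase comparison is an assumption added so that a file named RECEIPT.PDF also reaches the PDF branch.

```python
# Steps executed in each branch, as described above.
BRANCH_STEPS = {
    "pdf": ["OCR action", "Ask an AI action (text)", "mapping"],
    "image": ["Ask an AI action (attachment)", "mapping"],
}


def choose_branch(filename):
    """Return 'pdf' if the filename contains '.pdf', otherwise 'image'."""
    # Assumption: compare in lowercase so '.PDF' is treated like '.pdf'.
    return "pdf" if ".pdf" in filename.lower() else "image"


for name in ["receipt_march.pdf", "ticket.jpeg", "SCAN.PDF"]:
    branch = choose_branch(name)
    print(name, "->", branch, "->", " -> ".join(BRANCH_STEPS[branch]))
```

Because the image branch is the fallback, any attachment that is neither a PDF nor an image would also be sent to it. If other file types can appear in the attachment field, an explicit check on the image extensions may be worth adding.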