Hi everyone,
I’m building a Custom Extractor on Google Document AI to support an iOS app that fills empty PDF forms. I’ve hit a persistent training error and would love some guidance from the DocAI team or anyone who has modeled similar schemas.
Goal
I want the Custom Extractor to:
- Detect each form field on an empty PDF.
- Return:
- the field’s label (“Full Name”, “Date of Birth”, etc.),
- the field type (text / checkbox / signature / date…),
- the value bounding box (where the user writes/signs/checks),
- the label bounding box.
The iOS client already knows how to overlay and fill fields if it gets these bounding boxes, so the Custom Extractor is effectively a layout detector for fields.
Schema I’m using
Custom Extractor schema (Workbench):
- Parent entity: FormField
  - Method: Extract
  - Occurrence: Optional multiple
  - Description: a single empty or filled input field on a PDF form, including its value region and an optional nearby label.
- Child entities (properties) of FormField:
  - label
    - Data type: Plain text
    - Method: Extract
    - Occurrence: Required once
    - Description: the visible label text near the field.
  - fieldType
    - Data type: Plain text
    - Method: Derive
    - Occurrence: Required once
    - Description: an enum-like value such as text, number, checkbox, or signature.
I previously had:
- labelBoundingBox
- valueBoundingBox
as additional child properties (with only geometry in pageAnchor), but those caused dataset validation issues (normalized_vertices missing), so I removed them and moved the geometry to pageAnchor only.
Labeled Document structure
I generate labeled documents programmatically (to import as “pre-labeled documents”) based on my app’s FormTemplate model (labelRect/valueRect).
Each real field becomes exactly one FormField entity:
{
  "id": "current_name_last",
  "type": "FormField",
  "mentionText": "text Current Name-Last",
  "pageAnchor": {
    "pageRefs": [
      {
        "page": 0,
        "layoutType": "VISUAL_ELEMENT",
        "boundingPoly": {
          "normalizedVertices": [
            { "x": 0.42, "y": 0.19 },
            { "x": 0.60, "y": 0.19 },
            { "x": 0.60, "y": 0.23 },
            { "x": 0.42, "y": 0.23 }
          ]
        }
      }
    ]
  },
  "confidence": 1,
  "properties": [
    {
      "id": "current_name_last_label",
      "type": "label",
      "mentionText": "Current Name-Last",
      "pageAnchor": {
        "pageRefs": [
          {
            "page": 0,
            "layoutType": "VISUAL_ELEMENT",
            "boundingPoly": {
              "normalizedVertices": [
                { "x": 0.10, "y": 0.19 },
                { "x": 0.30, "y": 0.19 },
                { "x": 0.30, "y": 0.23 },
                { "x": 0.10, "y": 0.23 }
              ]
            }
          }
        ]
      },
      "textAnchor": {
        "textSegments": [
          { "startIndex": "169", "endIndex": "187" }
        ]
      },
      "confidence": 1
    },
    {
      "id": "current_name_last_fieldType",
      "type": "fieldType",
      "mentionText": "text",
      "pageAnchor": {
        "pageRefs": [
          {
            "page": 0,
            "layoutType": "VISUAL_ELEMENT",
            "boundingPoly": {
              "normalizedVertices": [
                { "x": 0.42, "y": 0.19 },
                { "x": 0.60, "y": 0.19 },
                { "x": 0.60, "y": 0.23 },
                { "x": 0.42, "y": 0.23 }
              ]
            }
          }
        ]
      },
      "textAnchor": {},
      "confidence": 1
    }
  ]
}
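For context, the generator that emits this structure is roughly the following sketch (make_poly, page_anchor, and build_form_field are simplified hypothetical helpers; the label/value rect tuples come from my FormTemplate model and are assumed to be already normalized to [0, 1]):

```python
def make_poly(rect):
    """rect = (x, y, width, height), already normalized to [0, 1]."""
    x, y, w, h = rect
    return {"normalizedVertices": [
        {"x": x,     "y": y},
        {"x": x + w, "y": y},
        {"x": x + w, "y": y + h},
        {"x": x,     "y": y + h},
    ]}

def page_anchor(rect, page=0):
    """Geometry-only pageAnchor, as in the JSON above."""
    return {"pageRefs": [{
        "page": page,
        "layoutType": "VISUAL_ELEMENT",
        "boundingPoly": make_poly(rect),
    }]}

def build_form_field(field_id, label_text, field_type, label_rect, value_rect):
    """Build one FormField entity with exactly two children: label and fieldType."""
    return {
        "id": field_id,
        "type": "FormField",
        "mentionText": f"{field_type} {label_text}",
        "pageAnchor": page_anchor(value_rect),  # parent anchor = value area
        "confidence": 1,
        "properties": [
            {
                "id": f"{field_id}_label",
                "type": "label",
                "mentionText": label_text,
                "pageAnchor": page_anchor(label_rect),
                "confidence": 1,
            },
            {
                "id": f"{field_id}_fieldType",
                "type": "fieldType",
                "mentionText": field_type,
                "pageAnchor": page_anchor(value_rect),  # reuses the value box
                "confidence": 1,
            },
        ],
    }
```

(The textAnchor for label is filled in separately when the label text is found in the OCR output.)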
Invariants:
- Exactly one FormField per visual field.
- Exactly two children: label and fieldType.
- FormField.pageAnchor = value area.
- label.pageAnchor = label area.
- fieldType.pageAnchor = currently reuses the value box, to satisfy geometry requirements.
- layoutType = “VISUAL_ELEMENT” everywhere.
- All normalizedVertices are in [0, 1] and form a polygon with 4 vertices.
- If label text appears in OCR, label.textAnchor.textSegments points correctly into document.text.
I validate the JSON in my Python script so the only entity types emitted are:
- FormField
- label
- fieldType
No more labelBoundingBox / valueBoundingBox types.
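The check is roughly this (a simplified sketch of my own script, not an official DocAI validation; ALLOWED_TYPES and the error strings are mine):

```python
ALLOWED_TYPES = {"FormField", "label", "fieldType"}

def validate_entity(entity, errors, path="entity"):
    """Recursively check one entity and its properties against my invariants."""
    if entity.get("type") not in ALLOWED_TYPES:
        errors.append(f"{path}: unexpected type {entity.get('type')!r}")
    refs = entity.get("pageAnchor", {}).get("pageRefs", [])
    if not refs:
        errors.append(f"{path}: missing pageAnchor.pageRefs")
    for ref in refs:
        verts = ref.get("boundingPoly", {}).get("normalizedVertices", [])
        if len(verts) != 4:
            errors.append(f"{path}: expected 4 normalizedVertices, got {len(verts)}")
        for v in verts:
            if not (0.0 <= v.get("x", -1) <= 1.0 and 0.0 <= v.get("y", -1) <= 1.0):
                errors.append(f"{path}: vertex out of [0, 1]: {v}")
    for child in entity.get("properties", []):
        validate_entity(child, errors, f"{path}/{child.get('type')}")

def validate_document(doc):
    """Return a list of invariant violations for one labeled Document dict."""
    errors = []
    for ent in doc.get("entities", []):
        validate_entity(ent, errors)
    return errors
```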
Dataset and environment
- Project: (PII Removed by Staff)
- Location: us
- Processor: projects/(PII Removed by Staff)/locations/us/processors/9fd390ef14183b57
- ~20 labeled docs in the dataset:
- 10 empty templates
- 10 filled versions of those templates
- Split: training 16, test 4
- Label counts in Workbench dataset UI:
- FormField: 1140
- label: 1140
- fieldType: 1140
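Those counts line up with a simple tally over the generated JSON (sketch; the glob pattern and directory layout are specific to my setup):

```python
import json
from collections import Counter
from pathlib import Path

def count_entity_types(json_dir):
    """Tally entity types, including nested properties, across labeled Document JSON files."""
    counts = Counter()
    stack = []
    for path in Path(json_dir).glob("*.json"):
        doc = json.loads(path.read_text())
        stack.extend(doc.get("entities", []))
    while stack:
        ent = stack.pop()
        counts[ent.get("type")] += 1
        stack.extend(ent.get("properties", []))
    return counts
```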
I successfully trained a different Custom Extractor earlier in the same project and region using the same PDFs but a much simpler schema, so project-level permissions / VPC / metadata server shouldn’t be an issue (but you can never be sure).
Errors
When I first added fieldType as a child without geometry, I got a clear dataset validation error in the operation metadata, e.g.:
Invalid document. field_name: "entities.page_anchor.page_refs.bounding_poly.normalized_vertices"
annotation_name: "FormField/fieldType"
num_fields_needed: "3", num_fields: "0"
I fixed that by giving fieldType.pageAnchor a proper boundingPoly.normalizedVertices (reusing the value bbox). After that, dataset validation errors disappeared.
However, training still fails with error code 13 and this message in Cloud Audit logs:
"status": {
"code": 3,
"message": "Evaluation with ID `cde-harvester-pipeline_0_0` had no global metrics; Failed to compute ProcessorVersion metadata. pv_id = \"af53767fd2d339f9\""
}
"methodName": "google.cloud.documentai.uiv1beta3.DocumentProcessorService.TrainProcessorVersion"
The operation itself looks like:
{
"name": "projects/(PII Removed by Staff)/locations/us/operations/12633571671074354259",
"done": true,
"error": {
"code": 13,
"message": "Internal error encountered.",
"details": []
},
"metadata": {
"@type": "type.googleapis.com/google.cloud.documentai.uiv1beta3.TrainProcessorVersionMetadata",
"commonMetadata": {
"state": "FAILED",
"createTime": "2025-11-15T13:41:34.357306Z",
"updateTime": "2025-11-15T13:47:40.973766Z",
"resource": "projects/(PII Removed by Staff)/locations/us/processors/9fd390ef14183b57/processorVersions/af53767fd2d339f9"
},
"trainingDatasetValidation": {},
"testDatasetValidation": {}
}
}
So the training pipeline runs far enough to do evaluation, but:
“Evaluation … had no global metrics; Failed to compute ProcessorVersion metadata.”
This sounds like the evaluator couldn’t compute F1/precision/recall for any label type in the test split.
What I’ve already tried
- Regenerated all labeled JSON from scratch and re-imported into a fresh dataset.
- Validated via script that:
- Only FormField, label, fieldType entity types exist.
- Every entity and child has pageAnchor.pageRefs[0].boundingPoly.normalizedVertices with 4 points.
- Polygons are normalized and within [0, 1].
- Tried both Template-based and Model-based trainers.
- Tried training with:
- Only empty forms
- Empty + filled pairs
- Confirmed that another Custom Extractor with basic labels can train successfully in the same project/region.
At this point, the only consistent difference is the nested schema design (parent/child) and the way I model geometry vs. text.
Questions for the DocAI team / community
- What exactly does “Evaluation with ID … had no global metrics” mean for Custom Extractor?
- Does it imply that, for the evaluation split, there are zero valid ground-truth labels that the evaluator considers (e.g., because they’re all nested or derive-only)?
- Or can it also be triggered by internal infra issues?
- Is a parent-child schema (FormField → [label, fieldType]) supported and recommended for this kind of layout task?
- Are evaluation metrics computed only on leaf Extract text fields (like label)?
- Are there constraints around how parent entities with properties participate in training/evaluation?
- Is there a recommended pattern or sample for modeling “field + label + geometry” for empty form fields?
For example:
- Should I avoid nested properties entirely and just have a single flat FormField label with text+geometry?
- Is it acceptable to store extra semantics like fieldType in metadata instead of as a child entity?
- Is there a way to get more detailed evaluation logs for this processor version?
- For example, per-label-type counts of ground-truth vs predicted instances, or whether the evaluator is ignoring certain types.
If it’s helpful, I can provide:
- A minimal JSON example of one labeled document (with a few fields).
- The Python script I’m using to generate the labeled Document JSON from my form templates.
I’m trying to design a clean, reusable dataset format for form templates, but right now I feel like I’m fighting the evaluation pipeline without enough visibility into what it expects.

Thanks a lot!