Document AI missing some line-items in similar documents

Hello everyone,

I’m using Document AI to retrieve some information from PDF documents. I’ve created a custom extractor processor and tried the available stable versions, from 1.3 to 1.5 PRO. I’m using the untrained, foundation-model-based version because the structure of the PDFs can vary a lot.

Anyway, the issue is this: I’m using pretrained-foundation-model-v1.5-pro-2025-06-20 and have uploaded two very similar PDFs:

  • PDF A: all the fields for the positions (line items) are extracted (the field “Tipo” is not in the schema, so that’s OK)

  • PDF B: the fields are extracted for only one position (line item)

The issue persists even if I change the processor version; at most, the problem is inverted (PDF A fails while PDF B is fine). I can add that for most documents similar to these, the extraction is fine across multiple versions.

How can I solve this?

Thanks

Got it. It looks like your message was cut off. You mentioned:

  • You’re using Document AI with a custom extractor processor

  • You’re using the pretrained foundation model v1.5 PRO

  • You uploaded two similar PDFs

  • In PDF A, all fields are extracted correctly (even though one field isn’t in the schema, which is expected)

But it seems like you were about to describe an issue with PDF B (the second PDF), and maybe explain the difference or error you’re encountering.

Hello Robert,

The issue is that in PDF B, as you can see from the second image, not all the positions were extracted: only the third one (plus one field from the first position).

Thanks
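
To pin down exactly which positions and fields are missing in a document such as PDF B, it can help to inspect the processor output programmatically rather than by eye. Below is a minimal sketch, not a fix. It assumes the Document JSON returned by the processor has been saved as a plain dict, with parent entities under "entities" and child fields under "properties" (using the "type" and "mentionText" keys). The parent type name "position", the child field names, and the sample data are all hypothetical and should be replaced with the names from your own schema.

```python
def line_items(document: dict, parent_type: str) -> list[dict]:
    """Return one {field_name: text} dict per extracted parent entity."""
    items = []
    for entity in document.get("entities", []):
        if entity.get("type") != parent_type:
            continue
        fields = {}
        for prop in entity.get("properties", []):
            # Assumption: child types may be prefixed like "parent/child";
            # keep only the last path segment.
            name = prop.get("type", "").split("/")[-1]
            fields[name] = prop.get("mentionText", "")
        items.append(fields)
    return items


def missing_fields(items: list[dict], expected: set[str]) -> list[list[str]]:
    """For each line item, list the expected fields that are absent or empty."""
    return [
        sorted(expected - {name for name, text in item.items() if text})
        for item in items
    ]


# Hypothetical stand-in for the output of a partially extracted document.
pdf_b = {
    "entities": [
        {"type": "position", "properties": [
            {"type": "description", "mentionText": "Widget"},
        ]},
        {"type": "position", "properties": [
            {"type": "description", "mentionText": "Bolt"},
            {"type": "quantity", "mentionText": "4"},
            {"type": "amount", "mentionText": "12.00"},
        ]},
    ]
}

expected = {"description", "quantity", "amount"}  # hypothetical schema fields
items = line_items(pdf_b, "position")
report = missing_fields(items, expected)

print(f"extracted positions: {len(items)}")
for index, missing in enumerate(report, start=1):
    print(f"position {index}: missing {missing if missing else 'nothing'}")
```

Positions that were not extracted at all do not appear in the report, so the count of extracted positions still has to be compared against the number of rows visible in the PDF. Running the same check on the output of each processor version makes it easier to see whether the gaps move between documents or between versions.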

I have a custom extractor trained on a handful of statements and it has the same problem. In some documents, most or all of the table fields are recognized, and in other documents it doesn’t recognize a lot of fields, despite all the documents having the exact same layout (the tables just have differing numbers of items). I’m about to re-create the processor and see if training it again helps.