List_chunks() doesn’t return page number — how to get page_identifier?

Fernando_Zanutto · October 20, 2025, 10:25pm

Hi everyone,

I’m using the list_chunks method from the Discovery Engine API (Gemini Enterprise Plus), but it doesn’t return the page number (page_identifier) for each chunk.

I need to know which page each chunk belongs to because I’m summarizing a large number of PDFs through the API.

I’m using the API instead of the Gemini Enterprise chat interface because the chat often fails to find all the content across the PDFs. I need to summarize using the complete text of every document, not just the most relevant parts.

Is there any way to retrieve the page number when listing chunks, or a configuration that enables this?

Thanks in advance!

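    from google.cloud import discoveryengine_v1alpha as de_v1alpha

    def list_all_chunks(document_name):
        # document_name is the full resource name of the parent document.
        chunk_client = de_v1alpha.ChunkServiceClient()
        all_chunks = []
        request = de_v1alpha.ListChunksRequest(parent=document_name, page_size=1000)
        # The returned pager fetches follow-up pages as it is iterated.
        page_result = chunk_client.list_chunks(request=request)
        for chunk in page_result:
            all_chunks.append(chunk)
        return all_chunks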
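
To make the goal concrete, here is a rough sketch of the grouping I would like to do once a page number is available on each chunk. It is only an illustration: the `page_span`, `page_start`, and `page_end` attribute names are assumptions about what such a field might look like, not something I have confirmed in the `list_chunks` response, and `group_chunks_by_page` is a hypothetical helper of my own. The demo uses stand-in objects rather than real API results.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_chunks_by_page(chunks):
    """Map page number -> list of chunk contents.

    Assumes each chunk has `content` and, optionally, a `page_span` with
    integer `page_start` / `page_end` (assumed field names).
    Chunks without usable page information are collected under None.
    """
    pages = defaultdict(list)
    for chunk in chunks:
        span = getattr(chunk, "page_span", None)
        start = getattr(span, "page_start", None)
        end = getattr(span, "page_end", None)
        if not start:
            # Missing span, or a start of 0/None: treat as "page unknown".
            pages[None].append(chunk.content)
            continue
        # A chunk may cross a page boundary, so file it under every page it touches.
        last = max(end or start, start)
        for page in range(start, last + 1):
            pages[page].append(chunk.content)
    return dict(pages)


# Stand-in chunk objects for illustration only (not real API responses).
demo_chunks = [
    SimpleNamespace(content="intro",
                    page_span=SimpleNamespace(page_start=1, page_end=1)),
    SimpleNamespace(content="spans two pages",
                    page_span=SimpleNamespace(page_start=1, page_end=2)),
    SimpleNamespace(content="no page info", page_span=None),
]
by_page = group_chunks_by_page(demo_chunks)
```

With a mapping like this I could summarize each PDF page by page, and the `None` bucket would show which chunks came back without any page information.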
