Hi everyone,
I’m using the list_chunks method from the Discovery Engine API (Gemini Enterprise Plus), but it doesn’t return the page number (page_identifier) for each chunk.
I need to know which page each chunk belongs to because I’m summarizing a large number of PDFs through the API.
I’m using the API instead of the Gemini Enterprise chat interface because the chat often fails to find all the content across the PDFs. I need to summarize using the complete text of every document, not just the most relevant parts.
Is there any way to retrieve the page number when listing chunks, or a configuration that enables this?
Thanks in advance!
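from google.cloud import discoveryengine_v1alpha as de_v1alpha


def list_all_chunks(document_name):
    # document_name is the full resource name of the parent document.
    chunk_client = de_v1alpha.ChunkServiceClient()
    all_chunks = []
    request = de_v1alpha.ListChunksRequest(parent=document_name, page_size=1000)
    # The returned pager fetches further result pages as it is iterated.
    page_result = chunk_client.list_chunks(request=request)
    for chunk in page_result:
        all_chunks.append(chunk)
    return all_chunks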
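For context, this is roughly what I would like to do with the chunks once page information is available. It is only a sketch: the `page_span` and `page_start` attribute names are my assumption about where the page number would live, `group_chunks_by_page` is a hypothetical helper, and the sample objects are stand-ins so the snippet runs without calling the API.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_chunks_by_page(chunks):
    """Group chunk contents by starting page; chunks without page info go under None."""
    pages = defaultdict(list)
    for chunk in chunks:
        # ASSUMPTION: `page_span` / `page_start` are guessed field names,
        # not something I have seen populated in the list_chunks response.
        span = getattr(chunk, "page_span", None)
        # A missing span or a page value of 0 is treated as "no page info".
        page = (getattr(span, "page_start", None) or None) if span is not None else None
        pages[page].append(chunk.content)
    return dict(pages)


# Stand-in objects so the sketch runs without the Discovery Engine client.
sample = [
    SimpleNamespace(content="intro", page_span=SimpleNamespace(page_start=1, page_end=1)),
    SimpleNamespace(content="details", page_span=SimpleNamespace(page_start=2, page_end=3)),
    SimpleNamespace(content="orphan", page_span=None),
]
grouped = group_chunks_by_page(sample)
```

With real results, the list returned by the listing code above would be passed in place of `sample`, and anything that ends up under `None` is a chunk for which no page number came back.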