API Vertex AI RAG Engine: Upload file in corpus failed

Hi Team,

I’m playing with AI Platform and the RAG engine. Amazing feature, works well, except for the specific Upload File in corpus API, whether through the API directly or through a Java implementation.

I’ve created my RAG corpus. I’ve imported (import, not upload) files through GCS resources and checked my corpus with the RAG file list. All looks good. :blush:

BUT…

Uploading a RAG file fails. I get an HTTP 200 with an empty response, and the file is not imported. (The file list in the corpus stays empty :confused: )

My POST

curl -X POST \
-H "X-Goog-Upload-Protocol: multipart" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-F metadata="{'rag_file': {'display_name':' test-api', 'description':'test-des-api'}}" \
-F file=@./test.txt \
"https://europe-west3-aiplatform.googleapis.com/v1/projects/MYPROJECT/locations/europe-west3/ragCorpora/CORPUSID/ragFiles:upload"

I get the HTTP 200 with body

{
  "ragFile": {}
}
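Since the call returns HTTP 200 either way, the status code alone does not show whether the file was stored. A minimal sketch of a check on the response body, matching only the empty shape reported above; `upload_looks_empty` is a hypothetical helper name, not part of any Google tooling:

```shell
# Returns 0 (true) when the response body is exactly an empty ragFile
# object, i.e. the shape reported above: { "ragFile": {} }
upload_looks_empty() {
  # Strip all whitespace so indentation and newlines do not matter.
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  [ "$compact" = '{"ragFile":{}}' ]
}

# Usage sketch: capture the curl output in a variable, then test it.
# response=$(curl -s -X POST ... "$UPLOAD_URL")
# if upload_looks_empty "$response"; then echo "upload was not stored"; fi
```

Any other body (an error object, or a `ragFile` with fields in it) makes the function return non-zero, so it only flags the specific symptom described here.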

I get the same result through the Java implementation.

Another issue: the URL given in the documentation is

https://LOCATION-aiplatform.googleapis.com/upload/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload

and it seems wrong. I get a 404 with that link.

The right link seems to be

https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload
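To compare the two forms side by side without retyping them, the URL can be assembled from its parts. This is only a sketch: `rag_upload_url` is a hypothetical helper, and which of the two paths the service actually expects is not settled in this thread.

```shell
# rag_upload_url LOCATION PROJECT_ID RAG_CORPUS_ID [PREFIX]
# PREFIX is empty for the plain /v1 path, or "/upload" for the
# /upload/v1 path shown in the documentation.
rag_upload_url() {
  printf 'https://%s-aiplatform.googleapis.com%s/v1/projects/%s/locations/%s/ragCorpora/%s/ragFiles:upload\n' \
    "$1" "${4:-}" "$2" "$1" "$3"
}

# Usage sketch with the placeholders from the post:
# rag_upload_url LOCATION PROJECT_ID RAG_CORPUS_ID           # plain /v1 path
# rag_upload_url LOCATION PROJECT_ID RAG_CORPUS_ID /upload   # documented /upload/v1 path
```

Building both variants from the same three values rules out a typo in the project, location, or corpus ID as the reason one of them behaves differently.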

I would appreciate your help in understanding what’s wrong with this file upload into my RAG corpus.

Thanks !!!

Hi @eric734574,

Welcome to the Google Cloud Community!

You’re facing an issue where your files fail to upload into the Vertex AI RAG Engine corpus via the ‘ragFiles:upload’ API. Although the API returns a successful HTTP 200 response, your uploaded file does not appear in the corpus as expected.

Here are a few things to check that might help with your use case:

  • Content Type: Make sure your ‘Content-Type’ header is set right when you’re sending a multipart request. The ‘-F’ flag in curl usually handles it for you, but it’s good to double-check since it can sometimes act up.
  • Double-Check Region: Ensure europe-west3 is the exact region where you created your RAG Corpus. Mismatched regions in your setup are a common source of 404 errors.
  • Permissions and IAM: Ensure your account linked to ‘gcloud auth print-access-token’ has the necessary IAM permissions, including your ‘roles/aiplatform.ragCorpusUser’ role, to write to your RAG Corpus and use the Vertex AI API.
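The region check in the second bullet can be done mechanically, because the location appears twice in the request URL: once in the host name and once in the path. A minimal sketch, assuming the `LOCATION-aiplatform.googleapis.com` host pattern used in this thread; `url_regions_match` is a hypothetical helper, and it only compares the two values in the URL, it does not confirm where the corpus was actually created.

```shell
# Returns 0 (true) when the region in the host name equals the region
# in the /locations/ path segment of the given URL.
url_regions_match() {
  url=$1
  # Host region: text between "https://" and "-aiplatform.googleapis.com".
  host_region=${url#https://}
  host_region=${host_region%%-aiplatform.googleapis.com*}
  # Path region: the segment right after "/locations/".
  path_region=${url#*/locations/}
  path_region=${path_region%%/*}
  [ -n "$host_region" ] && [ "$host_region" = "$path_region" ]
}

# Usage sketch:
# url_regions_match "$UPLOAD_URL" || echo "host and path regions differ"
```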

You may refer to the documentation below, which details the design and functionality of the RAG API:

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Thanks a lot for your help Marvin !

I double-checked my command and followed your advice.

  • Content-Type: It does not seem compatible with the API. I got an error when I set it, and the doc does not list it as required. Only this header is required:
-H "X-Goog-Upload-Protocol: multipart"
  • I confirm my region is europe-west3 and my path url is
https://europe-west3-aiplatform.googleapis.com/v1/projects/PROJECTID/locations/europe-west3/ragCorpora/RAGID/ragFiles:upload
  • I’ve got all the privileges on my project, as project owner

I’m continuing to investigate but I’m still stuck. Maybe the API is only available in US regions?

Thanks again for your help

I’m also stuck on my Java implementation. Any Java sample for upload would be amazing. All other parts of my project are working fine: RAG corpus, import, list. Everything except upload :thinking:

I’m having what appears to be the same problem. I create a standalone RAG corpus (with no backing datastore or engine) successfully. Then I cannot upload a file to that corpus via REST. Here is the actual curl command entered at a command line, including the response I get back:

curl -X POST -H "X-Goog-Upload-Protocol: multipart" -H "Authorization: Bearer xxxxxxxxxxxxxxx" -F metadata="{'rag_file': {'display_name': 'test3.md', 'description': 'this is test 3' }}" -F file=@/Users/kduffie/test3.md "https://us-central1-aiplatform.googleapis.com/v1/projects/398188094870/locations/us-central1/ragCorpora/3458764513820540928/ragFiles:upload"
{
  "ragFile": {}
}

I’ve checked and rechecked all aspects of this. This appears to be completely consistent with the documentation here: RAG Engine API  |  Generative AI on Vertex AI  |  Google Cloud

Can anyone help?