I have spent hours trying to upload data to a RAG Engine corpus in every way I can think of, and it fails no matter how simple I make the example.
The simplest reproduction is the official notebook intro_rag_engine.ipynb (gemini/rag-engine/intro_rag_engine.ipynb on the main branch of the GoogleCloudPlatform/generative-ai GitHub repo).
It works until the "Upload a local file to the corpus" step, at which point it fails whether I import from Cloud Storage, from Drive, or from a local file. A representative error:
_OperationNotComplete Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/google/api_core/retry/retry_unary.py in retry_target(target, predicate, sleep_generator, timeout, on_error, exception_factory, **kwargs)
146 try:
--> 147 result = target()
148 if inspect.isawaitable(result):
/usr/local/lib/python3.12/dist-packages/google/api_core/future/polling.py in _done_or_raise(self, retry)
119 if not self.done(retry=retry):
--> 120 raise _OperationNotComplete()
121
_OperationNotComplete:
The above exception was the direct cause of the following exception:
RetryError Traceback (most recent call last)
/usr/local/lib/python3.12/dist-packages/google/api_core/future/polling.py in _blocking_poll(self, timeout, retry, polling)
136 try:
--> 137 polling(self._done_or_raise)(retry=retry)
138 except exceptions.RetryError:
/usr/local/lib/python3.12/dist-packages/google/api_core/retry/retry_unary.py in retry_wrapped_func(*args, **kwargs)
293 )
--> 294 return retry_target(
295 target,
/usr/local/lib/python3.12/dist-packages/google/api_core/retry/retry_unary.py in retry_target(target, predicate, sleep_generator, timeout, on_error, exception_factory, **kwargs)
155 # defer to shared logic for handling errors
--> 156 next_sleep = _retry_error_helper(
157 exc,
/usr/local/lib/python3.12/dist-packages/google/api_core/retry/retry_base.py in _retry_error_helper(exc, deadline, sleep_iterator, error_list, predicate_fn, on_error_fn, exc_factory_fn, original_timeout)
228 )
--> 229 raise final_exc from source_exc
230 _LOGGER.debug(
RetryError: Timeout of 600.0s exceeded, last exception:
During handling of the above exception, another exception occurred:
TimeoutError Traceback (most recent call last)
/tmp/ipython-input-xxxxx.py in <cell line: 0>()
3 )
4
----> 5 response = rag.import_files(
6 corpus_name=rag_corpus.name,
7 paths=[INPUT_GCS_BUCKET],
/usr/local/lib/python3.12/dist-packages/vertexai/rag/rag_data.py in import_files(corpus_name, paths, source, transformation_config, timeout, max_embedding_requests_per_min, import_result_sink, partial_failures_sink, layout_parser, llm_parser)
624 raise RuntimeError("Failed in importing the RagFiles due to: ", e) from e
625
--> 626 return response.result(timeout=timeout)
627
628
/usr/local/lib/python3.12/dist-packages/google/api_core/future/polling.py in result(self, timeout, retry, polling)
254 """
255
--> 256 self._blocking_poll(timeout=timeout, retry=retry, polling=polling)
257
258 if self._exception is not None:
/usr/local/lib/python3.12/dist-packages/google/api_core/future/polling.py in _blocking_poll(self, timeout, retry, polling)
137 polling(self._done_or_raise)(retry=retry)
138 except exceptions.RetryError:
--> 139 raise concurrent.futures.TimeoutError(
140 f"Operation did not complete within the designated timeout of "
141 f"{polling.timeout} seconds."
TimeoutError: Operation did not complete within the designated timeout of 600 seconds.
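Since the `import_files` signature in the traceback shows a `timeout` parameter, I also tried raising the polling deadline well past the 600 s default, roughly as in the sketch below (the corpus name and bucket path are placeholders). It only makes the same `TimeoutError` arrive later:

```python
def import_with_long_timeout(corpus_name: str, gcs_path: str, timeout_s: int = 3600):
    """Retry the import with a raised polling deadline.

    The default deadline is 600 s (per the TimeoutError above); the
    `timeout` argument is taken from the import_files signature shown
    in the traceback. The import is deferred because rag.import_files
    needs an initialized vertexai session in a real GCP project.
    """
    from vertexai import rag

    return rag.import_files(
        corpus_name=corpus_name,
        paths=[gcs_path],
        timeout=timeout_s,  # seconds to block on the long-running operation
    )
```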
The exact same thing happens with every import/upload option in the RAG Engine web UI. I have granted the developer.gserviceaccount.com and gcp-sa-vertex-rag.iam.gserviceaccount.com service accounts Owner permissions on the project. The project ID is set correctly in the notebook, and I am definitely in the same project in the web UI. I have tried many different files, including the stock file that the intro_rag_engine.ipynb notebook provides. Nothing makes a difference. Worse, for many hours after creating a corpus I cannot delete any of my test corpora, because of this error:
ErrorResponse: {
  "errorParameters": {"map": {}},
  "url": "https://us-east4-aiplatform.clients6.google.com/ui/projects/123123123123/locations/us-east4/ragCorpora/123123123123?key=xxxxxx",
  "headers": {},
  "status": 400,
  "statusText": "OK",
  "method": "DELETE",
  "body": {
    "error": {
      "code": 400,
      "message": "There are other operations running on the RagCorpus \"projects/123123123123/locations/us-east4/ragCorpora/123123123123\". Operation IDs are: [123123123123].",
      "status": "FAILED_PRECONDITION"
    }
  },
  "bodyText": "{\"error\":{\"code\":400,\"message\":\"There are other operations running on the RagCorpus \\\"projects/123123123123/locations/us-east4/ragCorpora/123123123123\\\". Operation IDs are: [123123123123].\",\"status\":\"FAILED_PRECONDITION\"}}",
  "errorExperience": 1,
  "clientHandler": 0,
  "trackingId": "c4689004526678377",
  "message": "There are other operations running on the RagCorpus \"projects/12312123123/locations/us-east4/ragCorpora/123123123123\". Operation IDs are: [8986879730103877632].",
  "errorCode": 400
}
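Because the delete error names a blocking operation ID, I assume the stuck import is a standard long-running operation that could be inspected (and presumably cancelled) through Vertex AI's generic LRO REST endpoints, which would let the delete go through. This is only a sketch using the redacted IDs from the error above, and I am not certain whether the operation name should nest under the corpus resource or sit directly under the location as shown here:

```python
# Placeholders copied from the (redacted) error above.
REGION = "us-east4"
PROJECT = "123123123123"
OPERATION_ID = "8986879730103877632"

# Standard Vertex AI long-running-operation resource name
# (location-level form; the nesting is my assumption).
op_name = f"projects/{PROJECT}/locations/{REGION}/operations/{OPERATION_ID}"
base_url = f"https://{REGION}-aiplatform.googleapis.com/v1/{op_name}"

# GET  base_url             -> operations.get: is the import still running?
# POST base_url + ":cancel" -> operations.cancel: then retry the corpus delete
print(base_url)
```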
What am I supposed to do when the most basic file-upload example in RAG Engine does not work at all, no matter how I attempt the import or upload? It is just timeout after timeout, no matter how small the uploaded file is. I have confirmed the files are in supported formats such as .md and .jsonl.
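For reference, the local-file variant I am running, modelled on the notebook's "Upload a local file to the corpus" step, is essentially this sketch (the display name and path are placeholders, and the exact `upload_file` parameters are my reading of the notebook, not verified against the SDK docs):

```python
def upload_local_file(corpus_name: str, local_path: str):
    """Upload one local file to an existing corpus.

    Mirrors the notebook's "Upload a local file to the corpus" step.
    The import is deferred because rag.upload_file requires
    vertexai.init(project=..., location=...) to have been called
    in a real GCP environment.
    """
    from vertexai import rag

    return rag.upload_file(
        corpus_name=corpus_name,
        path=local_path,
        display_name="test.md",  # placeholder display name
    )
```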