Airflow DAG for Vertex AI

Please advise whether the code below is a correct way to create Vertex AI resources through Cloud Composer in the form of an Airflow DAG.


from datetime import datetime, timedelta

from airflow import DAG
from airflow.decorators import task
from airflow.providers.google.cloud.operators.vertex_ai.dataset import CreateDatasetOperator
from google.cloud import aiplatform

# PROJECT_ID, REGION, and the *_DATASET constants used below are not defined in this snippet.

YESTERDAY = datetime.now() - timedelta(days=1)

default_dag_args = {
    'start_date': YESTERDAY,
}

with DAG(
    'composer_sample_simple_greeting',
    schedule_interval=timedelta(weeks=2),
    default_args=default_dag_args) as dag:

    # Helper that creates a Feature Store entity type; nothing in this DAG calls it.
    def create_entity_type_sample(
        project: str,
        location: str,
        entity_type_id: str,
        featurestore_name: str,  # name of the featurestore that owns the entity type
        service_account_id: str,
        task_id: str,
        project_id: str,
    ):
        aiplatform.init(project=project, location=location)

        my_entity_type = aiplatform.EntityType.create(
            entity_type_id=entity_type_id, featurestore_name=featurestore_name
        )

        my_entity_type.wait()

        return my_entity_type

    # Create one dataset per data type.
    create_image_dataset_job = CreateDatasetOperator(
        task_id="image_dataset",
        dataset=IMAGE_DATASET,
        region=REGION,
        project_id=PROJECT_ID,
    )
    create_tabular_dataset_job = CreateDatasetOperator(
        task_id="tabular_dataset",
        dataset=TABULAR_DATASET,
        region=REGION,
        project_id=PROJECT_ID,
    )
    create_text_dataset_job = CreateDatasetOperator(
        task_id="text_dataset",
        dataset=TEXT_DATASET,
        region=REGION,
        project_id=PROJECT_ID,
    )
    create_video_dataset_job = CreateDatasetOperator(
        task_id="video_dataset",
        dataset=VIDEO_DATASET,
        region=REGION,
        project_id=PROJECT_ID,
    )
    create_time_series_dataset_job = CreateDatasetOperator(
        task_id="time_series_dataset",
        dataset=TIME_SERIES_DATASET,
        region=REGION,
        project_id=PROJECT_ID,
    )

    # Run the tasks one after another.
    (
        create_image_dataset_job
        >> create_tabular_dataset_job
        >> create_text_dataset_job
        >> create_video_dataset_job
        >> create_time_series_dataset_job
    )


Hey @anu9s,

A couple of us at Google gave this a once-over:

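The overall structure of the DAG looks right. It uses CreateDatasetOperator to create one Vertex AI dataset for each data type (image, tabular, text, video, and time series), and the tasks run one after another in the order given on the last line. It does not create a Vertex AI project or train any models; that would need additional tasks. The create_entity_type_sample function is defined but never called, so it has no effect on the run.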
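
For reference, the dataset constants the DAG passes to the operators (IMAGE_DATASET and so on) are not defined in the snippet. Here is a minimal sketch of what they might look like. The display names are placeholders, the make_dataset helper is hypothetical, and the schema URIs are an assumption: they are meant to match the constants under google.cloud.aiplatform.schema.dataset.metadata, so please check them against your installed SDK. The API may also expect a metadata entry, which is left out here to keep the sketch dependency-free.

```python
# Hypothetical dataset definitions for the constants used in the DAG.
# The schema URIs are assumed to match the constants exposed by
# google.cloud.aiplatform.schema.dataset.metadata; verify before use.
SCHEMA_BASE = "gs://google-cloud-aiplatform/schema/dataset/metadata"


def make_dataset(kind: str) -> dict:
    """Return a dataset spec for the given data type, e.g. "image"."""
    return {
        "display_name": f"{kind}-dataset",  # placeholder name
        "metadata_schema_uri": f"{SCHEMA_BASE}/{kind}_1.0.0.yaml",
    }


IMAGE_DATASET = make_dataset("image")
TABULAR_DATASET = make_dataset("tabular")
TEXT_DATASET = make_dataset("text")
VIDEO_DATASET = make_dataset("video")
TIME_SERIES_DATASET = make_dataset("time_series")
```

Each of these dicts would then be passed as the dataset argument of the matching CreateDatasetOperator.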

The only thing to note is that every operator is given the same PROJECT_ID and REGION, so all five datasets are created in a single project and region. If you want a dataset to live in a different project, pass a different project_id to that operator.
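
One way to keep the project and region consistent across all the datasets is to generate the operator arguments from shared constants instead of repeating them five times. This is only a sketch: the PROJECT_ID and REGION values are placeholders, build_operator_kwargs is a hypothetical helper, and the dataset bodies are stubs.

```python
# Shared settings; both values are placeholders, not real resources.
PROJECT_ID = "my-project"
REGION = "us-central1"

DATASET_KINDS = ["image", "tabular", "text", "video", "time_series"]


def build_operator_kwargs(kinds, project_id, region):
    """Build one kwargs dict per dataset type, all sharing project and region."""
    return [
        {
            "task_id": f"{kind}_dataset",
            # The dataset body is a stub; a real one also needs a
            # metadata_schema_uri for the data type.
            "dataset": {"display_name": f"{kind}-dataset"},
            "region": region,
            "project_id": project_id,
        }
        for kind in kinds
    ]


all_kwargs = build_operator_kwargs(DATASET_KINDS, PROJECT_ID, REGION)

# Inside a DAG, each entry would become CreateDatasetOperator(**kwargs),
# and the resulting tasks could be chained with >> in list order.
for kwargs in all_kwargs:
    print(kwargs["task_id"], kwargs["project_id"], kwargs["region"])
```

To put one dataset in a different project, you could override project_id in just that entry before building the operator.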

Other than that, the code looks good!