This guide shows how to run ML.GENERATE_TEXT queries against Gemini models directly from BigQuery.
Time estimate: ~15 minutes
Prerequisites
- A GCP project with BigQuery enabled
- `roles/owner`, or `roles/bigquery.admin` + `roles/aiplatform.admin`, on the project
Step 1: Enable the Vertex AI API
- Go to APIs & Services in the Google Cloud Console
- Select your project
- Search for “Vertex AI API”
- Click on it and click Enable (if not already enabled)
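If you prefer the command line, the API can also be enabled with the gcloud CLI (a sketch, assuming gcloud is installed and authenticated; the project ID is a placeholder):

```shell
# Enable the Vertex AI API for your project
gcloud services enable aiplatform.googleapis.com --project=your-project-id
```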
Step 2: Create a Vertex AI Connection in BigQuery
- In the BigQuery console explorer, click “+ Add Data”
- Search for “Vertex AI” and select it
- Select “BigQuery Federation”
- Name it `vertex-ai-conn` (or any name you prefer)
- Set the location to match your data (e.g., `US`)
- Click “Create Connection”
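The same connection can be created from the command line with the bq tool (a sketch; project ID, location, and connection name are placeholders from this guide):

```shell
# Create a Cloud resource connection that BigQuery can use to call Vertex AI
bq mk --connection \
  --project_id=your-project-id \
  --location=US \
  --connection_type=CLOUD_RESOURCE \
  vertex-ai-conn
```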
Step 3: Grant the Connection’s Service Account Vertex AI Access
- In BigQuery console explorer, expand External connections under your project
- Click on the `vertex-ai-conn` connection you just created
- Copy the Service account id (it looks like `bqcx-123456789-abcd@gcp-sa-bigquery-condel.iam.gserviceaccount.com`)
- Go to IAM & Admin in the Google Cloud Console
- Click “Grant Access”
- Paste the service account id in the “New principals” field
- In “Select a role”, search for and select “Vertex AI User”
- Click Save
This step occasionally fails on the first attempt with an error saying the service account doesn’t exist. If that happens, wait a minute and try again; it usually works on the second attempt.
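The same grant can be made from the command line (a sketch; substitute your project ID and the service account id you copied above):

```shell
# Grant the connection's service account the Vertex AI User role
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:bqcx-123456789-abcd@gcp-sa-bigquery-condel.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```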
Step 4: Create the Remote LLM Model
- In BigQuery console, click on your project in the explorer
- Click the three dots next to your target dataset and select “Create ML Model”
- Name the model (e.g., `gemini_flash`)
- Choose “Connect to Vertex AI LLM service and Cloud AI services”
- In “Model Options” select “Google and Partner Models”
- In “Model Selection” search for and select “gemini-2.5-flash-lite”
- For the connection, select the `vertex-ai-conn` connection you created in Step 2
- Click “Create Model”
Alternatively, you can create the model with a SQL `CREATE MODEL` statement; replace `your_project`, `your_dataset`, and the region with your actual values.
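A sketch of the SQL equivalent, using the connection and model name from the steps above (placeholders need to be replaced):

```sql
CREATE OR REPLACE MODEL `your_project.your_dataset.gemini_flash`
  REMOTE WITH CONNECTION `your_project.US.vertex-ai-conn`
  OPTIONS (ENDPOINT = 'gemini-2.5-flash-lite');
```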
Step 5: Smoke Test
Run this query in the BigQuery console. If it returns a row whose `ml_generate_text_llm_result` contains `4`, everything is working.
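A minimal smoke-test query, assuming the model was named `gemini_flash` in Step 4:

```sql
SELECT ml_generate_text_llm_result
FROM ML.GENERATE_TEXT(
  MODEL `your_project.your_dataset.gemini_flash`,
  (SELECT 'What is 2 + 2? Answer with a single digit.' AS prompt),
  STRUCT(TRUE AS flatten_json_output)
);
```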
Using Other Models
The same setup works for other Vertex AI models. Repeat Step 4 with a different model selection:

| Use Case | Model | BigQuery Function |
|---|---|---|
| Text classification / generation | gemini-2.5-flash-lite | ML.GENERATE_TEXT |
| Text embeddings (similarity search) | text-embedding-005 | ML.GENERATE_EMBEDDING |
| Multilingual embeddings | text-multilingual-embedding-002 | ML.GENERATE_EMBEDDING |
Each model needs its own `CREATE MODEL` statement, but they can all share the same Vertex AI connection.
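For example, an embedding model can be created against the same connection and queried with `ML.GENERATE_EMBEDDING` (a sketch; the model name `text_embedder` is an assumption):

```sql
CREATE OR REPLACE MODEL `your_project.your_dataset.text_embedder`
  REMOTE WITH CONNECTION `your_project.US.vertex-ai-conn`
  OPTIONS (ENDPOINT = 'text-embedding-005');

SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `your_project.your_dataset.text_embedder`,
  (SELECT 'hello world' AS content),
  STRUCT(TRUE AS flatten_json_output)
);
```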
Using with Malloy
There are three ways to use BigQuery ML models from Malloy, depending on your needs.

Approach 1: Inline LLM Call with sql_string()
Calls the LLM at query time for each row. The fastest way to get started — no separate pipeline needed. Requires `##! experimental {sql_functions}`.
This makes an LLM call per row at query time. Great for small datasets and prototyping. For large tables, use Approach 3 to batch-classify and join the results back.
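A rough sketch of what this can look like, assuming a hypothetical `reviews` table with a `review_text` column and the `gemini_flash` model from Step 4 (the exact interpolation syntax may differ by Malloy version):

```malloy
##! experimental {sql_functions}

source: reviews is bigquery.table('your_project.your_dataset.reviews') extend {
  -- Hypothetical per-row LLM call via a scalar subquery
  dimension: sentiment is sql_string("""
    (SELECT ml_generate_text_llm_result
     FROM ML.GENERATE_TEXT(
       MODEL `your_project.your_dataset.gemini_flash`,
       (SELECT CONCAT('Sentiment in one word: ', ${review_text}) AS prompt),
       STRUCT(TRUE AS flatten_json_output)))
  """)
}
```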
Approach 2: Wrap ML Results as a Malloy Source with connection.sql()
Use connection.sql() to make a BigQuery ML query available as a joinable Malloy source. Useful when the ML output needs to be combined with other sources via joins.
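A sketch, assuming a Malloy connection named `bigquery` and a hypothetical `items` table with `id` and `text` columns:

```malloy
source: classified_items is bigquery.sql("""
  SELECT id, ml_generate_text_llm_result AS label
  FROM ML.GENERATE_TEXT(
    MODEL `your_project.your_dataset.gemini_flash`,
    (SELECT id, CONCAT('Classify this text: ', text) AS prompt
     FROM `your_project.your_dataset.items`),
    STRUCT(TRUE AS flatten_json_output)
  )
""")
```

The resulting source can then be joined to other Malloy sources like any table-backed source.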
Approach 3: Batch Classify via MalloySQL, Join Results Back
Best for classifying a large set of items once and reusing the results. Run a `.malloysql` file to materialize a classification table, then reference it as a Malloy source.
Step 1 — Materialize (in a `.malloysql` file):
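A sketch of the materialization step, assuming a `bigquery` connection and the hypothetical `items` table and `gemini_flash` model used above:

```malloysql
>>>sql connection:bigquery
-- Materialize one label per item into a reusable table
CREATE OR REPLACE TABLE `your_project.your_dataset.item_labels` AS
SELECT id, ml_generate_text_llm_result AS label
FROM ML.GENERATE_TEXT(
  MODEL `your_project.your_dataset.gemini_flash`,
  (SELECT id, CONCAT('Classify this text: ', text) AS prompt
   FROM `your_project.your_dataset.items`),
  STRUCT(TRUE AS flatten_json_output)
);
```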
Step 2 — Reference the materialized table as a Malloy source (in a `.malloy` file):
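A sketch, assuming the materialized table is named `item_labels` and both tables key on `id`:

```malloy
source: item_labels is bigquery.table('your_project.your_dataset.item_labels')

source: items is bigquery.table('your_project.your_dataset.items') extend {
  join_one: labels is item_labels on id = labels.id
  -- Rows with no materialized label fall back to a default
  dimension: label is labels.label ?? 'Unclassified'
}
```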
Items that have no matching row in the classification table fall back to `'Unclassified'` via Malloy's `??` coalesce operator.