This guide walks through the one-time GCP setup needed to use LLM-based classification via BigQuery ML. Once complete, you can run ML.GENERATE_TEXT queries against Gemini models directly from BigQuery.

Time estimate: ~15 minutes

Prerequisites

  • A GCP project with BigQuery enabled
  • roles/owner or roles/bigquery.admin + roles/aiplatform.admin on the project

Step 1: Enable the Vertex AI API

  1. Go to APIs & Services in the Google Cloud Console
  2. Select your project
  3. Search for “Vertex AI API”
  4. Click on it and click Enable (if not already enabled)
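If you prefer the command line, the same step can be sketched with the gcloud CLI (YOUR_PROJECT_ID below is a placeholder for your actual project ID):

```shell
# Enable the Vertex AI API for the project
gcloud services enable aiplatform.googleapis.com --project=YOUR_PROJECT_ID
```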

Step 2: Create a Vertex AI Connection in BigQuery

  1. In the BigQuery console explorer, click “+ Add Data”
  2. Search for “Vertex AI” and select it
  3. Select “BigQuery Federation”
  4. Name it vertex-ai-conn (or any name you prefer)
  5. Set the location to match your data (e.g., US)
  6. Click “Create Connection”
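Equivalently, the connection can be created with the bq CLI (a sketch, assuming YOUR_PROJECT_ID stands in for your project; connections used by remote models are of type CLOUD_RESOURCE):

```shell
# Create a Cloud Resource connection that BigQuery ML remote models can use
bq mk --connection \
  --location=US \
  --project_id=YOUR_PROJECT_ID \
  --connection_type=CLOUD_RESOURCE \
  vertex-ai-conn
```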

Step 3: Grant the Connection’s Service Account Vertex AI Access

  1. In BigQuery console explorer, expand External connections under your project
  2. Click on the vertex-ai-conn connection you just created
  3. Copy the service account ID (it looks like bqcx-123456789-abcd@gcp-sa-bigquery-condel.iam.gserviceaccount.com)
  4. Go to IAM & Admin in the Google Cloud Console
  5. Click “Grant Access”
  6. Paste the service account ID into the “New principals” field
  7. In “Select a role”, search for and select “Vertex AI User”
  8. Click Save
This step occasionally fails on the first attempt saying the service account doesn’t exist. If that happens, wait a minute and try again — it usually works on the second attempt.
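The lookup and grant can also be done from the command line (a sketch; YOUR_PROJECT_ID and SERVICE_ACCOUNT_EMAIL are placeholders to replace with your own values):

```shell
# Print connection details, including the service account ID
bq show --connection YOUR_PROJECT_ID.US.vertex-ai-conn

# Grant that service account the Vertex AI User role on the project
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
  --role="roles/aiplatform.user"
```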

Step 4: Create the Remote LLM Model

  1. In BigQuery console, click on your project in the explorer
  2. Click the three dots next to your target dataset and select “Create ML Model”
  3. Name the model (e.g., gemini_flash)
  4. Choose “Connect to Vertex AI LLM service and Cloud AI services”
  5. In “Model Options” select “Google and Partner Models”
  6. In “Model Selection” search for and select “gemini-2.5-flash-lite”
  7. For the connection, select the vertex-ai-conn connection you created in Step 2
  8. Click “Create Model”
Alternatively, run this SQL directly:
CREATE OR REPLACE MODEL `your_project.your_dataset.gemini_flash`
  REMOTE WITH CONNECTION `your_project.region.vertex-ai-conn`
  OPTIONS (ENDPOINT = 'gemini-2.5-flash-lite');
Replace your_project, your_dataset, and region with your actual values.

Step 5: Smoke Test

Run this query in the BigQuery console. If it returns a result, everything is working.
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL `your_project.your_dataset.gemini_flash`,
  (SELECT 'What is 2 + 2? Reply with just the number.' AS prompt),
  STRUCT(0.0 AS temperature, 10 AS max_output_tokens, TRUE AS flatten_json_output)
);
Expected result: a row with ml_generate_text_llm_result containing 4.

Using Other Models

The same setup works for other Vertex AI models. Repeat Step 4 with a different model selection:
| Use Case | Model | BigQuery Function |
| --- | --- | --- |
| Text classification / generation | gemini-2.5-flash-lite | ML.GENERATE_TEXT |
| Text embeddings (similarity search) | text-embedding-005 | ML.GENERATE_EMBEDDING |
| Multilingual embeddings | text-multilingual-embedding-002 | ML.GENERATE_EMBEDDING |
Each model needs its own CREATE MODEL statement, but they can all share the same Vertex AI connection.
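For instance, a remote embedding model sharing the same connection could be created like this (a sketch using bq query; adjust project, dataset, and region to your own values):

```shell
# Create a remote embedding model over the existing Vertex AI connection
bq query --use_legacy_sql=false '
CREATE OR REPLACE MODEL `your_project.your_dataset.embedding_model`
  REMOTE WITH CONNECTION `your_project.region.vertex-ai-conn`
  OPTIONS (ENDPOINT = "text-embedding-005");
'
```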

Using with Malloy

There are three ways to use BigQuery ML models from Malloy, depending on your needs.

Approach 1: Inline LLM Call with sql_string()

Calls the LLM at query time for each row. The fastest way to get started — no separate pipeline needed. Requires ##! experimental{sql_functions}.
##! experimental{sql_functions}

source: feedback is my_connection.table('my_dataset.feedback') extend {
  dimension:
    ai_sentiment is sql_string("""
      JSON_EXTRACT_SCALAR(
        (SELECT ml_generate_text_result
         FROM ML.GENERATE_TEXT(
           MODEL `my_project.my_dataset.gemini_flash`,
           (SELECT CONCAT(
             'Classify this feedback as: POSITIVE, NEGATIVE, or NEUTRAL.\n',
             'Feedback: ', ${comment}
           ) AS prompt),
           STRUCT(0.0 AS temperature, 10 AS max_output_tokens)
         )),
        '$.candidates[0].content.parts[0].text'
      )
    """)

  measure:
    feedback_count is count()

  view: by_sentiment is {
    group_by: ai_sentiment
    aggregate: feedback_count
  }
}
This makes an LLM call per row at query time. Great for small datasets and prototyping. For large tables, use Approach 3 to batch-classify and join the results back.

Approach 2: Wrap ML Results as a Malloy Source with connection.sql()

Use connection.sql() to make a BigQuery ML query available as a joinable Malloy source. Useful when the ML output needs to be combined with other sources via joins.
source: item_embeddings is my_connection.sql("""
  SELECT *
  FROM ML.GENERATE_EMBEDDING(
    MODEL `my_project.my_dataset.embedding_model`,
    (SELECT item_name AS content, item_id FROM `my_project.my_dataset.items`),
    STRUCT(TRUE AS flatten_json_output, 'SEMANTIC_SIMILARITY' AS task_type)
  )
""")

source: items is my_connection.table('my_dataset.items') extend {
  join_one: item_embeddings on item_id = item_embeddings.item_id

  dimension:
    embedding is item_embeddings.ml_generate_embedding_result
}

Approach 3: Batch Classify via MalloySQL, Join Results Back

Best for classifying a large set of items once and reusing the results. Run a .malloysql file to materialize a classification table, then reference it as a Malloy source.

Step 1 — Materialize (in a .malloysql file):
-- connection:my_connection

CREATE OR REPLACE TABLE `my_project.my_dataset.item_classification` AS
SELECT
  src.item_name,
  TRIM(llm.ml_generate_text_llm_result) AS category
FROM ML.GENERATE_TEXT(
  MODEL `my_project.my_dataset.gemini_flash`,
  (SELECT *, CONCAT(
    'Classify this item as: Electronics, Clothing, Food, or Other.\n',
    'Item: "', item_name, '"\nReturn one word only.'
  ) AS prompt FROM `my_project.my_dataset.distinct_items`),
  STRUCT(0.0 AS temperature, 10 AS max_output_tokens, TRUE AS flatten_json_output)
) AS llm
JOIN `my_project.my_dataset.distinct_items` AS src ON src.item_name = llm.item_name;
Step 2 — Join into your model (in a .malloy file):
source: item_classification is my_connection.table('my_dataset.item_classification') extend {
  primary_key: item_name
}

source: orders is my_connection.table('my_dataset.orders') extend {
  join_one: item_classification with item_name

  dimension:
    category is item_classification.category ?? 'Unclassified'

  measure:
    order_count is count()

  view: by_category is {
    group_by: category
    aggregate: order_count
    order_by: order_count desc
  }
}
Items not yet in the classification table fall back to 'Unclassified' via ??.