Skip to main content
Key syntax and examples for the Malloy data modeling language.

Basic Syntax

Sources

Define data sources and their relationships:
source: orders is table('dw.orders') {
  primary_key: order_id
  
  measure: 
    order_count is count()
    total_revenue is sum(amount)
    avg_order_value is total_revenue / order_count
    
  dimension:
    order_date is date_trunc('day', created_at)
    customer_segment is case
      when amount > 1000 then 'High Value'
      when amount > 100 then 'Medium Value'  
      else 'Low Value'
    end
}

Joins

Define relationships between sources:
source: orders is table('dw.orders') {
  join_one: customers is table('dw.customers') on customer_id
  join_many: order_items is table('dw.order_items') on order_id
  
  measure: orders_with_items is count() { where: order_items.item_count > 0 }
}

Query Syntax

Basic Queries

query: orders -> {
  group_by: order_date
  aggregate: order_count, total_revenue
}

Filtering

query: orders -> {
  where: order_date >= '2024-01-01'
  group_by: customer_segment
  aggregate: order_count
}

Nested Queries

query: orders -> {
  group_by: order_date
  aggregate: order_count, total_revenue
  nest: by_segment is {
    group_by: customer_segment  
    aggregate: order_count
  }
}

Data Types

Primitive Types

  • string - Text values
  • number - Numeric values (integer or decimal)
  • date - Date values
  • timestamp - Date and time values
  • boolean - True/false values

Aggregate Types

  • count() - Row count
  • sum(field) - Sum of numeric field
  • avg(field) - Average of numeric field
  • min(field) - Minimum value
  • max(field) - Maximum value

Functions

String Functions

dimension:
  name_upper is upper(customer_name)
  name_concat is concat(first_name, ' ', last_name)
  name_length is char_length(customer_name)

Date Functions

dimension:
  order_month is date_trunc('month', order_date)
  days_since_order is date_diff('day', order_date, now())
  order_year is extract('year', order_date)

Numeric Functions

measure:
  revenue_rounded is round(sum(amount), 2)
  revenue_abs is abs(sum(profit_loss))
  revenue_ceiling is ceil(sum(amount))

Advanced Features

Window Functions

dimension:
  running_total is sum(amount) { 
    order_by: order_date 
    rows: unbounded preceding
  }

Case Expressions

dimension:
  order_size is case
    when amount > 1000 then 'Large'
    when amount > 100 then 'Medium'
    else 'Small'
  end

Templating

source: metric_template is {
  parameter: metric_field is 'amount'
  
  measure: total is sum(${metric_field})
  measure: average is avg(${metric_field})
}

Best Practices

  • Use meaningful names for sources, measures, and dimensions
  • Group related measures and dimensions together
  • Document complex calculations with comments
  • Use consistent naming conventions
  • Leverage joins for reusable relationships
  • Test queries incrementally during development
I