Why Embed Data?
Embedding data files in your packages is valuable when you need to:
- Package sample data - Example datasets for testing or demos
- Build standalone models - Models that don’t require database connections
- Version control data - Keep data synchronized with model changes in your package
Currently, embedded data files work best for standalone models. Support for querying embedded data alongside database connections (e.g., joining embedded lookup tables with warehouse data) is coming soon.
Adding Data Files
File Structure
Create a data/ folder in your package directory and add your CSV or Parquet files to it.
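As a rough illustration, a package with embedded data might be laid out as follows. The package, model, and file names here are hypothetical; only the data/ folder name comes from this page.

```text
my-package/
├── orders.malloy          (hypothetical model file)
└── data/
    ├── orders.csv         (hypothetical CSV file)
    └── products.parquet   (hypothetical Parquet file)
```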
Supported File Formats
- CSV files (.csv) - Comma-separated values with a header row
- Parquet files (.parquet) - Columnar binary format, efficient for larger datasets
Referencing Embedded Data in Models
Use duckdb.table() to reference embedded files in your Malloy models.
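A minimal sketch of such a model is shown here. It assumes hypothetical files named data/orders.csv and data/products.parquet, and it assumes the paths resolve relative to a model file at the package root; adjust both to match your package.

```malloy
// Hypothetical file: data/orders.csv is an assumed name, not one defined by this page.
source: orders is duckdb.table('data/orders.csv') extend {
  measure: order_count is count()
}

// Parquet files are referenced the same way (file name is also hypothetical).
source: products is duckdb.table('data/products.parquet') extend {
  measure: product_count is count()
}

// Count the rows in the embedded CSV file.
run: orders -> {
  aggregate: order_count
}
```

Because both sources read only from embedded files, a model like this needs no database connection beyond DuckDB, which fits the standalone use case described above.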
Next Steps
Build a Model
Learn how to build semantic models in VS Code