Posts tagged "duckDB"

Custom DuckDB Wasm builds for Cloudflare Workers

What if you could run full SQL queries, including JOINs, aggregations, and even remote Parquet file reads, directly inside a Cloudflare Worker? No database server, no connection pool, no cold-start latency from external services. Just DuckDB, compile...

January 27, 2026

8 min read

duckDB cloudflare-worker wasm

Using Iceberg Catalogs in the Browser with DuckDB-Wasm

With recent updates to DuckDB itself, DuckDB-Wasm, and the Iceberg extension, it is now possible to query Iceberg catalogs directly from the browser, with no backend involved. Example clients that work: SQL Workbench, SQL Workbench Embedded, DuckDB T...

December 16, 2025

5 min read

duckDB iceberg wasm

TypeScript scripts as DuckDB Table Functions

What if you could query any REST API, GraphQL endpoint, or web page directly from DuckDB using SQL? No ETL pipelines, no intermediate files, no complex setup - just write a TypeScript script and use it as a table function. In this post, I'll show you...

December 10, 2025

7 min read

duckDB TypeScript arrow

Using Amazon SageMaker Lakehouse with DuckDB

Preconditions: To use the Amazon SageMaker Lakehouse with DuckDB, you first have to create an S3 Table bucket, a namespace, and an actual S3 Table. All those steps are described in my other blog post “Query S3 Tables with DuckDB”, so please make sure yo...

June 8, 2025

5 min read

sagemaker duckDB glue

Welcome to the age of $10/month Lakehouses

Recap: Data Warehouses, Data Lakes, Lakehouses? As a short recap, what do these mean, and how are they differentiated? Modern Data Warehouses, like Amazon Redshift, Google BigQuery, and Snowflake, offer fast, SQL-optimized performance for structured ...

May 30, 2025

18 min read

lakehouse Data-lake duckDB

Using DuckDB databases as lightweight Data Lake access layer

Data Lakes come in many different flavors. AWS, Azure, Google Cloud, Snowflake, Databricks, etc. all have their specialties, strengths, and weaknesses. What most, if not all, of them have in common is that they use Object Storage...

May 17, 2025

15 min read

duckDB Data-lake analytics

Handling GTFS data with DuckDB

The General Transit Feed Specification (GTFS) is a standardized, open data format for public transportation schedules and geographic information. In practice, a GTFS feed is simply a ZIP archive of text (CSV) tables - such as stops.txt, routes.txt, a...

May 16, 2025

8 min read

gtfs duckDB data-engineering

Querying IP addresses and CIDR ranges with DuckDB

I had a use case that eventually required performing IP address lookups in a given list of CIDR ranges, as I maintain an open source project that gathers IP address range data from public cloud providers, and also wrote an article in my blog about an...

September 20, 2024

2 min read

duckDB ip address CIDR

Chat with a Duck

A while ago I published sql-workbench.com and the accompanying blog post called "Using DuckDB-WASM for in-browser Data Engineering". The SQL Workbench enables its users to analyze local or remote data directly in the browser. This lowers the bar rega...

April 16, 2024

5 min read

duckDB llm AI

Using DuckDB-WASM for in-browser Data Engineering

Introduction: DuckDB, the in-process DBMS specialized in OLAP workloads, has grown very rapidly during the last year, both in functionality and in popularity among its users and among developers who contribute many projects to the Open Sou...

January 27, 2024

18 min read

data-engineering wasm SQL

Gathering and analyzing public cloud provider IP address data with DuckDB & Observable

As organizations increasingly adopt the public cloud, managing the networking and security aspects of cloud computing becomes more complex. One of the challenges that cloud administrators face is, especially in a hybrid cloud environment, keeping tra...

April 26, 2023

8 min read

duckDB dataengineering free

Casual data engineering, or: A poor man's Data Lake in the cloud - Part I

In the age of big data, organizations of all sizes are collecting vast amounts of information about their operations, customers, and markets. To make sense of this data, many are turning to data lakes - centralized repositories that store and manage ...

April 24, 2023

20 min read

AWS serverless datalake

Using DuckDB to repartition parquet data in S3

Since release v0.7.1, DuckDB has the ability to repartition data stored in S3 as parquet files by a simple SQL query, which enables some interesting use cases. Why not use existing AWS services? If your data lake lives in AWS, a natural choice for ET...

February 26, 2023

5 min read

duckDB Amazon S3 Data-lake

Using DuckDB in AWS Lambda

Prelude: DuckDB is an open-source in-process SQL OLAP database management system that has recently gained significant public interest due to its unique architecture and impressive performance benchmarks. Unlike traditional databases that are designed ...

February 12, 2023

6 min read

AWS duckDB serverless
