GitHub Repository

Query the full Hacker News archive from Postgres via duckdb_fdw, with zero copies. Stream row groups straight from the Hugging Face Parquet dataset on demand.

4 stars · Python

HN-fdw – All of Hacker News, queryable from Postgres, with zero copies

by tamnd · Apr 8, 2026 · 2 points · 0 comments

AI Analysis

●●● Banger · Wizardry · Big Brain · Dark Horse

Zero-copy Postgres queries against 47M rows using DuckDB FDW and HTTP range requests.

Strengths
  • HTTP range requests against the Parquet files mean each query downloads kilobytes rather than megabytes.
  • No ETL pipeline, no cron jobs, no data duplication — genuinely zero-copy architecture.
  • Column projection and predicate pushdown both work through the FDW layer.
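
To illustrate why range requests keep transfers small, here is a minimal sketch of how a Parquet reader can locate file metadata without downloading the whole file. It relies only on the documented Parquet layout (a 4-byte little-endian footer length followed by the magic bytes `PAR1` at the end of the file). The `CountingRemote` class, the `read_footer` helper, and the fake file contents are hypothetical stand-ins for illustration; this is not the project's code, and no real network call is made.

```python
import struct

MAGIC = b"PAR1"  # Parquet files begin and end with these 4 bytes


def range_header(start, end):
    """Build an HTTP Range header value for inclusive byte offsets."""
    return f"bytes={start}-{end}"


class CountingRemote:
    """Hypothetical stand-in for a remote file: serves byte ranges and counts bytes."""

    def __init__(self, blob):
        self.blob = blob
        self.bytes_served = 0
        self.requests = []

    def size(self):
        # A real client would learn this from a HEAD request (Content-Length).
        return len(self.blob)

    def fetch(self, start, end):
        # Record the Range header a real client would send, then slice locally.
        self.requests.append(range_header(start, end))
        chunk = self.blob[start:end + 1]
        self.bytes_served += len(chunk)
        return chunk


def read_footer(remote):
    """Fetch only the footer: the last 8 bytes first, then the metadata block."""
    size = remote.size()
    tail = remote.fetch(size - 8, size - 1)
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    footer_start = size - 8 - footer_len
    return remote.fetch(footer_start, size - 9)


# Fake file: magic, 1 MB of placeholder data, placeholder metadata, length, magic.
footer = b"fake-metadata"
blob = MAGIC + b"\x00" * 1_000_000 + footer + struct.pack("<I", len(footer)) + MAGIC

remote = CountingRemote(blob)
metadata = read_footer(remote)
print(remote.requests, remote.bytes_served)
```

In this sketch the reader touches 21 bytes of a roughly 1 MB file. A real reader would then use the decoded metadata to request only the column chunks and row groups a query needs, which is where column projection and predicate pushdown save further bandwidth.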
Weaknesses
  • First-run bootstrap takes ~2 minutes to open every Parquet file in the dataset.
  • Niche use case — only matters if you need HN data specifically in Postgres.
Category
Target Audience

Data engineers and analysts working with Postgres

Similar To

sql.js-httpvfs · MotherDuck · DuckDB

Similar Projects