Changelog
All notable changes to this project will be documented in this file.
[Unreleased]
New Features
- Storage cap retention (--max-storage) — optionally bound caphouse-managed ClickHouse storage to a target size. After each successful ingest, caphouse measures compressed bytes across its own pcap_* and stream_* tables and prunes whole oldest captures until usage drops under the configured cap, keeping the newest just-ingested capture.
- caphouse-sanitize — new companion tool that rewrites public IPv4, IPv6, and MAC addresses in a PCAP with deterministic HMAC-SHA256 pseudonyms. Private, loopback, link-local, and multicast addresses are left unchanged. Accepts a file, a folder (all *.pcap files, filenames preserved), or stdin/stdout. A random seed is generated by default and printed to stderr; pass --seed (64 hex chars) to reproduce a mapping. Same address always maps to the same pseudonym within a run.
- Schema migration system — InitSchema now tracks applied SQL migrations in a caphouse_schema_migrations table and applies new ones in order on startup. If the database is ahead of the binary's compiled-in migrations (i.e. a downgrade is attempted), an error is returned before any data is touched.
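The deterministic pseudonym mapping described for caphouse-sanitize can be illustrated with a minimal Go sketch. It is not the tool's actual code: the helper names pseudonymizeIPv4 and pseudonymizeString are hypothetical, only IPv4 is covered, and truncating the digest to four bytes is an assumption made for brevity.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
	"net/netip"
)

// pseudonymizeIPv4 is a hypothetical helper, not caphouse-sanitize's real code.
// Non-IPv4 and non-public addresses are returned unchanged; every other
// address maps to the first four bytes of HMAC-SHA256(seed, address).
func pseudonymizeIPv4(seed []byte, addr netip.Addr) netip.Addr {
	if !addr.Is4() || addr.IsPrivate() || addr.IsLoopback() ||
		addr.IsLinkLocalUnicast() || addr.IsMulticast() || addr.IsUnspecified() {
		return addr
	}
	mac := hmac.New(sha256.New, seed)
	raw := addr.As4()
	mac.Write(raw[:])
	sum := mac.Sum(nil)
	// Assumption: plain truncation. A real tool would also need to keep the
	// result out of private and reserved ranges.
	return netip.AddrFrom4([4]byte{sum[0], sum[1], sum[2], sum[3]})
}

// pseudonymizeString is a convenience wrapper; it panics on malformed input.
func pseudonymizeString(seed []byte, s string) string {
	return pseudonymizeIPv4(seed, netip.MustParseAddr(s)).String()
}

func main() {
	seed := []byte("example-seed-for-illustration-only")
	for _, s := range []string{"8.8.8.8", "192.168.1.10", "127.0.0.1"} {
		fmt.Printf("%s -> %s\n", s, pseudonymizeString(seed, s))
	}
}
```

Because the output depends only on the seed and the input address, reusing a seed reproduces the same mapping, which is the property the --seed flag relies on.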
Performance
- Paginated export — exports of large captures (millions of packets) now use keyset (cursor) pagination rather than SQL OFFSET, eliminating connection timeouts and O(N) scans on wide joins. Each page is a bounded O(log N) query; page size is 50 000 packets.
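To show why keyset pagination avoids the cost of OFFSET, here is a small in-memory Go sketch. A sorted slice stands in for an indexed sort key, and the names nextPage and exportAll are hypothetical; the real export runs against ClickHouse, and its column names and query shape are not shown in this changelog.

```go
package main

import (
	"fmt"
	"sort"
)

// nextPage returns up to pageSize IDs strictly greater than cursor.
// ids must be sorted ascending. In SQL terms this resembles
// "WHERE id > cursor ORDER BY id LIMIT pageSize" (column name assumed).
func nextPage(ids []int64, cursor int64, pageSize int) []int64 {
	// Seek directly to the first ID after the cursor with a binary search,
	// instead of counting past all previously exported rows as OFFSET would.
	start := sort.Search(len(ids), func(i int) bool { return ids[i] > cursor })
	end := start + pageSize
	if end > len(ids) {
		end = len(ids)
	}
	return ids[start:end]
}

// exportAll walks every page and reports how many pages and packets it saw.
func exportAll(ids []int64, pageSize int) (pages, packets int) {
	cursor := int64(-1) // assumes IDs are non-negative
	for {
		page := nextPage(ids, cursor, pageSize)
		if len(page) == 0 {
			return pages, packets
		}
		pages++
		packets += len(page)
		cursor = page[len(page)-1] // the last ID seen becomes the next cursor
	}
}

func main() {
	ids := []int64{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	pages, packets := exportAll(ids, 4)
	fmt.Printf("pages=%d packets=%d\n", pages, packets) // prints "pages=3 packets=10"
}
```

Each page costs one seek plus at most pageSize rows, regardless of how far into the capture the export has progressed.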
Changes
- --query/-q renamed to --filter/-f — the flag now accepts a ClickHouse SQL WHERE clause directly rather than a tcpdump-style expression. Component table references (ipv4.src, tcp.dst, etc.) are joined automatically; field aliases (ipv4.addr, tcp.port, …) expand to (src OR dst) checks. Time bounds moved to dedicated --from/--to flags.
- --capture all requires --from and --to — previously the time range was embedded in the -q expression as time <from> to <to>; it is now supplied as separate RFC 3339 flags.
Fixes
- --max-storage size parsing — distinguish B (bytes) from b (bits), so values like 500MB and 500Mb no longer resolve to the same threshold.
- DNS byte-exact export — preserve the DNS Z header bits during ingest and reconstruction so captures with non-zero reserved/AD/CD-style DNS flags round-trip byte-for-byte.
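The B versus b distinction in the --max-storage fix can be sketched in Go as follows. This is an illustration, not caphouse's parser: the function name parseSize is hypothetical, decimal (powers of 1000) multipliers are an assumption, and overflow handling is omitted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// multipliers assumes decimal SI prefixes; the real tool may use binary ones.
var multipliers = map[string]uint64{"": 1, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

// parseSize converts strings such as "500MB" or "500Mb" to a byte count.
// A trailing 'B' means bytes, a trailing 'b' means bits (divided by 8).
func parseSize(s string) (uint64, error) {
	s = strings.TrimSpace(s)
	digits := 0
	for digits < len(s) && s[digits] >= '0' && s[digits] <= '9' {
		digits++
	}
	if digits == 0 {
		return 0, fmt.Errorf("size %q has no numeric value", s)
	}
	n, err := strconv.ParseUint(s[:digits], 10, 64)
	if err != nil {
		return 0, err
	}
	unit := s[digits:]
	if unit == "" {
		return n, nil // a bare number is treated as bytes
	}
	bits := false
	switch unit[len(unit)-1] {
	case 'B': // bytes: nothing to convert
	case 'b':
		bits = true
	default:
		return 0, fmt.Errorf("size %q must end in B (bytes) or b (bits)", s)
	}
	// Only the prefix is case-insensitive; the final letter stays significant.
	mult, ok := multipliers[strings.ToUpper(unit[:len(unit)-1])]
	if !ok {
		return 0, fmt.Errorf("size %q has an unknown unit prefix", s)
	}
	total := n * mult
	if bits {
		total /= 8
	}
	return total, nil
}

func main() {
	for _, s := range []string{"500MB", "500Mb"} {
		v, err := parseSize(s)
		fmt.Println(s, v, err)
	}
}
```

Under these assumptions 500MB yields 500000000 bytes and 500Mb yields 62500000 bytes, so the two values no longer collapse to the same threshold.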
Documentation
- Improved caphouse-ui coverage — the README and quickstart now show the browser UI more clearly, including its packet search, stream inspection, SQL workbench, and admin re-encode workflow, and the caphouse-ui subproject README now matches the current API paths and filter model.
- Deployment guide clarified — the docs now distinguish the devcontainer stack, which includes ClickHouse for the fastest local startup, from the production Dockerfiles/compose setup, which builds caphouse-api and caphouse-ui but expects an external ClickHouse instance.
[v0.3.1] - 2026-03-18
New Features
- caphouse-watch-dir script — watches a directory for PCAP files (*.pcap, *.pcapng), ingests each into ClickHouse when the writing process closes the file, and removes it from disk on success. Useful as a drop folder: any tool that writes PCAPs into the directory will have them automatically drained and stored. Uses inotifywait on Linux and polling + lsof on macOS. Installed alongside caphouse-monitor via caphouse install-scripts.
[v0.3.0] - 2026-03-11
New Features
- PCAPng ingest support — PCAPng files are accepted as input and converted to classic PCAP on ingest. All non-packet blocks (metadata, interface descriptions, etc.) are discarded. No byte-exact round-trip is guaranteed for PCAPng sources; the exported result is always a valid classic PCAP stream.
- Multi-file ingest — input files are now positional arguments; multiple files and glob patterns are accepted in a single invocation (e.g. caphouse -d "..." ring*.pcap). The --file/-f flag has been removed.
- Cross-capture export (--capture all) — pass -c all with a mandatory time <from> to <to> filter to merge packets from every stored capture into a single time-sorted PCAP stream. Ties are broken by capture start time, then capture ID. A warning is emitted when captures have mixed link types.
- L7 protocol parsing — DNS, NTP, and HTTP are parsed and stored per packet; TCP stream reassembly enables HTTP reconstruction across multiple packets.
- --no-streams flag — disables TCP stream tracking and L7 protocol detection during ingest. Useful for high-throughput scenarios where stream reassembly is not required.
- Documentation site — full MkDocs-based documentation published at https://cochaviz.github.io/caphouse/, covering quickstart, filter syntax, storage internals, and a complete flag reference.
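The ordering rule for cross-capture export (packet time first, then capture start time, then capture ID) can be expressed as a three-level comparison. The Go sketch below is illustrative only: the packet struct, its field names, and the use of integer nanoseconds and string IDs are assumptions, not caphouse's internal types.

```go
package main

import (
	"fmt"
	"sort"
)

// packet is a hypothetical record; timestamps are nanoseconds since the epoch.
type packet struct {
	TS           int64  // packet timestamp
	CaptureStart int64  // start time of the capture the packet belongs to
	CaptureID    string // stands in for the capture's UUID
}

// less orders by packet time, then capture start time, then capture ID.
func less(a, b packet) bool {
	if a.TS != b.TS {
		return a.TS < b.TS
	}
	if a.CaptureStart != b.CaptureStart {
		return a.CaptureStart < b.CaptureStart
	}
	return a.CaptureID < b.CaptureID
}

// sortMerged sorts packets from several captures into one stream order.
func sortMerged(pkts []packet) {
	sort.SliceStable(pkts, func(i, j int) bool { return less(pkts[i], pkts[j]) })
}

func main() {
	pkts := []packet{
		{TS: 200, CaptureStart: 10, CaptureID: "b"},
		{TS: 100, CaptureStart: 20, CaptureID: "a"},
		{TS: 100, CaptureStart: 10, CaptureID: "c"},
		{TS: 100, CaptureStart: 10, CaptureID: "b"},
	}
	sortMerged(pkts)
	for _, p := range pkts {
		fmt.Println(p.TS, p.CaptureStart, p.CaptureID)
	}
}
```

A fully specified tie-break makes the merged stream reproducible: exporting the same time range twice yields packets in the same order even when timestamps collide across captures.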
Library API
- IngestPCAPStream promoted to public API — previously an internal CLI helper, it is now a public method on *Client. Transparently handles both classic PCAP and PCAPng input and is the single entry point for stream-based ingest.
- ExportAllCapturesFiltered — new *Client method for cross-capture filtered export; returns an io.ReadCloser and total packet count. Requires a query containing a time primitive.
- GenerateSQLForCaptures — like GenerateSQL but accepts []uuid.UUID; pass nil to generate SQL spanning all captures without an explicit capture filter.
Changes
- -c all is valid in write (-w) and query (-q) modes; forbidden in read (-r) mode.
- SQL subqueries for cross-capture queries omit the capture_id IN (...) clause entirely when operating over all captures, avoiding a pre-fetch round-trip.
- timeNode subqueries add a PREWHERE clause on the captures join for ClickHouse granule-level pruning, improving time-range query performance.
- captures_schema.sql: created_at precision raised from DateTime64(3) to DateTime64(9) (nanosecond); time_res column changed from Enum8('us' = 1) to LowCardinality(String) to accommodate "ns" captures.
- CreatedAt on a capture is now derived from the first packet's timestamp rather than the wall-clock time at ingest start.
- A Warn-level log is emitted when exporting a capture whose original PCAP global header was not preserved (i.e. any pcapng-sourced capture), indicating that the exported header is synthetic.
- Component interface simplified; RawTailComponent removed.
Testing
- Test suite reorganised into three explicit tiers:
  - Unit (*_test.go, no build tag) — pure in-memory, no files, no external dependencies.
  - Integration (*_integration_test.go, //go:build integration) — uses testdata/ fixtures with the mock client; no container required.
  - E2E (*_e2e_test.go, //go:build e2e) — requires a running ClickHouse container via testcontainers.
[v0.2.0] - 2026-03-07
New Features
- Query filtering (-q/--query) — filter captured data using a simple DSL similar to BPF (e.g. host 1.1.1.1 and port 53 and time <begin> to <end>) when retrieving with -w
- Standalone -q usage — running caphouse -q <query> --capture prints the SQL query to stdout, enabling direct piping to clickhouse-client for inspection
- Scripts now bundled in the binary; go install works directly (Makefile no longer required for installation)
- ClickHouse client added to devcontainer
- Banner added to the manual
Compression Improvements
- Significant compression improvements across Ethernet, IPv4/6, and pcap_packets tables
Documentation
- README and manual updated to reflect all new features and CLI changes
[v0.1.1] - 2026-03-04
New Features
- TCP/UDP (L4) support — packet capture now parses and stores TCP and UDP protocol layers in dedicated tables
- --version flag — report the installed version
- Simplified CLI interface
- Devcontainer configuration added; user installs now go to /home/vscode/.local/bin
Testing
- Test data moved to an external location and downloaded on demand
- Integration tests also download PCAPs automatically
- Compression tests now export results in CSV
- Integration test timeout increased from 5 to 7 minutes to reduce flakiness
- Fixed tests failing due to hardcoded test file names
Performance
- Reduced query size by bundling successive packet IDs
- Slight improvements to compression
[v0.1.0] - 2026-03-02
- Initial release