Measured against git
mkit names every object by a BLAKE3 hash and splits large files into content-defined chunks, so changing one megabyte of a video means storing one megabyte. git hashes with SHA-1 and stores each version of a file whole until a repack. That trade cuts both ways, and the numbers below show both edges: real hyperfine runs of the two CLIs on one machine, git wins included.
Time, end to end
Wall-clock time for whole CLI invocations, mean of repeated runs. Lower is better.
init an empty repository
about even. mkit init vs git init in a fresh directory.
add + commit 100 small files
mkit 1.3× faster. 100 files of 10 KiB random bytes each, staged and committed in one shot.
mkit wins by ~1.3× while making every commit crash-durable (git does not fsync loose objects by default). Durability is batched: all objects written by a command are staged invisibly, made durable by a fixed two full flushes per command however many objects it writes, then renamed into place. This is the design of git’s core.fsyncMethod=batch, which mkit turns on by default.
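The stage, flush, rename sequence can be sketched in a few lines. This is a minimal illustration under stated assumptions, not mkit's implementation: `write_batch` and the `.tmp-` prefix are hypothetical names, `os.sync()` stands in for whatever full-flush primitive the real store uses, and the sketch targets POSIX systems only.

```python
import os
import tempfile


def write_batch(objects, store_dir):
    """Stage every object under a temporary name, flush once, then publish by rename.

    `objects` maps final file names to bytes. Hypothetical helper for illustration.
    """
    os.makedirs(store_dir, exist_ok=True)
    staged = []
    for name, data in objects.items():
        # Staged files carry a temporary name, so readers never see a partial object.
        fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=store_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        staged.append((tmp_path, os.path.join(store_dir, name)))

    # Flush 1 (assumption): one whole-batch flush instead of one fsync per object.
    os.sync()

    # Publish: rename is atomic within a single filesystem.
    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)

    # Flush 2 (assumption): persist the renames by syncing the directory itself.
    dir_fd = os.open(store_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the shape is that the number of flushes does not grow with the number of objects, which is why the cost stays flat whether a command writes 100 files or 1,600 chunks.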
add + commit one 100 MiB file
mkit 3.1× faster. A single 100 MiB file of incompressible bytes (a stand-in for video or other compressed media).
mkit wins by ~3.1×. The file splits into ~1,600 content-defined chunks that are hashed with BLAKE3, written zero-copy, and barrier-synced from a thread pool; git’s SHA-1 + zlib pass is CPU-bound. mkit’s flush cost is constant per commit, not per chunk.
add + commit one 1 GiB file
mkit 4.2× faster. Same shape at 1 GiB, 3 runs each.
mkit wins by ~4.2×. First ingest scales linearly for both tools; mkit’s wall clock is I/O + BLAKE3.
commit a 1 MiB change to the 100 MiB file
mkit 7.9× faster. Append 1 MiB to the already-committed 100 MiB file, then add + commit the new version.
mkit wins by ~7.9×. This is where content-defined chunking pays: mkit re-hashes the file but only stores the chunks that changed, so the second version costs about a megabyte. git re-compresses and stores the whole 101 MiB blob again.
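To see why an append touches only the tail of the chunk list, here is a toy content-defined chunker. It is a sketch under assumptions, not mkit's chunker: the gear-style rolling hash, the size limits, and the names `chunk` and `chunk_ids` are all illustrative, and `hashlib.blake2b` stands in for BLAKE3, which is not in the Python standard library.

```python
import hashlib
import random

MIN_CHUNK = 1024        # toy sizes, far smaller than a real store would use
MAX_CHUNK = 16 * 1024
CUT_SHIFT = 52          # cut when the top 12 bits of the 64-bit hash are zero

# One fixed pseudo-random 64-bit value per byte value, so cuts are reproducible.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def chunk(data):
    """Yield chunks whose boundaries depend on content, not on absolute offsets."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        length = i - start + 1
        if (length >= MIN_CHUNK and (h >> CUT_SHIFT) == 0) or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]  # final chunk ends at end of input, not at a cut point


def chunk_ids(data):
    """Name each chunk by a 32-byte digest (BLAKE2b here as a stand-in for BLAKE3)."""
    return [hashlib.blake2b(c, digest_size=32).hexdigest() for c in chunk(data)]
```

Chunking a buffer and then the same buffer with bytes appended yields identical IDs for every chunk except the last one of the original, which gets re-cut. A store keyed by chunk ID therefore only has to write that boundary chunk plus the chunks covering the new bytes.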
re-add an unchanged 100 MiB file
about even. Run touch on the committed file (mtime changes, bytes don’t) and run add again: a pure re-hash.
Close to a tie: the changed mtime invalidates both tools’ stat caches, so both re-read and re-hash 100 MiB in under 200 ms and write nothing new.
status with an unchanged 100 MiB file
about even. mkit status / git status in a clean repo holding the committed 100 MiB file, stat cache warm.
A tie. The index carries a stat cache (mtime, size, inode, ctime — with git’s racy-clean rule), so an unchanged file is proven clean by one stat call: O(stat), no read, no hash — the same trick git plays.
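The stat-cache check, including the racy-clean rule, fits in one small function. This is a sketch of the general technique as git documents it, not code from either tool: `StatEntry`, `entry_from_stat`, and `proven_clean` are hypothetical names, and the nanosecond fields assume a filesystem that reports them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatEntry:
    mtime_ns: int
    ctime_ns: int
    size: int
    inode: int


def entry_from_stat(st):
    """Build a cache entry from an os.stat() result."""
    return StatEntry(st.st_mtime_ns, st.st_ctime_ns, st.st_size, st.st_ino)


def proven_clean(cached, current, index_written_ns):
    """Return True only if stat data alone proves the file is unchanged."""
    if cached != current:
        return False  # some field differs: the file must be re-read and re-hashed
    # Racy-clean rule: if the file was modified at or after the moment the index
    # was written, it could have changed again within the same timestamp tick
    # without the stat data showing it, so equality proves nothing.
    if cached.mtime_ns >= index_written_ns:
        return False
    return True
```

When `proven_clean` returns True the file is never opened, which is why status takes the same time whether the file is 100 KiB or 100 MiB.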
Bytes on disk
Repository directory size (du -k .mkit vs .git) after the same operations. Lower is better. Rows marked git* are after git gc.
100 small files, one commit
Repository size after committing 100 × 10 KiB of random bytes (1,000 KiB of content).
Effectively a tie — both store roughly the content plus per-object overhead.
one 100 MiB file, one commit
Repository size after the first commit of the 100 MiB file.
Roughly even: incompressible input means zlib buys git nothing, so both stores hold roughly the content plus bookkeeping (the gap is mostly filesystem allocation, not format).
growth after a 1 MiB change
Additional repository bytes after appending 1 MiB to the 100 MiB file and committing the second version.
The interesting one. mkit stores ~1.1 MiB immediately: the appended megabyte (incompressible, so a floor no store can beat), one re-cut boundary chunk, and a fresh chunk manifest. git’s loose store duplicates the whole ~101 MiB blob until you run git gc, which then repacks both versions, the old one as a tiny delta against the new, bringing net growth to roughly zero against the loose baseline. mkit’s store is incremental by construction, with no maintenance pass required; git’s density arrives only after one.
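A back-of-the-envelope check shows how the three components can add up to roughly 1.1 MiB. Only the file size, the ~1,600 chunk count, and the 1 MiB append come from the text; the boundary chunk being of average size and the manifest holding one 32-byte hash per chunk are assumptions made for this estimate, not facts about mkit's format.

```python
KiB = 1024
MiB = 1024 * KiB

file_size = 100 * MiB
chunk_count = 1600                       # from the 100 MiB timing row
avg_chunk = file_size / chunk_count      # 65,536 bytes, i.e. 64 KiB

appended = 1 * MiB                       # incompressible, so stored in full
boundary_chunk = avg_chunk               # assumption: re-cut chunk is about average size
new_chunks = appended / avg_chunk        # about 16 additional chunks
# Assumption: the manifest lists one 32-byte hash per chunk and nothing else.
manifest = (chunk_count + new_chunks) * 32

growth = appended + boundary_chunk + manifest
growth_mib = growth / MiB
print(f"estimated growth: {growth_mib:.2f} MiB")
```

Under these assumptions the estimate lands close to the measured figure, with the appended megabyte dominating and the boundary chunk and manifest each contributing on the order of tens of KiB.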
Methodology & caveats
- Signed vs unsigned: every mkit commit is Ed25519-signed (the key comes from mkit keygen in the prepare step); the git side runs unsigned, as git defaults to. Signing costs mkit well under a millisecond per commit, but the comparison is asymmetric and you should know that.
- Durability: mkit batches each command’s object writes behind two fixed full flushes plus per-file write barriers (SPEC-OBJECTS §10.1) — a commit is durable when the command returns, and no ref ever references non-durable objects. git does not fsync loose objects by default, so every row above has mkit doing strictly more durability work. Per-object flushing is available via the durability.objects = per-object config key.
- One machine, one filesystem, one day. Ratios on spinning disks, network filesystems, or Linux will differ — flush cost in particular is very hardware-dependent.
- Both tools were run through their CLI end to end (process spawn included), with stock configuration: no git core.fsmonitor, no mkit tuning.
Exact commands
# the whole suite is reproducible from the repo root:
cargo build --release -p mkit-cli   # in rust/
scripts/bench-vs-git.sh             # hyperfine JSON + sizes into ./bench-results
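The ratios quoted above are quotients of mean wall-clock times. As a sketch of how one could recompute them from hyperfine's JSON export: the function below assumes both commands of a comparison sit in one export file, which may not match how the script lays out ./bench-results, and the name `speedup` is hypothetical. The `results`, `command`, and `mean` keys are part of hyperfine's JSON export format.

```python
import json


def speedup(results_json, baseline, candidate):
    """Return baseline mean / candidate mean from one hyperfine JSON export.

    `baseline` and `candidate` are the exact command strings hyperfine recorded.
    """
    results = json.loads(results_json)["results"]
    means = {r["command"]: r["mean"] for r in results}  # mean is in seconds
    return means[baseline] / means[candidate]
```

A value above 1 means the candidate was faster than the baseline; a row labelled "about even" corresponds to a value close to 1.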