ethereum mev forensics agent

read-only · python 3.9+ · rust stable · mit · ethereum mainnet

overview

hakiri watches every block on ethereum mainnet, decodes the swap logs, and reconstructs bundles. it classifies the events — sandwich, jit, backrun, liquidation, atomic arb — attributes builders and known searchers, and emits a verdict per event with a confidence in [0.0, 0.95].

detector, not oracle. read-only by design. no wallet, no signer, no executor, no trading. anyone forking to build a sniper does so in their own repo.

how it works

a pipeline of small, independent layers. you can swap any layer without touching the others. the contract between layers is the Event dataclass in src/hakiri/core/types.py.

   pending mempool                  finalized blocks
        │                                  │
        ▼                                  ▼
  ┌──────────────┐                  ┌──────────────┐
  │ ingest-rs    │                  │ trace_block  │
  │ ws subscribe │                  │ debug_trace  │
  └──────┬───────┘                  └──────┬───────┘
         │                                 │
         └──────────────┬──────────────────┘
                        ▼
                ┌───────────────┐
                │ decode  v2/v3 │   uniswap, sushi, balancer pools
                └──────┬────────┘
                       ▼
                ┌──────────────┐
                │ classify     │   SAND-01, BACK-01, JIT-01, ARB-01
                └──────┬───────┘
                       ▼
                ┌──────────────┐
                │ score        │   rule-based, capped at 0.95
                └──────┬───────┘
                       ▼
                ┌──────────────┐
                │ ai filter    │   optional. only reviews edge cases
                └──────┬───────┘
                       ▼
                ┌──────────────┐
                │ output sinks │   stdout · jsonl · webhook
                └──────────────┘

supported

target	status
ethereum mainnet	primary
uniswap v2 swap logs	primary
uniswap v3 swap logs	primary
sushiswap v2 (alias)	primary
flashbots relay	primary
ultrasound relay	primary
balancer v2 vault	low coverage
curve pools	low coverage
L2: base, arbitrum	planned
reth `mev` namespace	planned

detection heuristics

each rule has a numbered id used in event.notes so you can trace what fired. new rules ship with both a positive and a negative fixture or they do not get merged.

id	what it catches	version (status in roadmap)
`SAND-01`	classic sandwich: front + victim + back same pool, opposite directions	v0.1
`BACK-01`	one-step backrun arb against a user swap	v0.1
`JIT-01`	just-in-time liquidity add+remove around a victim swap	v0.2
`ARB-01`	atomic multi-hop arb across 3+ pools in a single tx	v0.2
`LIQ-01`	aave/compound liquidation w/ priority manipulation	v0.3
`ORACLE-01`	sandwich-style prep around an oracle update tx	v0.3

SAND-01 — classic sandwich

three swaps on the same pool: a front-run by a searcher, a victim swap, a back-run by the same searcher. front and back run in opposite directions.

front.pool == victim.pool == back.pool
front.sender == back.sender != victim.sender
front.token_in == victim.token_in and front.token_out == victim.token_out
back.token_in == victim.token_out and back.token_out == victim.token_in
front.tx_index < victim.tx_index < back.tx_index

scoring: base 0.70. a coinbase transfer adds +0.10. a victim record adds +0.05. the demo trace also shows a +0.05 bonus when the bundle holds two or more transactions (bundle.txs>=2).
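
the five conditions above translate almost line for line into a predicate. a minimal sketch, assuming a simplified SwapTx that carries only the fields the rule reads (the real record lives in src/hakiri/core/types.py and may differ). find_sandwiches is a hypothetical name, not hakiri's api:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SwapTx:
    # Assumed minimal shape, for illustration only.
    tx_index: int
    sender: str
    pool: str
    token_in: str
    token_out: str


def find_sandwiches(swaps: List[SwapTx]) -> List[Tuple[SwapTx, SwapTx, SwapTx]]:
    """Return (front, victim, back) triples matching the SAND-01 conditions."""
    ordered = sorted(swaps, key=lambda s: s.tx_index)
    hits = []
    for i, front in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            victim = ordered[j]
            same_direction = (victim.token_in, victim.token_out) == (
                front.token_in,
                front.token_out,
            )
            if not (
                front.tx_index < victim.tx_index
                and victim.pool == front.pool
                and victim.sender != front.sender
                and same_direction
            ):
                continue
            for back in ordered[j + 1:]:
                if (
                    back.tx_index > victim.tx_index
                    and back.pool == front.pool
                    and back.sender == front.sender
                    and back.token_in == victim.token_out
                    and back.token_out == victim.token_in
                ):
                    hits.append((front, victim, back))
                    break  # keep the first closing leg for this front/victim pair
    return hits


# Placeholder addresses and token symbols, not real on-chain data.
block = [
    SwapTx(0, "0xsearcher", "0xpool", "WETH", "USDC"),
    SwapTx(1, "0xvictim", "0xpool", "WETH", "USDC"),
    SwapTx(2, "0xsearcher", "0xpool", "USDC", "WETH"),
]
matches = find_sandwiches(block)
print(len(matches))
```

the nested scan is quadratic in the number of swaps, which is acceptable at per-block volumes. a production rule would likely group swaps by pool first.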

BACK-01 — one-step backrun

two consecutive swaps on the same pool by different senders, same direction. the second swap is treated as an arb candidate against the first.

a.pool == b.pool
a.sender != b.sender
a.token_in == b.token_in and a.token_out == b.token_out
a.tx_index + 1 == b.tx_index

scoring: base 0.50. coinbase-transfer raises by +0.10.
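
the same conditions as a predicate. a minimal sketch, again assuming a simplified SwapTx. find_backruns is a hypothetical helper name. swaps are indexed by tx_index so that "consecutive" means tx_index + 1 even when other pools trade in between:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SwapTx:
    # Assumed minimal shape, for illustration only.
    tx_index: int
    sender: str
    pool: str
    token_in: str
    token_out: str


def find_backruns(swaps: List[SwapTx]) -> List[Tuple[SwapTx, SwapTx]]:
    """Return (a, b) pairs matching the BACK-01 conditions."""
    by_index: Dict[int, List[SwapTx]] = defaultdict(list)
    for swap in swaps:
        by_index[swap.tx_index].append(swap)

    pairs = []
    for a in sorted(swaps, key=lambda s: s.tx_index):
        # Only the immediately following transaction qualifies.
        for b in by_index.get(a.tx_index + 1, []):
            if (
                a.pool == b.pool
                and a.sender != b.sender
                and a.token_in == b.token_in
                and a.token_out == b.token_out
            ):
                pairs.append((a, b))
    return pairs


# Placeholder addresses and token symbols, not real on-chain data.
swaps = [
    SwapTx(4, "0xuser", "0xpool", "WETH", "USDC"),
    SwapTx(5, "0xbot", "0xpool", "WETH", "USDC"),
    SwapTx(7, "0xbot", "0xpool", "WETH", "USDC"),  # not adjacent to anything
]
backruns = find_backruns(swaps)
print(len(backruns))
```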

JIT-01 — just-in-time liquidity

uniswap v3 only. a position is opened (mint) and closed (burn) in the same block, surrounding a victim swap. the searcher captures fees from the victim and rebalances out before the block ends.
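
the mint / swap / burn shape can be sketched as below. this is an illustration under assumptions, not hakiri's implementation: PoolEvent and find_jit are hypothetical names, the input is assumed to hold events from a single block, and mints are matched to burns by owner and pool only (a real rule would probably also compare tick ranges).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PoolEvent:
    # Assumed minimal shape, for illustration only.
    tx_index: int
    kind: str   # "mint", "swap", or "burn"
    actor: str  # position owner for mint/burn, sender for swap
    pool: str


def find_jit(events: List[PoolEvent]) -> List[Tuple[PoolEvent, List[PoolEvent], PoolEvent]]:
    """Return (mint, victim_swaps, burn) triples within one block."""
    ordered = sorted(events, key=lambda e: e.tx_index)
    hits = []
    for mint in (e for e in ordered if e.kind == "mint"):
        # First later burn by the same owner on the same pool.
        burn = next(
            (
                e for e in ordered
                if e.kind == "burn"
                and e.pool == mint.pool
                and e.actor == mint.actor
                and e.tx_index > mint.tx_index
            ),
            None,
        )
        if burn is None:
            continue
        # Swaps by someone else that land strictly between mint and burn.
        victims = [
            e for e in ordered
            if e.kind == "swap"
            and e.pool == mint.pool
            and e.actor != mint.actor
            and mint.tx_index < e.tx_index < burn.tx_index
        ]
        if victims:
            hits.append((mint, victims, burn))
    return hits


# Placeholder addresses, not real on-chain data.
events = [
    PoolEvent(3, "mint", "0xlp", "0xpool"),
    PoolEvent(4, "swap", "0xvictim", "0xpool"),
    PoolEvent(5, "burn", "0xlp", "0xpool"),
]
jit_hits = find_jit(events)
print(len(jit_hits))
```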

ARB-01 — atomic multi-hop arb

a single transaction touches three or more pools and ends with a profit denominated in the input token. typical on triangular routes such as weth → usdc → wbtc → weth.
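
a minimal sketch of that check, assuming the hops of one transaction have already been extracted in execution order. Hop and atomic_arb_profit are hypothetical names. requiring each hop's output token to feed the next hop is an added assumption, and the amounts are illustrative rather than real prices:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hop:
    # Assumed minimal shape, for illustration only.
    pool: str
    token_in: str
    amount_in: int   # base units of token_in
    token_out: str
    amount_out: int  # base units of token_out


def atomic_arb_profit(hops: List[Hop]) -> Optional[int]:
    """Return the profit in input-token base units, or None if not ARB-01 shaped."""
    if len({h.pool for h in hops}) < 3:
        return None  # fewer than three distinct pools
    chained = all(a.token_out == b.token_in for a, b in zip(hops, hops[1:]))
    closes_loop = hops[0].token_in == hops[-1].token_out
    if not (chained and closes_loop):
        return None
    profit = hops[-1].amount_out - hops[0].amount_in
    return profit if profit > 0 else None


# weth -> usdc -> wbtc -> weth with made-up amounts.
route = [
    Hop("0xpoolA", "WETH", 10 * 10**18, "USDC", 30_000 * 10**6),
    Hop("0xpoolB", "USDC", 30_000 * 10**6, "WBTC", 5 * 10**7),
    Hop("0xpoolC", "WBTC", 5 * 10**7, "WETH", 10_050_000_000_000_000_000),
]
profit = atomic_arb_profit(route)
print(profit)
```

amounts stay as integers in base units so the profit comparison never passes through floating point.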

LIQ-01 — liquidation with priority manipulation

aave or compound liquidation where the searcher front-runs the liquidationCall with a price-impact swap that pushes the victim past threshold.

ORACLE-01 — oracle-update sandwich

same shape as SAND-01, but the victim is an oracle-update transaction. the searcher positions before the update lands and exits after.

why each rule has a numbered id

users get a verdict; auditors need a reason. quoting a rule id in a forum post or a postmortem is more useful than re-explaining the heuristic each time.

quickstart

requires python 3.9+ and rust stable.

git clone https://github.com/hakiriagent/hakiri.git
cd hakiri

# install python package + dev deps
make install

# build the rust ingest crate
cd ingest-rs && cargo build --release && cd ..

# run an offline demo (zero network)
hakiri demo investigate

# run the live scanner against your rpc
cp .env.example .env
$EDITOR .env   # set HAKIRI_WS_URL or HAKIRI_HTTP_URL
hakiri scan

no rpc? hakiri demo scan shows the full pipeline against synthetic fixtures.

cli reference

hakiri version                    # version + active config
hakiri scan                       # live mempool + block scan
hakiri scan --once                # one block then exit (smoke test)
hakiri investigate <tx|block>     # walk the pipeline on a specific target

hakiri demo scan                  # canned scan against a synthetic block
hakiri demo investigate           # full pipeline trace, prints every step
hakiri demo replay <id>           # replay a recorded fixture

example output — `hakiri demo investigate`

─────────────────── step 1. decoded swaps ────────────────────
  block 21000000 idx 0  sender 0xa69BabE...  pool 0x88e6A0c2...  in 8000000000000000000  out 24000000000
  block 21000000 idx 1  sender 0xCAFE8888...  pool 0x88e6A0c2...  in 2000000000000000000  out 5950000000
  block 21000000 idx 2  sender 0xa69BabE...  pool 0x88e6A0c2...  in 24500000000  out 8300000000000000

─────────────────── step 2. classifier rules ─────────────────
  rules fired: ['SAND-01']
  block verdict: likely
  events found:  1

─────────────────── step 3. score per event ─────────────────
  sandwich block 21000000
    base[sandwich]=0.70
    coinbase_transfer>0:+0.10
    bundle.txs>=2:+0.05
    victims_present:+0.05
    -> verdict=confirmed conf=0.900
─────────────────────── done ────────────────────────────────

architecture

hakiri/
├── src/hakiri/                    python package
│   ├── core/                      types · classify · score
│   ├── decode/                    uniswap v2/v3 + router labels
│   ├── enrich/                    builder · searcher · coinbase transfer
│   ├── ingest/                    rpc + mempool + trace stubs (rust later)
│   ├── output/                    stdout · jsonl · webhook
│   ├── ai/                        optional rule reviewer
│   ├── demo/                      offline scripted demos
│   └── cli.py                     typer entrypoint
├── ingest-rs/                     low-level ingest crate (rust)
│   └── src/ {mempool,bundle,trace}.rs
├── tests/                         pytest suite
└── docs/                          architecture · heuristics · glossary

zones & maintainers

zone	language	maintainer
core, scoring, ci	python	`@hakiriagent`
ingest-rs, mempool	rust	`@0xnova`
classify, heuristics	python	`@mikrohash`
decode, output	python	`@luka`

why polyglot

mev forensics is bottlenecked by mempool latency. python is fast enough for classification and scoring (a block rarely yields more than a few hundred swap logs) but a poor fit for a high-frequency websocket loop. rust handles the hot path; python handles the analysis. they meet at the Event boundary.

pipeline layers

ingest

two sources feed the pipeline:

mempool (pre-inclusion). a websocket subscription to pending transactions. used to detect bundles before they land.
finalized blocks (post-inclusion). either eth_getLogs (cheap, partial) or debug_traceBlockByNumber / trace_block (rich, expensive).

the live ingest path runs in ingest-rs/. python has stubs at src/hakiri/ingest/{mempool,builder,trace}.py so the rest of the pipeline can be exercised without a node.
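
for orientation, these are the json-rpc request bodies behind the two sources. a sketch only: it builds the payloads and sends nothing. the rpc helper is a hypothetical name, and which trace method is available depends on the node client.

```python
import json
from typing import Any, Dict, List


def rpc(method: str, params: List[Any], req_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}


block_tag = hex(21_000_000)  # block numbers travel as hex quantities

# Pre-inclusion: sent once over the websocket; the node then pushes tx hashes.
subscribe_pending = rpc("eth_subscribe", ["newPendingTransactions"])

# Post-inclusion, cheap: logs only, restricted to a single block.
get_logs = rpc("eth_getLogs", [{"fromBlock": block_tag, "toBlock": block_tag}])

# Post-inclusion, rich: full call traces for every transaction in the block.
trace_debug = rpc("debug_traceBlockByNumber", [block_tag, {"tracer": "callTracer"}])
trace_parity_style = rpc("trace_block", [block_tag])

for body in (subscribe_pending, get_logs, trace_debug, trace_parity_style):
    print(json.dumps(body))
```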

decode

receipts and logs are converted into SwapTx records. supported topics:

uniswap v2 Swap(address,uint256,uint256,uint256,uint256,address)
uniswap v3 Swap(address,address,int256,int256,uint160,uint128,int24)

balancer v2 and curve are stubs at this stage.
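
a stdlib-only sketch of decoding both events from a raw log dict. decode_swap and the output keys are hypothetical, not hakiri's SwapTx. the two topic0 constants are the commonly published keccak-256 hashes of the signatures above, quoted here from memory, so verify them against your own hashing before relying on them. amounts are normalized to the pool's point of view: positive means the pool received the token.

```python
from typing import Dict, List

# Assumed topic0 values; verify by hashing the event signatures yourself.
V2_SWAP_TOPIC = "0xd78ad95fa46c994b6551d0da85fc275fe613ce37657fb8d5e3d130840159d822"
V3_SWAP_TOPIC = "0xc42079f94a6350d7e6235f29174924f928cc2ac818eb64fed8004e115fbcca67"


def _words(data_hex: str) -> List[bytes]:
    """Split ABI-encoded log data into 32-byte words."""
    raw = bytes.fromhex(data_hex[2:] if data_hex.startswith("0x") else data_hex)
    if len(raw) % 32:
        raise ValueError("log data is not a multiple of 32 bytes")
    return [raw[i:i + 32] for i in range(0, len(raw), 32)]


def _address(topic: str) -> str:
    """Indexed addresses are left-padded to 32 bytes; keep the last 20."""
    return "0x" + topic[-40:]


def decode_swap(log: Dict) -> Dict:
    topics = log["topics"]
    topic0 = topics[0].lower()
    words = _words(log["data"])
    if topic0 == V2_SWAP_TOPIC:
        # data: amount0In, amount1In, amount0Out, amount1Out (all uint256)
        a0_in, a1_in, a0_out, a1_out = (int.from_bytes(w, "big") for w in words[:4])
        amount0, amount1 = a0_in - a0_out, a1_in - a1_out
    elif topic0 == V3_SWAP_TOPIC:
        # data: amount0, amount1 (int256, two's complement), then price/liquidity/tick
        amount0 = int.from_bytes(words[0], "big", signed=True)
        amount1 = int.from_bytes(words[1], "big", signed=True)
    else:
        raise ValueError("not a supported swap topic")
    return {
        "pool": log["address"],
        "sender": _address(topics[1]),
        "recipient": _address(topics[2]),
        "amount0": amount0,
        "amount1": amount1,
    }


def word(n: int) -> str:
    """Encode an int as one 32-byte ABI word (hex, no 0x prefix)."""
    return n.to_bytes(32, "big", signed=n < 0).hex()


# Synthetic v2 log: 2 units (18 decimals) of token1 in, 5950 units (6 decimals) of token0 out.
v2_log = {
    "address": "0xpool",
    "topics": [V2_SWAP_TOPIC, "0x" + "00" * 12 + "aa" * 20, "0x" + "00" * 12 + "bb" * 20],
    "data": "0x" + word(0) + word(2 * 10**18) + word(5950 * 10**6) + word(0),
}
print(decode_swap(v2_log))
```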

classify

pure functions over SwapTx lists ordered by tx_index. each rule has a numbered id; ids appear in docs/heuristics.md and on the event itself via event.notes.

score

rule-based confidence in [0.0, 0.95]. capped on purpose. the breakdown is returned to the caller so every confidence value is fully traceable.
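
a sketch of what "capped and traceable" can look like. the base values and bonuses are the ones quoted in the SAND-01 and BACK-01 scoring lines and in the demo trace, and the verdict thresholds come from the glossary. applying one shared bonus table to every event kind is an assumption, and score, clamp, and verdict are hypothetical names.

```python
from typing import List, Tuple

CAP = 0.95
BASE = {"sandwich": 0.70, "backrun": 0.50}
Breakdown = List[Tuple[str, float]]


def clamp(raw: float) -> float:
    """Keep a raw score inside [0.0, CAP]; rounding hides float noise."""
    return round(min(CAP, max(0.0, raw)), 3)


def score(
    kind: str,
    coinbase_transfer_wei: int = 0,
    bundle_txs: int = 0,
    victims: int = 0,
) -> Tuple[float, Breakdown]:
    """Return (confidence, breakdown) so every point of confidence has a label."""
    breakdown: Breakdown = [(f"base[{kind}]", BASE[kind])]
    if coinbase_transfer_wei > 0:
        breakdown.append(("coinbase_transfer>0", 0.10))
    if bundle_txs >= 2:
        breakdown.append(("bundle.txs>=2", 0.05))
    if victims > 0:
        breakdown.append(("victims_present", 0.05))
    return clamp(sum(delta for _, delta in breakdown)), breakdown


def verdict(conf: float) -> str:
    """Map a confidence to the labels defined in the glossary."""
    if conf >= 0.85:
        return "confirmed"
    if conf >= 0.65:
        return "likely"
    if conf >= 0.40:
        return "suspected"
    return "noise"


conf, steps = score("sandwich", coinbase_transfer_wei=1, bundle_txs=3, victims=1)
for label, delta in steps:
    print(f"{label}: {delta:+.2f}")
print(verdict(conf), conf)
```

rounding inside clamp matters at the thresholds: 0.70 + 0.10 + 0.05 summed in floating point can fall a hair under 0.85, which would otherwise flip a verdict.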

ai filter

optional. only used when ANTHROPIC_API_KEY is set. the filter reviews edge-case events and returns a minor adjustment, never an upgrade beyond the cap. when disabled, the rule-based score is final.
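
the "minor adjustment, never beyond the cap" contract can be enforced with two clamps. a sketch under assumptions: MAX_AI_DELTA is a made-up bound for what counts as minor, and apply_ai_review and ai_filter_enabled are hypothetical names. no model call is shown.

```python
import os
from typing import Optional

CAP = 0.95
MAX_AI_DELTA = 0.05  # hypothetical bound on a "minor" adjustment


def ai_filter_enabled() -> bool:
    """The filter is active only when the api key is present."""
    return bool(os.environ.get("ANTHROPIC_API_KEY"))


def apply_ai_review(rule_score: float, ai_delta: Optional[float]) -> float:
    """Fold a reviewer's suggested delta into the rule-based score."""
    if ai_delta is None:
        return rule_score  # filter disabled: the rule-based score is final
    delta = max(-MAX_AI_DELTA, min(MAX_AI_DELTA, ai_delta))
    return round(min(CAP, max(0.0, rule_score + delta)), 3)


print(apply_ai_review(0.93, 0.20))
print(apply_ai_review(0.70, None))
```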

sinks

three sinks ship in v0.1:

stdout — rich-formatted terminal output
jsonl — append-only file at the configured path
webhook — POST to an arbitrary url, failures swallowed

every configured sink runs. add your own by implementing an emit method that takes (event, score).
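
a sketch of a custom sink. only the emit(event, score) signature comes from the text above. the Event fields, the Sink protocol, StreamJsonlSink, and emit_all are hypothetical stand-ins for whatever src/hakiri/core/types.py and the output package define.

```python
import io
import json
from dataclasses import asdict, dataclass, field
from typing import Iterable, List, Protocol, TextIO


@dataclass
class Event:
    # Assumed minimal shape, for illustration only.
    kind: str
    block: int
    notes: List[str] = field(default_factory=list)


class Sink(Protocol):
    def emit(self, event: Event, score: float) -> None: ...


class StreamJsonlSink:
    """Write one JSON object per line to any text stream."""

    def __init__(self, stream: TextIO) -> None:
        self.stream = stream

    def emit(self, event: Event, score: float) -> None:
        record = {**asdict(event), "score": score}
        self.stream.write(json.dumps(record) + "\n")


def emit_all(sinks: Iterable[Sink], event: Event, score: float) -> None:
    """Fan one scored event out to every configured sink."""
    for sink in sinks:
        sink.emit(event, score)


buffer = io.StringIO()
emit_all([StreamJsonlSink(buffer)], Event("sandwich", 21_000_000, ["SAND-01"]), 0.9)
print(buffer.getvalue(), end="")
```

a structural Protocol means a sink needs no base class. any object with a matching emit method qualifies.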

glossary

domain terms used throughout hakiri.

backrun: an arbitrage transaction placed immediately after a known price-moving transaction.
builder: the entity that constructs a block. on ethereum mainnet under proposer-builder separation, the proposer (validator) outsources block-building to one of a small set of public builders.
bundle: an ordered group of transactions submitted together. searchers pay builders to include their bundle at a specific position.
coinbase transfer: an in-execution transfer of eth to block.coinbase. searchers use this to pay the builder for inclusion priority. presence and size are the strongest single signal that a transaction is part of an mev bundle.
confidence cap: hakiri scores never exceed 0.95. a detector, not an oracle.
flashbots relay: a public mev-boost relay forwarding builder bids to validators. one of several relays hakiri pulls payload data from.
jit (just-in-time liquidity): adding liquidity to a pool, capturing fees from one specific incoming swap, and removing the liquidity in the same block.
mempool: the pool of transactions broadcast but not yet included. hakiri reads it to detect bundles before they land.
pbs (proposer-builder separation): the post-merge ethereum architecture where validators receive bids from builders and choose the highest-paying block.
relay: a service relaying bids between builders and validators. flashbots, ultrasound, and titan are the most-used public relays.
sandwich: a front-run + back-run pair surrounding a victim swap on the same pool. the searcher buys before, sells after, captures the spread the victim creates.
searcher: a bot operator submitting bundles to builders. under mev-boost, searchers do not deal with validators directly; builders and relays sit in between.
tx_index: the position of a transaction within its block. ordering matters for bundle reconstruction.
verdict: hakiri's human-readable label for a scored event: confirmed (≥0.85), likely (≥0.65), suspected (≥0.40), noise (<0.40).

roadmap

version	scope	status
v0.1	sandwich + backrun rules. uniswap v2/v3 decode. demo + cli.	shipped
v0.2	rust ingest wired via pyo3. jit + atomic arb rules. fixture replay.	now
v0.3	liquidation + oracle rules. balancer + curve. base + arbitrum support.	planned
v0.4	per-searcher leaderboard. per-builder coinbase share. weekly digest.	planned
v0.5	reth `mev` namespace integration. low-latency local node mode.	planned

contributing

short version: ship code that passes ci, write a clear pr description, no llm-generated readmes.

new searchers and builders are the easiest path in. open a PR against src/hakiri/enrich/builder.py or src/hakiri/enrich/searcher.py with on-chain evidence in the description.

full guidelines: CONTRIBUTING.md.