Mark As Completed Discussion

Exchange Connectivity and Protocols

Understanding how your system connects to exchanges is foundational for any HFT engineer — like knowing the court, the ref, and the scoreboard before you run plays. This screen gives a practical intro to the most common protocols (FIX, OUCH, ITCH, and binary multicast), how messages are parsed/serialized, and which tools/libraries help you test connectivity.


Why this matters (microsecond mindset)

  • Exchanges speak different "languages": some send text-based order messages (FIX), others push high-rate market data as binary multicast (ITCH).
  • Parsing/serializing correctly and recovering from gaps (sequence numbers) is critical: a missed packet = a missed trade.
  • For beginners in C++, Python, Java, C, JS: start by understanding message shape and invariants (lengths, checksums, sequence fields) before optimizing for latency.

Quick protocol cheat-sheet

  • FIX (Financial Information eXchange)

    • Text protocol: tag=value pairs separated by ASCII SOH (0x01).
    • Common for orders/trades over TCP. Libraries: QuickFIX (C++), QuickFIX/J (Java), quickfix Python wrappers.
    • Analogy: a referee announcing plays over a PA system — human-readable, reliable, and standardized.
  • OUCH

    • Exchange-specific binary order protocol (example: NASDAQ OUCH for order entry).
    • Binary, fixed-length fields, compact and fast — think of a coach's shorthand playbook.
  • ITCH / binary multicast

    • High-throughput, low-latency multicast for market data updates (adds, trades, deletes).
    • Messages are compact binary records; you often map a memory buffer and parse in-place.
    • Analogy: a fast live video feed — many frames per second, you must keep up or fall behind.
  • binary multicast general notes

    • No retransmit: if you miss a multicast packet, you must detect gaps (sequence numbers) and request a resend from a TCP replay or use snapshots.
    • NIC features (hardware timestamping, RSS queues) and kernel-bypass (DPDK, PF_RING) become relevant here.

Parsing & serialization: practical rules of thumb

  • Always validate lengths and sequence fields before trusting payload.
  • For FIX:
    • Split on SOH (0x01). The 9= (BodyLength) and 10= (Checksum) fields help detect corruption.
    • Implement a small, robust parser first (proof-of-concept in C++ or Python) before pushing to low-latency optimizations.
  • For binary protocols (OUCH/ITCH):
    • Define exact struct layout, prefer reading fixed fields (no string ops in the hot path).
    • Use big-endian vs little-endian correctly (spec doc will say).
  • Sequence recovery: maintain last seen sequence ID, detect gaps, and trigger snapshot/recovery logic.

Tools & libraries to test connectivity

  • FIX: QuickFIX (C++/Python/Java), test with tcpdump, Wireshark (FIX dissector), and simple test clients.
  • Binary multicast: use tcpreplay, pktgen, nping or vendor simulators to replay captures; Wireshark with ITCH dissector to inspect.
  • Generic: tcpdump, tshark, pcap, netcat/socat for simple TCP tests; iperf/netperf for bandwidth; strace/perf for profiling.

ASCII diagram (data-flow simplified)

Exchange multicast (ITCH) --> Fiber --> NIC --\ | \ v > Your market-data handler (in C++/DPDK/PF_RING) Exchange TCP (FIX/OUCH) --> Fiber --> NIC ----/ (parsing, seq. recovery, order gateway)


Hands-on: C++ playground (parse a FIX string and a simulated binary multicast)

Below is a small, beginner-friendly C++ program that:

  • Parses a simple FIX message (splits tag=value by SOH).
  • Builds and parses a small simulated binary multicast packet (an ITCH-like layout).

Notes for you coming from Python, Java, C, or JS:

  • This is intentionally simple: it shows the core idea of tokenizing (like JavaScript's split) and byte decoding (like reading an ArrayBuffer in JS).
  • Modify the sample messages (change symbol from the playful KB24 — a Kobe Bryant nod — to your favorite symbol) and recompile to see how parsing changes.

Challenge for you:

  • Add checksum verification for the FIX message (10= field) in the C++ code.
  • Simulate a missing sequence number in the binary packet and print a warning for gap detection.
  • If you prefer Python: reimplement parse_fix in Python quickly to feel the contrast in allocation/cost.

Now compile and run the C++ code in the code pane. After running, try the challenges above.

CPP
OUTPUT
:001 > Cmd/Ctrl-Enter to run, Cmd/Ctrl-/ to comment