How I Built an Automated Jigsaw Puzzle Solver for Tanggle.io (With Claude's Help)

Adrian Chrysanthou · 8 min read
Screenshot of a puzzle being solved by the Python bot

Tanggle.io is a multiplayer online jigsaw puzzle platform where players collaborate to assemble puzzles in real time. I wanted to build a bot that could solve any puzzle automatically: hundreds of pieces, any image, any size. Below, I'll walk through how we went from "let's use OpenCV" to "let's just talk directly to the game server."


Phase 1: The Computer Vision Approach (Spoiler: It Failed)

The Plan


My initial idea was straightforward:

  1. Open the puzzle in a browser using Playwright
  2. Screenshot the canvas
  3. Use OpenCV to detect individual puzzle pieces
  4. Match each piece to the reference image using color/feature matching
  5. Drag pieces to their correct positions with mouse automation


Claude helped me scaffold the entire project as a Python package with Playwright for browser automation, OpenCV for image analysis, and a CLI to tie it all together.

Within minutes, we had a working browser that could navigate to tanggle.io, log in, and take screenshots.


Problem 1: Cloudflare Blocks Automated Browsers


The first roadblock: tanggle.io uses Cloudflare Turnstile protection. With Playwright's bundled Chromium, the Turnstile checkbox failed every time, even when clicked manually.


The fix: Instead of Playwright's Chromium, we launched the user's actual installed Chrome (channel="chrome") with a persistent profile directory. Cloudflare can't distinguish this from a normal browsing session. We also made the login semi-manual (unfortunately): the solver pre-fills your credentials and waits for you to complete the Cloudflare challenge. After the first login, the session persists across runs.
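A minimal sketch of that launch using Playwright's sync API (the profile path, function name, and lazy import are my own choices, not the project's actual code):

```python
# Sketch: launch the user's installed Chrome with a persistent profile,
# so Cloudflare sees an ordinary browsing session.
LAUNCH_KWARGS = {
    "channel": "chrome",  # the real installed Chrome, not bundled Chromium
    "headless": False,    # headless sessions tend to get flagged
}

def launch_real_chrome(user_data_dir="./chrome-profile"):
    # Imported lazily so this sketch can be read without Playwright installed
    from playwright.sync_api import sync_playwright
    p = sync_playwright().start()
    # A persistent context reuses cookies and storage across runs,
    # which is what lets the Cloudflare session survive restarts
    return p.chromium.launch_persistent_context(user_data_dir, **LAUNCH_KWARGS)
```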


Problem 2: PixiJS Renders Everything on Canvas


Tanggle.io uses PixiJS, a WebGL rendering engine. The puzzle isn't made of DOM elements you can inspect. It’s all pixels on a <canvas>. We couldn't access the PixiJS application object either, because it's bundled inside Next.js closures and not exposed on window.


What we tried: Searching window for objects with .stage and .renderer properties, checking __PIXI_APP__, canvas.__pixiApplication. None worked; the app was hidden within a module closure.


Problem 3: Mouse Drag Pans the Viewport


This was the killer. Tanggle.io's canvas supports pan and zoom. When you click-and-drag on empty space, it pans the entire viewport. Our solver would:

  1. Correctly match a few high-confidence pieces
  2. Try to drag a piece that was slightly misdetected
  3. Miss the piece and grab empty space instead
  4. Pan the entire puzzle off-screen
  5. Invalidate every subsequent coordinate
  6. Lose sight of the puzzle entirely

We tried click-click (pocket mode) instead of dragging, but Tanggle.io requires press-and-hold to grab pieces. We tried resetting the viewport periodically, but by then the damage was done. Every missed drag cascaded into total failure.

Problem 4: Frame Detection Breaks on Every Puzzle

Detecting the puzzle frame (the rectangle where pieces snap into place) required different approaches for:

  • Fresh puzzles: empty dark frame surrounded by colorful pieces
  • Partially solved: bright frame with image content
  • Different zoom levels: the frame could be any size on screen (ughhhh)

At this point, the CV approach was fundamentally limited. We needed a different strategy entirely.

Phase 2: The WebSocket Breakthrough

The Key Insight

Tanggle.io is a multiplayer game. The server knows everything: every piece's position, every move, every snap. All this data flows through a WebSocket connection. If we could intercept and decode that traffic, we wouldn't need computer vision at all.


We already had a WebSocket hook in the browser (from an earlier attempt to capture game state).

Hooking the Existing WebSocket

Claude suggested patching WebSocket.prototype.send instead of the constructor. This is a prototype-level hook; it intercepts every send() call on any WebSocket instance, even ones created before our code ran. The moment the game sends its next message (cursor move, heartbeat, anything), our hook activates and starts capturing both directions.

const origSend = WebSocket.prototype.send;
WebSocket.prototype.send = function(data) {
    // Auto-hook this WebSocket instance on first send
    if (!this.__hooked) {
        this.__hooked = true;
        this.addEventListener('message', captureIncoming);
    }
    captureOutgoing(data);
    return origSend.call(this, data);
};


Decoding the Protocol


We built a capture command that records all WebSocket traffic while you manually interact with the puzzle. The first capture gave us 413 messages. The raw bytes started with values such as 147, 146, and 153. I mentioned this to Claude, and it immediately recognized MessagePack as a binary serialization format.
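Those leading bytes are the tell: in MessagePack, bytes 0x90 through 0x9f are "fixarray" headers, and the low four bits encode the array length. A standalone check, no msgpack library needed:

```python
def fixarray_len(header):
    """Length encoded in a MessagePack fixarray header (0x90-0x9f)."""
    assert 0x90 <= header <= 0x9F, "not a fixarray header"
    return header & 0x0F  # low 4 bits hold the element count

# The first bytes from our capture
for b in (147, 146, 153):
    print(f"{b} = 0x{b:02x} -> array of {fixarray_len(b)} elements")
```

153 = 0x99 is a nine-element array, which lines up with the nine-field piece rows in the decoded game state.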

One pip install msgpack later, we could decode every message:

# The initial game state — every piece's position!
[1, {
    'uuid': '97e6a2e8-...',
    'meta': [29, 35],           # 29 columns, 35 rows
    'border': [-2853, -2853, 4303, 4603],
    'pieces': [
        [0, -625, 20, False, 0, None, None, None, 0],   # piece 0 at (-625, 20), unplaced
        [1, 1715, 560, False, 0, None, None, None, 0],   # piece 1 at (1715, 560), unplaced
        ...
    ]
}]


We had the complete game state. Every piece ID, every position, the grid dimensions, the board boundaries. No screenshots needed.
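A sketch of reading that state into something usable (the function name is mine, and the trailing per-piece fields are ignored; only the `meta` and `pieces` keys come from the captures above):

```python
def parse_state(msg):
    """Parse the decoded initial-state frame [1, {...}]."""
    _, state = msg
    cols, rows = state["meta"]  # e.g. [29, 35] -> 29 columns, 35 rows
    # Each piece row starts [id, x, y, placed?, ...]; keep id -> (x, y)
    positions = {p[0]: (p[1], p[2]) for p in state["pieces"]}
    return cols, rows, positions
```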

Reverse-Engineering the Move Protocol


By capturing traffic while manually moving pieces, we decoded the full interaction cycle:

| Message | Meaning |
|---------|---------|
| [1, 1] | Mouse down |
| [2, piece_id, 0, 20] | Pick up piece |
| [0, x, y] | Move cursor |
| [4, x, y, neighbor_id, None] | Drop piece near neighbor |
| [1, 0] | Mouse up |

The critical discovery: the fourth element of the drop message specifies which piece you're connecting to. When we initially sent [4, x, y, None, None], the pieces moved but never snapped together. The server needs to know which neighbor you're targeting to trigger the snap logic.

We also discovered that when a piece successfully snaps, the server responds with a negative group ID (like -5 or -6); when it doesn't snap, it responds with None.
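In code, checking the drop response is then a one-liner (the helper name is mine; the field meaning comes from the captured traffic):

```python
def snap_succeeded(group_id):
    # The server echoes a negative group ID (e.g. -5) on a successful
    # snap, and None when the piece didn't connect to its neighbor.
    return group_id is not None and group_id < 0
```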

Building the Protocol Solver

The WebSocket solver is elegant in its simplicity:

  1. Read game state from the initial WebSocket message
  2. Compute target positions for each piece based on grid dimensions
  3. BFS traversal from piece 0 outward, each piece references its already-placed neighbor
  4. Send protocol commands directly via WebSocket. No mouse simulation at all.

async def _place_piece(self, piece, neighbor_id):
    await self._send_ws([1, 1])                      # mouse down
    await self._send_ws([2, piece.piece_id, 0, 20])  # pick up
    await self._send_ws([0, piece.target_x, piece.target_y])  # move cursor
    await self._send_ws([4, piece.target_x, piece.target_y, neighbor_id, None])  # drop + snap
    await self._send_ws([1, 0])                      # mouse up
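The BFS placement order from step 3 can be sketched like this (row-major piece IDs are an assumption based on the captured state; the function name is mine):

```python
from collections import deque

def bfs_order(cols, rows, start=0):
    """Visit pieces in BFS order from `start`, recording which
    already-placed neighbor each piece should snap to."""
    order = []               # list of (piece_id, neighbor_id)
    seen = {start}
    queue = deque([(start, None)])  # the first piece has no neighbor yet
    while queue:
        pid, neighbor = queue.popleft()
        order.append((pid, neighbor))
        r, c = divmod(pid, cols)     # row-major grid layout
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                npid = nr * cols + nc
                if npid not in seen:
                    seen.add(npid)
                    queue.append((npid, pid))  # snap to the piece we came from
    return order
```

Because the queue only ever enqueues a piece with the piece it was reached from, every neighbor_id refers to a piece placed earlier in the order, which is exactly what the drop message needs.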

Phase 3: Fine-Tuning

  • Rate Limiting
    • Our first successful run placed 37 pieces before the server disconnected us. We were sending messages too fast, roughly 33 per second. Increasing the delay between pieces from 0.15s to 0.5s solved this.
  • Cell Size Calibration
    • The default cell size (52 game units) was determined by analyzing positions of adjacent snapped pieces in captured traffic. This is configurable via --cell-size for puzzles where it differs.
  • Persistent Sessions
    • Using Chrome's persistent profile means you only need to complete the Cloudflare challenge once. The session survives across solver runs, making it practical for solving many puzzles in sequence.
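The cell-size calibration can be sketched as taking the typical coordinate delta between adjacent snapped pieces (the data shape and function name are my own; piece IDs are assumed row-major):

```python
from statistics import median

def estimate_cell_size(placed, cols):
    """placed: {piece_id: (x, y)} for snapped pieces on a grid `cols` wide."""
    deltas = []
    for pid, (x, y) in placed.items():
        right = placed.get(pid + 1)
        if right is not None and (pid + 1) % cols != 0:  # same row only
            deltas.append(abs(right[0] - x))
        below = placed.get(pid + cols)
        if below is not None:
            deltas.append(abs(below[1] - y))
    # Median is robust to the odd mis-captured position
    return median(deltas)
```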

What Claude Brought to the Table


Throughout this project, Claude served as a real-time pair programmer. Here's what stood out:

  • Pattern recognition in binary data. When I shared raw WebSocket bytes starting with [147, 146, 153...], Claude immediately identified MessagePack. This saved hours of guessing at the serialization format.
  • Protocol analysis. Claude traced through captured messages to identify the pickup/move/drop cycle, the snap mechanism (neighbor_id in the drop message), and the server response format (negative group IDs = successful snap).
  • Architecture decisions. The BFS placement order (so each piece can reference an already-placed neighbor), the prototype-level WebSocket hook (to catch existing connections), and the persistent Chrome profile (for Cloudflare).

The Final Architecture

tanggle_solver
├── main.py # CLI: solve, capture, logout
├── config.py # .env credential loading
├── browser.py # Playwright + WebSocket hooking
├── ws_solver.py # Protocol-based solver
└── protocol.py # Traffic analyzer


Dependencies: playwright (browser automation) and msgpack (protocol decoding).


How it works:

  1. Launches your real Chrome (bypasses Cloudflare)
  2. Patches WebSocket.prototype.send to intercept game traffic
  3. Decodes MessagePack game state (piece positions, grid dimensions)
  4. Computes target coordinates for each piece
  5. Sends pickup/move/drop commands via WebSocket in BFS order
  6. Server snaps pieces into place

Lessons Learned

  • Start with the protocol, not the pixels.
    • For any web game, the server communication is a more reliable data source than screenshots. Binary protocols look intimidating, but are often just MessagePack or Protobuf, one library import away from readable data. :)
  • Persistent browser profiles are essential for sites with bot detection.
    • Using the user's actual Chrome installation with a persistent profile is nearly undetectable.
  • Rate limiting exists even for WebSockets.
    • Games will kick you for sending messages too fast. A 0.5s delay between moves is the sweet spot: fast enough to solve a 1000-piece puzzle in ~10 minutes, slow enough not to trigger anti-abuse.
  • Prototype patching beats constructor wrapping.
    • If you need to intercept an already-established WebSocket, patching WebSocket.prototype.send catches everything: past, present, and future connections.
  • AI pair programming shines at protocol reverse-engineering.
    • The back-and-forth of "here's the raw data" → "that's MessagePack" → "here's what the fields mean" → "here's the code" was incredibly productive. What would have taken days of solo debugging took hours with Claude.

Tanggle bot solving a puzzle


Try It Yourself


The project is open source: https://github.com/f00d4tehg0dz/tanggle-bot-solver

pip install -r requirements.txt 
playwright install chromium
python -m tanggle_solver.main solve <puzzle-uuid>

Fair warning: use responsibly. This was built as a learning exercise in protocol reverse-engineering and browser automation, not to ruin anyone's puzzle experience.


Built with Claude Code (Anthropic's CLI for Claude) as my pair programmer throughout the entire development process.