Let's get straight to the point: Claude is getting way too stingy with tokens and its usage limits. You run a few commands and your context window is gone because Claude wants to write a whole essay before giving you the code.
The fix is pretty easy though. Just use this stack: Caveman, RTK, and FFF. It cuts the token drain and keeps things moving. Here is the setup:
- Caveman: LLMs are naturally super polite and love to over-explain. This just forces the model to drop the yapping and give you the raw code.
- RTK (Rust Token Killer): handles the terminal noise. Dumping giant error logs into the prompt eats tokens fast. RTK filters out the boilerplate output before the LLM even reads it.
- FFF (Fast Fuzzy Finder): stops context waste. Instead of the agent running clunky greps and pulling irrelevant file text into context, FFF helps it find the right files instantly. (To be fair, agents are usually smart enough to find what they want with plain grep or fzf.)

Stop wasting tokens on boilerplate. Run these three together and your sessions will actually last. Less reading, more shipping.
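
To give a feel for why filtering terminal output saves so many tokens, here is a minimal Python sketch of the general idea. This is not RTK's actual code or CLI. The function name `compact_output`, the two regexes, and the `tail` parameter are all made up for illustration, and a real tool will be a lot smarter about what counts as noise.

```python
import re

# Hypothetical patterns, chosen for illustration only.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
IMPORTANT = re.compile(r"error|warning|failed|traceback", re.IGNORECASE)


def compact_output(text: str, tail: int = 5) -> str:
    """Shrink raw command output before pasting it into a prompt."""
    lines = []
    for raw in text.splitlines():
        line = ANSI_ESCAPE.sub("", raw).rstrip()  # strip color codes
        if not line:
            continue  # drop blank lines
        if lines and lines[-1] == line:
            continue  # collapse consecutive duplicate lines
        lines.append(line)

    # Keep lines that look important, plus the last few lines for context.
    cutoff = max(len(lines) - tail, 0)
    kept = [
        line
        for i, line in enumerate(lines)
        if i >= cutoff or IMPORTANT.search(line)
    ]

    dropped = len(lines) - len(kept)
    if dropped:
        kept.append(f"[{dropped} lines omitted]")
    return "\n".join(kept)
```

You would run the build or test command yourself, pass its output through something like this, and hand only the result to the model. The error lines and the tail survive, while progress spam and repeated lines do not.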
That's really it. It cut my usage massively, and I actually feel like I'm getting my money's worth out of the Pro sub now instead of getting throttled halfway through.
Update: I now use Gemini Pro for architecting a system or a feature, and Claude for implementing everything.