Good data points on the importance of "context engineering":
Input tokens may be cheaper than output tokens, but context-heavy tasks (like coding) can require 300-400x more input tokens than output tokens, making context about 98% of total LLM usage costs.
Latency also grows w/ larger context size.
Underscores the importance of providing the right context at the right time when building AI applications, and, I assume, leaves a lot of room for competitive differentiation in AI-native SaaS apps.

Jul 9, 2025
When you query AI, it gathers relevant information to answer you.
But, how much information does the model need?
Conversations with practitioners revealed their intuition: the input was ~20x larger than the output.
But my experiments with the Gemini command line interface tool, which outputs detailed token statistics, revealed it's much higher.
300x on average & up to 4000x.
Here’s why this high input-to-output ratio matters for anyone building with AI:
Cost Management is All About the Input. With API calls priced per token, a 300:1 ratio means costs are dictated by the context, not the answer. This pricing dynamic holds true across all major models.
On OpenAI’s pricing page, output tokens for GPT-4.1 are 4x as expensive as input tokens. But when the input is 300x more voluminous, the input costs are still 98% of the total bill.
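The 98% figure follows from simple arithmetic, sketched below. The function name and the normalized prices (input = 1.0, output = 4.0) are illustrative assumptions; only the 4x price ratio and the 20:1, 300:1, and 4000:1 token ratios come from the post.

```python
def input_cost_share(input_tokens: float, output_tokens: float,
                     input_price: float, output_price: float) -> float:
    """Return the fraction of total cost attributable to input tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    return input_cost / (input_cost + output_cost)


# Prices are normalized (assumption): input = 1.0, output = 4.0 per token.
for ratio in (20, 300, 4000):
    share = input_cost_share(ratio, 1, input_price=1.0, output_price=4.0)
    print(f"{ratio}:1 input-to-output -> input is {share:.1%} of cost")
# 20:1 input-to-output -> input is 83.3% of cost
# 300:1 input-to-output -> input is 98.7% of cost
# 4000:1 input-to-output -> input is 99.9% of cost
```

Even at the 20:1 ratio practitioners guessed, input already dominates the bill; at 300:1 the output price is close to a rounding error.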
Latency is a Function of Context Size. An important factor determining how long a user waits for an answer is the time it takes the model to process the input.
It Redefines the Engineering Challenge. This observation suggests that the core challenge of building with LLMs isn’t just prompting. It’s context engineering.
The critical task is building efficient data retrieval & context-crafting pipelines that can find the best information and distill it into the smallest possible token footprint.
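One way to picture "smallest possible token footprint" is a greedy packer that keeps the highest-scoring retrieved chunks that fit a token budget. This is only a sketch under stated assumptions: the relevance scores are supplied by some retrieval step not shown here, `approx_tokens` is a crude whitespace proxy rather than a real tokenizer, and both function names are hypothetical.

```python
from typing import List, Tuple


def approx_tokens(text: str) -> int:
    # Assumption: whitespace word count as a rough stand-in for a tokenizer.
    return len(text.split())


def pack_context(chunks: List[Tuple[float, str]], budget: int) -> List[str]:
    """Greedily select the highest-scored chunks that fit within the budget."""
    selected: List[str] = []
    used = 0
    # Visit chunks from most to least relevant.
    for _score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = approx_tokens(text)
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected


chunks = [
    (0.9, "def parse(path): ..."),                      # 3 approx tokens
    (0.5, "long changelog entry nobody asked about"),   # 6 approx tokens
    (0.8, "parse() raises ValueError"),                 # 3 approx tokens
]
print(pack_context(chunks, budget=6))
# ['def parse(path): ...', 'parse() raises ValueError']
```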
Caching Becomes Mission-Critical. If 99% of tokens are in the input, building a robust caching layer for frequently retrieved documents or common query contexts moves from a “nice-to-have” to a core architectural requirement for building a cost-effective & scalable product.
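A minimal in-memory sketch of such a caching layer for common query contexts follows. `ContextCache` and the `build` callback are hypothetical names, the normalization (trim plus lowercase) is an assumption, and a production version would need eviction and invalidation. On its own, this saves repeated retrieval work rather than tokens billed by a model provider.

```python
import hashlib
from typing import Callable, Dict


class ContextCache:
    """Cache assembled context per normalized query (in-memory sketch)."""

    def __init__(self, build: Callable[[str], str]):
        self._build = build              # expensive retrieval/assembly step
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, query: str) -> str:
        # Assumption: queries differing only in case/whitespace share context.
        normalized = query.strip().lower()
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._build(query)
        self._store[key] = value
        return value


calls = []

def build_context(query: str) -> str:
    calls.append(query)                  # record each expensive build
    return f"context for: {query}"

cache = ContextCache(build_context)
cache.get("Pricing")
cache.get("  pricing ")                  # served from cache
print(cache.hits, cache.misses, len(calls))  # 1 1 1
```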
For developers, this means focusing on input optimization is a critical lever for controlling costs, reducing latency, and ultimately, building a successful AI-powered product.



