An excellent data point on the importance of "context engineering":
Input tokens may be cheaper than output tokens, but context-heavy tasks (such as coding) can require 300 to 400 times as many input tokens of context as output tokens, so context can amount to 98% of total LLM usage cost.
Larger context sizes also mean longer wait times.
This underscores how important it is to provide the right context at the right time when building AI applications, and I think there is plenty of room for competitive differentiation among AI-native SaaS apps.

July 9, 2025
When you query AI, it gathers relevant information to answer you.
But how much information does the model need?
Conversations with practitioners revealed their intuition: the input was roughly 20x larger than the output.
But my experiments with the Gemini command-line interface, which outputs detailed token statistics, revealed the ratio is much higher: 300x on average, and up to 4000x.
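The ratio itself is simple to compute from per-call token counts. A minimal sketch, using made-up numbers standing in for the CLI's token statistics (the values here are illustrative, not measured data):

```python
# Hypothetical per-call token counts, as a CLI's token-statistics
# output might report them (illustrative values only).
calls = [
    {"input_tokens": 150_000, "output_tokens": 500},
    {"input_tokens": 90_000, "output_tokens": 300},
    {"input_tokens": 400_000, "output_tokens": 100},
]

# Input-to-output ratio for each call, then the average and the maximum.
ratios = [c["input_tokens"] / c["output_tokens"] for c in calls]
print(f"average input:output ratio = {sum(ratios) / len(ratios):.0f}x")
print(f"max input:output ratio     = {max(ratios):.0f}x")
```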
Here’s why this high input-to-output ratio matters for anyone building with AI:
Cost Management is All About the Input. With API calls priced per token, a 300:1 ratio means costs are dictated by the context, not the answer. This pricing dynamic holds true across all major models.
On OpenAI’s pricing page, output tokens for GPT-4.1 are 4x as expensive as input tokens. But when the input is 300x more voluminous, the input costs are still 98% of the total bill.
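The 98% figure follows directly from the arithmetic. A back-of-envelope check, assuming a 300:1 input-to-output token ratio and output tokens priced at 4x input tokens, as described above:

```python
# Relative volumes and prices from the 300:1 ratio and 4x price multiple.
input_tokens = 300          # relative input volume
output_tokens = 1           # relative output volume
input_price = 1.0           # relative price per input token
output_price = 4.0          # output tokens cost 4x as much

input_cost = input_tokens * input_price      # 300 cost units
output_cost = output_tokens * output_price   # 4 cost units
share = input_cost / (input_cost + output_cost)
print(f"input share of total cost: {share:.1%}")
```

At 300/304, the input share comes out just under 99%, consistent with the ~98% claim.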
Latency is a Function of Context Size. An important factor determining how long a user waits for an answer is the time it takes the model to process the input.
It Redefines the Engineering Challenge. This observation proves that the core challenge of building with LLMs isn’t just prompting. It’s context engineering.
The critical task is building efficient data-retrieval and context-crafting pipelines that find the best information and distill it into the smallest possible token footprint.
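One simple way to sketch that distillation step is a greedy selection: keep the highest-scoring retrieved chunks that fit inside a token budget. The field names (`score`, `tokens`) and the scoring itself are hypothetical, not from any specific library:

```python
# Greedy context distillation sketch: pack the highest-scoring chunks
# into a fixed token budget, dropping whatever does not fit.
def distill(chunks: list[dict], budget: int) -> list[dict]:
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if used + chunk["tokens"] <= budget:
            selected.append(chunk)
            used += chunk["tokens"]
    return selected

# Illustrative retrieved chunks with relevance scores and token counts.
chunks = [
    {"id": "a", "score": 0.92, "tokens": 800},
    {"id": "b", "score": 0.85, "tokens": 600},
    {"id": "c", "score": 0.40, "tokens": 1200},
]
kept = distill(chunks, budget=1500)
print([c["id"] for c in kept])
```

Real pipelines would score chunks with embeddings or a reranker, but the budget-packing logic is the same.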
Caching Becomes Mission-Critical. If 99% of tokens are in the input, building a robust caching layer for frequently retrieved documents or common query contexts moves from a “nice-to-have” to a core architectural requirement for building a cost-effective & scalable product.
For developers, this means focusing on input optimization is a critical lever for controlling costs, reducing latency, and ultimately, building a successful AI-powered product.



