by attogram ·
The story
https://zenodo.org/records/21000879
https://zenodo.org/records/21000879/files/the_copyright_pupp...
More later...
by chambertime ·
I'm the CTO at System1. We own Dogpile.com. It has a diehard userbase, but it has been coasting for some time now.
We recently rebuilt the backend as a metered API with a corresponding MCP server for agents.
The original Dogpile thesis was that no single index sees the whole web. There's an old Penn State / Pitt study that found ~85% of top results were unique to one engine. Full disclosure: Dogpile sponsored it, and of course that was a totally different web. But the underlying question feels relevant again. A cursory internal analysis shows that Brave, Google, and Bing all surface different results for the same query.
For humans, missed results are annoying.
For agents doing RAG or grounding generated content, missed or low-quality results can become a failure mode with no human in the loop to catch it.
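To make the overlap question concrete, here is a minimal sketch of how one might measure "unique to one engine" across result sets. The engine names and URLs are made-up placeholders, not data from the study or from our internal analysis, and the metric (share of distinct URLs returned by exactly one engine) is only one plausible reading of the original figure.

```python
from collections import Counter


def unique_fraction(results_by_engine: dict[str, list[str]]) -> float:
    """Fraction of distinct URLs that appear in exactly one engine's results."""
    counts: Counter[str] = Counter()
    for urls in results_by_engine.values():
        # Use a set so a URL repeated within one engine is counted once.
        counts.update(set(urls))
    if not counts:
        return 0.0
    unique = sum(1 for n in counts.values() if n == 1)
    return unique / len(counts)


# Hypothetical placeholder data, for illustration only.
sample = {
    "engine_a": ["https://a.example/1", "https://shared.example/x", "https://a.example/2"],
    "engine_b": ["https://shared.example/x", "https://b.example/1"],
    "engine_c": ["https://c.example/1", "https://shared.example/x"],
}

print(unique_fraction(sample))  # 4 of the 5 distinct URLs are single-engine
```

A real comparison would also need URL normalization (tracking parameters, trailing slashes, http vs https), which this sketch skips.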
Current state:
- Search API + MCP server (npx @system1/dogpile-mcp)
- Structured JSON output
- Three response types: Basic is just what you’d get on a SERP. Enhanced scrapes the result pages and includes the text as markdown. Deep performs deep research, running multiple queries and providing a summary.
- $0.50 monthly credit, no credit card required
- Scaling starts at $2/1k calls
Known limitations:
- It's single-source today so it is not true metasearch yet
- The roadmap is multi-source aggregation, but we wanted to get this into people’s hands as we continue to build out the feature set
- The API contract is designed so adding sources later will not require client changes
- Basic calls are landing around 800ms, but we don't have serious production concurrency numbers or p99s under heavy load yet
Here’s a question for this group:
What features or data shapes are you missing from existing agentic search tools when building agent pipelines?
When you have multiple MCP servers, every request to the LLM will include all of their tools and descriptions, which can quickly eat up your token limit and increase costs. The thing is, most of the time, you don't need all of them.
For example, let’s take three popular MCP servers: Notion, GitHub, and Pylance. The overhead they create on every turn is about 26K tokens. If we assume an average 50-turn coding session and Opus pricing, the overhead for a single session is about $0.9275.
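As a rough check on that arithmetic, the sketch below multiplies out the per-session token overhead and backs out the per-token rate that the quoted cost implies. The per-million-token price is not stated above, so the rate here is derived from the two quoted figures rather than taken from any price list, and it is only as precise as "about 26K".

```python
# Figures quoted in the post; 26K is approximate.
TOKENS_PER_TURN = 26_000
TURNS = 50
SESSION_COST_USD = 0.9275


def session_overhead_tokens(tokens_per_turn: int, turns: int) -> int:
    """Total tool-description tokens resent over a whole session."""
    return tokens_per_turn * turns


def implied_price_per_million(cost_usd: float, total_tokens: int) -> float:
    """Blended USD price per million tokens implied by a cost and a token count."""
    return cost_usd / total_tokens * 1_000_000


total = session_overhead_tokens(TOKENS_PER_TURN, TURNS)
rate = implied_price_per_million(SESSION_COST_USD, total)

print(total)           # 1300000 overhead tokens per session
print(round(rate, 4))  # implied blended rate in USD per million tokens
```

Swapping in your own turn count or your provider's actual input and cached-input prices gives the overhead for your setup.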
`mcp-compress-router` does something very simple: it puts all MCP servers behind a proxy that exposes just two tools, `get_tool_schema` and `invoke_tool`. The `get_tool_schema` description lists the tool names and arguments for all downstream MCP server tools so that the agent knows what's available. Whenever the agent needs a tool, it first calls `get_tool_schema` to read that tool's full description and argument schema, and then calls `invoke_tool`, which proxies the call to the downstream MCP server.
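The two-tool pattern can be sketched in a few lines. This is an in-memory illustration of the idea, not the actual `mcp-compress-router` code: the `CompressRouter` class, its `register` and `catalog` methods, and the `github.search_issues` tool are all hypothetical, and real downstream calls would go over MCP rather than to a local Python function.

```python
from typing import Any, Callable


class CompressRouter:
    """Toy proxy exposing many downstream tools through two entry points."""

    def __init__(self) -> None:
        self._tools: dict[str, dict[str, Any]] = {}

    def register(self, name: str, description: str,
                 schema: dict[str, Any], handler: Callable[..., Any]) -> None:
        # Hypothetical helper: record one downstream tool.
        self._tools[name] = {
            "description": description,
            "schema": schema,
            "handler": handler,
        }

    def catalog(self) -> str:
        """Compact listing (tool names plus argument names) sent on every turn."""
        lines = []
        for name, tool in self._tools.items():
            args = ", ".join(tool["schema"].get("properties", {}))
            lines.append(f"{name}({args})")
        return "\n".join(lines)

    def get_tool_schema(self, name: str) -> dict[str, Any]:
        """Full description and argument schema, fetched only when needed."""
        tool = self._tools[name]
        return {
            "name": name,
            "description": tool["description"],
            "inputSchema": tool["schema"],
        }

    def invoke_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        """Forward the call to the downstream tool."""
        return self._tools[name]["handler"](**arguments)


router = CompressRouter()
router.register(
    "github.search_issues",  # hypothetical downstream tool
    "Search issues in a repository by free-text query.",
    {
        "type": "object",
        "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["query"],
    },
    lambda query, limit=10: {"query": query, "limit": limit, "items": []},
)

print(router.catalog())
print(router.get_tool_schema("github.search_issues")["description"])
print(router.invoke_tool("github.search_issues", {"query": "bug", "limit": 5}))
```

The trade-off is one extra round trip per newly used tool in exchange for a much smaller fixed prompt on every turn.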
The savings are pretty serious. In the three-server example, the roughly 26K tokens of overhead compress to about 900 tokens with the "max" compression level (just tool names), or to about 2000 tokens with the "high" compression level (the default: tool names plus argument names). So you'll be saving 90%+ this way.
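For anyone who wants the percentages spelled out, this small calculation uses only the approximate token counts quoted above (26K uncompressed, 900 and 2000 compressed). It ignores the extra `get_tool_schema` calls the agent makes when it actually uses a tool, so treat the results as an upper bound on per-turn savings.

```python
BASELINE_TOKENS = 26_000                          # three servers, uncompressed
COMPRESSED_TOKENS = {"max": 900, "high": 2_000}   # per compression level


def savings(baseline: int, compressed: int) -> float:
    """Share of per-turn overhead removed by compression."""
    return 1 - compressed / baseline


for level, tokens in COMPRESSED_TOKENS.items():
    print(f"{level}: {savings(BASELINE_TOKENS, tokens):.1%}")
# max: 96.5%
# high: 92.3%
```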