Product Hunt Daily Top Products 2026-05-19


#1
PollyReach
Give your agent a real number and voice to make calls.
464
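One-line summary: PollyReach gives individual users an AI agent with a real phone number and voice, taking over tasks that still require a phone call, such as making reservations and answering inquiries, in 50+ languages.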
Productivity Artificial Intelligence Virtual Assistants
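AI phone assistant, voice agent, phone reservations, smart calling, spam call filtering, multilingual, voice interaction, personal assistant, call automation, real phone number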
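User comment summary: Users want to know whether the AI can handle interruptions, hold times, and surprises in real calls, and they ask to see failure cases rather than only successful demos. The most requested features are routing call transcripts to a webhook, live call transfer, and contextual memory across calls with the same business. The team promises that the AI identifies itself on outbound calls and reports call outcomes strictly (confirmed, no answer, and so on) without overstating them.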
AI Sharp Take

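If Silicon Valley's AI voice companies are busy selling enterprises a grand CRM vision, PollyReach is doing something more mundane but more necessary: calling the restaurant to book your table. Its entry point is almost cunningly precise. AI can already converse fluently across languages, yet most products of this kind remain stuck behind the B2B wall of "call center as a service," while what ordinary people actually feel is "phone numbers make me anxious; I'd rather text than talk."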

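In terms of product shape, PollyReach smartly avoids the "fully automated nanny" trap: it makes no claim to replace human salespeople, only to book things for you and filter your spam calls. In the comments the founder admits that Japanese and English are the main languages and does not dodge "imperfect but real" edge cases such as Japanese restaurants that never pick up or conversations that drift off script, which is far more credible than slide decks claiming "AI understands everything perfectly." The problems are just as clear, though: multilingual quality being "ongoing" means horizontal expansion is not yet mature, and requiring authorization before each call rather than offering live transfer shows that its decision boundary in complex scenarios is still narrow.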

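Most noteworthy is the response to abuse scenarios: every call self-identifies as AI, and multiple layers of defense guard against fraud. Under the strict telephone compliance regimes of Europe and the US this is a lifeline, not optional decoration. For anyone building a personal assistant, rather than burning money competing on multimodal understanding in large models, it is better to do what PollyReach does and first perfect capillary-level details such as "not hanging up while on hold." The product has real use cases, but it is too early to hype it as "disrupting phone communication," because it solves "daring to dial" without yet fully solving "getting it done." For users regularly tormented by calls to institutional customer service, property management companies, and overseas visa centers, the 200 free credits are worth a try, but don't expect it to negotiate a credit limit increase with your bank for you.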

Original information
PollyReach
Most AI phone tools are built for enterprises — APIs, workflows, sales automation. PollyReach is built for you. Give your AI a real phone number. Say "book me a table for 7pm" — it finds the number, makes the call, handles the conversation, and reports back with a summary, recording & transcript. It also answers your phone 24/7 and screens spam. Works in 50+ languages.

Hey Product Hunt! 👋

I'm Gia, maker of PollyReach.

I'm the kind of person who'd rather text than call. But some things still require a phone call — and when the restaurant is full, you have to try the next one, and the next.

In Japan, I wanted to book a small izakaya. No online reservation, just a phone number. I don't speak Japanese.

That's when it clicked: AI speaks 50+ languages. Why can't it just make the call?

So we built PollyReach — Your Agent gets its own real phone number and a real voice. It handles real conversations the way a personal assistant would — gets interrupted, responds naturally, waits on hold, navigates IVR menus, and knows when to push back and when to hang up.

It calls for you

  • Polly finds the number, dials, navigates phone menus, handles the conversation, and confirms the booking. You get a summary + recording + transcript.

It answers for you

  • Polly picks up 24/7 with a natural-sounding voice — screens spam, talks to real callers, takes messages, tags priority, and sends you a summary. You decide what's worth calling back.

It works for your business

  • One of our users manages 80+ rental properties — his AI assistant handles tenant calls, follows up with vendors, and sends him a daily report.

Get started — just send this to your agent:

clawhub install pollyreach

Everyone here gets 200 free credits + a free phone number. Try it and let me know what your first call is.

Really excited to share this with the PH community. We'll be around all day answering questions! 🙌

18

@gia_xu Cool stuff. Is the beta version live?

0

@gia_xu any plans to add webhooks for automatic transcript routing?

3

@gia_xu I can see this being really useful for stuff like calling local places that never answer online, but I also wonder how awkward it gets once the conversation goes slightly off-script. Half the pain with phone calls is weird human unpredictability, not just the dialing part.

0

What languages are actually solid right now? Curious about Japanese specifically since that's the founding story.

2

@jocky Great question — and Japanese is absolutely our strongest, most polished language (founding story checks out!), plus a handful of others that are rock-solid right now.

2

@jocky Thanks. Right now our multilingual testing is still ongoing and not fully polished yet. We’ve primarily focused on English and Japanese. Regular conversational flow works well, though we’ve noticed pronunciation clarity can be spotty for long number sequences like phone numbers and we're still working on it.

2

Do you have a demo call recording somewhere? I want to hear the natural interruptions in action.

2

@alexis_rodriguez7 Sure thing! We have real call demo recordings available. You can also hop onto our platform directly and use your free trial credits to place test calls yourself, and experience how it smoothly handles real-time interruptions and overlapping conversations firsthand.

0

@alexis_rodriguez7 Regarding the point about interruptions, I will find a better example and post it here shortly.

0

This feels like one of those products where the demo probably converts people instantly. Would love to test it on real world calls.

1

@geoffrey_reed It performs extremely well in actual daily calling scenarios, from tedious customer service waits, restaurant bookings to spam call filtering. You can jump straight into real-world call tests right away, experience its smooth dialogue, long on-hold endurance and flexible response logic firsthand, and feel how much it cuts down your trivial phone work.

0

@geoffrey_reed Appreciate that! Feel free to test it on real‑life calls and share your thoughts. We’re continuously iterating to improve the experience.

0
Quo answers phones. Does your software make calls? Will the agent identify itself as AI without prompting? Thanks. 🙏
1

@lakshminath_dondeti Yes to both — PollyReach is primarily outbound (with inbound as well), and our agent always self-identifies as an AI on every call by default, no prompt needed.

1

This is one of those products where the demo can be simple but the real test is annoying real life. Booking a table sounds easy until the place is closed, the number is stale, the person asks a follow-up question, or the bot has to negotiate between "7pm" and "we only have 7:45."

I’d love to see examples of failed or partial calls, not just successful ones. A useful agent should say “I called, they didn’t answer,” “they only had 8pm,” or “I wasn’t confident enough to confirm,” instead of pretending the task completed.

1

@zact Great observation — you’re absolutely right that real‑world edge cases are where these agents truly get tested, far beyond simple ideal‑case demos.

This is exactly the core problem we’ve built PollyReach around solving. We strictly report call outcomes as they actually happen, with full transparency: users get clear status labels (Confirmed · Voicemail · Declined · No answer), plus full call summaries, transcripts, and recordings for every call.

We also proactively flag potential edge-case scenarios for users ahead of calls (e.g., 6:30–7:30 if 7 PM is unavailable), though we’re still refining how we handle tricky real-world exceptions. Most importantly, our AI never over-claims success or pretends a task is completed when it isn’t.
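
The outcome labels described in this reply can be modeled roughly as follows. This is only a sketch: the four status names come from the reply, while `CallReport`, `task_completed`, and `headline` are hypothetical names and not PollyReach's actual API.

```python
from dataclasses import dataclass
from enum import Enum


class CallStatus(Enum):
    # The four labels quoted in the reply above.
    CONFIRMED = "Confirmed"
    VOICEMAIL = "Voicemail"
    DECLINED = "Declined"
    NO_ANSWER = "No answer"


@dataclass
class CallReport:
    # Hypothetical report shape: a status plus the summary and transcript.
    status: CallStatus
    summary: str
    transcript: str = ""


def task_completed(report: CallReport) -> bool:
    """Only an explicit confirmation counts as success."""
    return report.status is CallStatus.CONFIRMED


def headline(report: CallReport) -> str:
    """One-line result for the user, never upgraded beyond the actual status."""
    return f"[{report.status.value}] {report.summary}"
```

Keeping the status as a closed enum means a call that reached voicemail cannot be reported as anything other than voicemail, which is the "never over-claims" property the reply describes.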

1

Congratulations

1

@madalina_barbu Thanks! Means a lot on launch day 🙏

0

Congrats! great team!!!

1

@gideon_ge Thank you so much! Really appreciate your sincere support🥰

0

Aren't you afraid that some people may use it for harmful purposes, e.g. automating fraud (against banks and their users, or "behaving like relatives")? How can we protect against or avoid such usage?

1

@busmark_w_nika Genuinely good question — this is something we treat as a first-class design constraint, not a patch.

Every outbound call goes through a multi-layer guardrail before it dials. We screen for impersonation, deceptive intent, and known abuse patterns; Polly is required to identify itself as an AI on every call (so it can't pose as a person or a relative); and there's an anti-harassment layer on top.

Happy to go deeper privately if you're working on similar problems.
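
A layered pre-dial guardrail of the kind described in this reply could be sketched as a chain of independent checks. Everything here is an illustrative assumption: the request fields, the toy rules, and the cap of 3 calls per day are invented for the example and say nothing about how PollyReach actually screens calls.

```python
from typing import Callable, List, Optional

# Each check returns a rejection reason, or None if the request passes.
Check = Callable[[dict], Optional[str]]


def requires_ai_disclosure(req: dict) -> Optional[str]:
    if not req.get("discloses_ai", False):
        return "call script must identify the caller as an AI"
    return None


def blocks_impersonation(req: dict) -> Optional[str]:
    # Toy rule: reject scripts that claim to be a specific human.
    if req.get("claims_identity_of"):
        return "impersonating a person is not allowed"
    return None


def limits_repeat_calls(req: dict) -> Optional[str]:
    # Toy anti-harassment rule with an arbitrary cap.
    if req.get("calls_to_number_today", 0) >= 3:
        return "too many calls to this number today"
    return None


CHECKS: List[Check] = [requires_ai_disclosure, blocks_impersonation, limits_repeat_calls]


def screen(req: dict) -> List[str]:
    """Run every layer and collect all rejection reasons (empty list = may dial)."""
    reasons = []
    for check in CHECKS:
        reason = check(req)
        if reason is not None:
            reasons.append(reason)
    return reasons
```

Running all layers instead of stopping at the first failure gives the user a complete list of what to fix before the call can be placed.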

2

Finally, an AI that can actually wait on hold instead of hanging up after 10 seconds. Take my credits.

1

@antonio_manuel1 Hahaha totally agree! It can stay on hold patiently for ages without dropping the line at all. Go ahead and use your free credits to experience it right away!

0

@antonio_manuel1 Haha thanks! Looking forward to seeing what you put it on.

0

Does Polly learn from previous calls with the same vendor, or is each call fresh?


1

@colton_drake It can learn and retain call habits with the same vendor.

0

That's a very good tool. I tried to use it to book a restaurant for me and guess what? It succeeded! Damn good.

1

@huanghaosteven Wow that’s awesome!

0

Congrats!
Quick question: Can Polly transfer a live call to me if it realizes it's out of its depth?


1

@asher_luca Great question! That’s a really key point.

Right now, before each call, our agent pre‑plans for edge‑case scenarios and confirms authorization with you upfront. If during the call it encounters questions it cannot answer or sensitive information, we will end the call first and notify you afterward. We are also exploring a feature where Polly proactively calls you when it’s out of its depth. We welcome more discussions and feedback — it’s extremely helpful for us!
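
The "authorize upfront, end the call if out of scope" behavior described here might look like the following sketch. `BookingPlan` and `decide` are hypothetical names, and the time-window rule is an assumed example of a pre-authorized fallback, not the product's real logic.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class BookingPlan:
    preferred: time
    earliest_ok: time  # pre-authorized fallback window, agreed before the call
    latest_ok: time


def decide(plan: BookingPlan, offered: time) -> str:
    """Return 'accept' inside the authorized window, else 'end_call_and_notify'."""
    if plan.earliest_ok <= offered <= plan.latest_ok:
        return "accept"
    return "end_call_and_notify"
```

For a plan of 7 PM with a 6:30 to 7:30 window, an offer of 7:15 is accepted on the spot, while an offer of 7:45 falls outside what the user authorized, so the agent ends the call and reports back.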

0

How many simultaneous calls can one Polly agent handle? Asking for the property manager.


1

@owen_shaw2 A single Polly agent can stably handle up to 10 simultaneous calls smoothly.

It works perfectly for property management teams to deal with inquiries, follow-ups and routine service calls in bulk.
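
A caller submitting calls in bulk could respect a limit like this with a simple concurrency gate. The sketch below simulates calls with a short sleep; the figure of 10 comes from the reply, while the semaphore mechanism and the `run_calls` helper are assumptions for illustration.

```python
import asyncio

MAX_CONCURRENT_CALLS = 10  # figure quoted in the reply; the mechanism is assumed


async def run_calls(n_calls: int) -> int:
    """Run n_calls simulated calls and return the peak number in flight."""
    gate = asyncio.Semaphore(MAX_CONCURRENT_CALLS)
    in_flight = 0
    peak = 0

    async def place_call(_: int) -> None:
        nonlocal in_flight, peak
        async with gate:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a real phone call
            in_flight -= 1

    await asyncio.gather(*(place_call(i) for i in range(n_calls)))
    return peak
```

With this gate, queuing 25 tenant calls never puts more than 10 on the line at once; the rest wait until a slot frees up.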

0

Is there an API or MCP for developers? Would love to plug this into my own agent.

1

@julie_c_wang Yes, we officially provide both API and MCP access for developers. Feel free to send us a private message if you need access and detailed integration docs. You can seamlessly connect our calling capabilities into your own AI agents and custom systems with ease.

1

@julie_c_wang Absolutely! We provide a skill for AI agents. Developers can easily install it via clawhub with one command: clawhub install PollyReach. Alternatively, you can simply send this link to your agent, and it will finish the installation automatically: Read https://www.pollyreach.ai/SKILL.md and follow the instructions to install PollyReach.

1

Congrats on your launch! I have a small question: Does it integrate with anything yet (calendar, email, Slack), or is it standalone for now?

1

@crystalmei It already supports email notifications. Once the AI finishes a call, all call results, booking details, and key conversation summaries are automatically sent to your linked email inbox for easy viewing and record keeping.

0

@crystalmei Thanks and great idea! We are also considering integrating with calendars in our roadmap.

1

One scene keeps playing in my head: strolling through the streets with an AI companion, like the movie Her finally stepping into real life. She just gets it, makes the call, and the table is already waiting when we show up.

PollyReach made that dream real. So cool!

1

@genedai We truly built PollyReach to turn this soft futuristic daily dream into tangible reality. No hassle, no awkward calls, just peaceful moments and perfectly arranged plans. Such a wonderful feeling to make this lifestyle come true ✨

0

Congrats! curious can you give it specific instructions before a call, like "only book if there's outdoor seating available"?

1

@carlvert Absolutely you can! That’s one of our most practical core features.

0

Where do you find the number? Actually, I want to know the limit. What kind of tasks would be impossible to complete?

1

@charlenechen_123 
You can easily get and manage calling numbers directly inside the PollyReach platform.

As for usage limits:

We have reasonable daily outbound call volume limits to ensure stable service quality.

In general, simple business outreach, customer follow-up and regular calling tasks are all fully supported.

Only high-frequency bulk spam calling, massive robocalls and overly intensive continuous dialing will be restricted and cannot be completed normally.

0

@charlenechen_123 Thanks for your great question! We’ve integrated tools to source these phone numbers. Lots of government and business numbers are publicly available. At the moment, we don’t support real‑time payment during calls yet, so our agent can’t process deposits or payments mid‑call.

0

Interesting concept. My main question is how it handles calls where the other side gets suspicious or asks to speak to a real person — does it disclose that it's an AI? Would love to know more about how that's handled before fully committing.

1

@luke_pioneero Great question, this is fully taken care of in our workflow.

First off, our AI is programmed to act naturally in conversations. Whenever the recipient grows suspicious, doubts the identity or directly requests to talk to a real human agent, the system will respond properly right away.

0

@luke_pioneero Great question!! Yes, we state at the beginning that we are an AI assistant representing the owner. If the caller explicitly refuses to communicate with an AI, we will not follow up further and will notify the owner about the situation. For enterprise-level clients, we support transferring calls to a live human.

0

Loving this
The free phone number + credits to test is the right move. I've been waiting for something like this that doesn't require API setup. Just installed 🎉

1

@abod_rehman So glad you love it! 🎉

0

@gia_xu is it possible to connect it to an existing, real phone number so it can serve as my "assistant"?

0

@hunter_powabase Great question! We have received multiple user requests for it and we’re working on it. We’ll roll out this feature soon!

0

How does it handle situations where the restaurant asks for a credit card to hold the reservation — does the AI pause and hand off to you, or is that not in scope yet?

0

@hirogure It won’t input or confirm any payment, card info or deposit details on its own

0

@hirogure That is a very sharp and important question. When payment‑related information is involved right now, the AI will state it cannot handle such requests and notify the owner after the call.

We’ve been closely evaluating this feature, but we take personal sensitive and payment data very seriously. We want to find the right approach and timing before supporting these high‑privacy scenarios.

0

From your terms:

  • Prior Express Consent for non-marketing automated calls

What does that involve? How would I get consent prior to calling a restaurant like in your example?

0

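It not only got through on the phone (the restaurant doesn't take online reservations), it also handled the "only bar seats available" situation flexibly and secured the best seat at 7 PM for me. The whole conversation was in natural Japanese, and the restaurant never realized it was talking to an AI.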

0

Hi! How does Polly handle conversations that start to become negative or unproductive?

0

@nisa_meray Polly has built-in sentiment detection to manage negative & unproductive chats smartly

0
I love this, as someone who has struggled with this exact problem. I can definitely see the use case when planning a vacation and wanting to book reservations ahead of time.
0

@marco_pott Totally get it!

0

@marco_pott Thank you so much! We’re thrilled this resonates with you. Planning vacation bookings ahead of time is exactly one of our core real‑world use cases, and we really appreciate your support.

0

Feels less like one of those robotic AI callers and more like an actual assistant handling stuff for you.

The “book me a table” use case is honestly super relatable 😂

0

@campritchard Couldn’t agree more!

0

@campritchard Appreciate this feedback! We built PollyReach to sound like a real helpful assistant, not a rigid robot. Restaurant bookings really hit home for most people 😂

0

Congrats on launching this useful tool! I'm curious whether the numbers PollyReach has would be considered valid ones. I know many cell phones would automatically filter seemingly suspicious calls, so I would like to make sure the calls will actually get through.

0

Hello - what sort of compliance do you have right now? Can it be used in healthcare?

0

@its_maddy_a We follow standard global data privacy regulations including GDPR, CCPA and local telecom call rules. All call recordings, transcripts and user data are fully encrypted with strict access control.

0

@its_maddy_a Great question! We haven’t prepared for healthcare‑related use cases yet, but we’d consider building support if there are suitable real‑world scenarios. Could you share more about your use case and the specific problems you’re hoping to solve?

0
#2
Drizz
Mobile tests that write, run, and fix themselves
382
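One-line summary: Drizz is a Vision AI-based mobile test automation platform that lets developers describe test intent in natural language and then automatically generates, runs, and self-heals test cases on real devices, removing the burden of writing scripts and maintaining selectors.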
Developer Tools Artificial Intelligence No-Code
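AI test automation, mobile testing, Vision AI, natural-language testing, scriptless testing, CI/CD integration, self-healing tests, real-device testing, iOS/Android, quality assurance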
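User comment summary: Core user concerns are the determinism of Vision AI (fear of misreads and of self-healing behavior), audit traceability, and the handling of complex multi-step flows. Positive feedback centers on script-free authoring, real-device execution, and the team's responsiveness. Users suggest improving onboarding (step 3). The founder promises 95%+ CI reliability and confirms that generated test steps can be modified and saved.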
AI Sharp Take

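Drizz's tagline, "Mobile tests that write, run, and fix themselves," sounds like the perfect terminator, but the harsh reality is that the biggest enemy of AI testing tools is not code but trust. User Ferdi's comment is razor-sharp: Vision AI "self-healing" can turn into "self-deception," and a model that reads the same screen differently on Tuesday and Wednesday is harder to debug than a fixed XPath error. Founder Yash answers with a "structured reliability layer" and a healer that "doesn't reinterpret what success means," which is the right direction, but the key lies in execution details: how are confidence thresholds set? Can failure replay reproduce a run 100% of the time? For the audit-trail requirement raised by users with regulatory obligations, the current answer, "we record the agent's actions and reasoning," is not specific enough; it needs exportable timestamps, version snapshots, and operation logs.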

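Drizz's real value lies in demoting testing from an "engineering burden" to "product thinking." When a designer moving a button 12 pixels no longer causes 3 hours of debugging, and when business language like "search for stays in Spain" directly becomes a runnable test, it genuinely addresses the industry's chronic problem of testing cost growing in proportion to software complexity. But there are two potential traps. First, "natural language" is not a cure-all: describing the intent of a complex multi-step flow (such as a checkout that depends on cart state) itself requires deep business knowledge. Second, for teams that already run a Maestro + Claude CI/CD pipeline, Drizz needs to prove that it is not just "one more tool" but a "super node" that can replace the existing chain within seconds.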

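Finally, the founding team's Amazon and Coinbase background is a plus, but like the "testing is a tax, not a tool" line that users quote in the comments, it is in essence a rejection of existing frameworks (Appium, XCUITest). Whether Drizz succeeds depends not on how many problems it can solve but on how far it can move "testing" from one-off pain to a new paradigm of "continuous trust." The team should publish benchmark data on "hallucination rate" and "CI pass rate" as soon as possible and open up export and version comparison of test steps; that is the stepping stone to professional trust.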

Original information
Drizz
Drizz is an AI-powered mobile test automation platform built around intent-based testing. Simply describe what you want to test in plain English, Drizz executes it on a real device using Vision AI and automatically authors a reusable test case. No scripting, no flaky selectors, no manual maintenance. It adapts to dynamic UIs, integrates with your CI/CD pipeline, and gives your team reliable end-to-end coverage without the overhead.

Hey Product Hunt! 👋

I'm Yash, co-founder of Drizz. Along with my co-founders, Partha and Asad, we've spent the last year building the mobile testing tool we always wished existed.

💡 How it started

I once spent 3 hours debugging a broken test, only to find a designer had moved a button 12px to the left. That was the moment I knew legacy systems were broken. Not "could be improved." Fundamentally broken.

XPaths that shatter when a UI changes. Test suites that need a dedicated engineer just to stay alive. We lived this at Amazon and Coinbase. Teams shipping slower than they should because testing was a tax, not a tool. So we built our way out of it.

🚀 One sentence in. A complete test out.

Drizz is a Vision AI mobile testing agent. You write what you want to test in plain English, just the goal. Drizz reasons about your app, generates the full test, and executes it on a real device.

"Search for stays in Spain for 8th–10th April for one adult"

→ Drizz reads the app visually, builds every step, runs it on a real iPhone. No selectors. No scripts. No maintenance.

What Drizz does differently:

✍️ Plain English authoring: describe the goal, Drizz builds the test

🔮 Vision AI execution: reads your app like a human, not the DOM

🩹 Self-healing tests: UI changed? Drizz adapts automatically

📱 Real device execution: iOS & Android, not simulators

🔗 CI/CD native: GitHub Actions, GitLab, Jenkins, Azure DevOps

📊 Results from teams using Drizz:

⚡ 10× faster test authoring vs Appium

🛠️ 67% less maintenance overhead

✅ 20000+ test cases automated without a single selector

🤝 None of this happens without the full Drizz team

Every line of code, every design call, every customer conversation that shaped Drizz, that's the whole team showing up, every single day. Huge love to everyone who's been in the trenches with us. You know who you are. 🧡



🙏 Our ask today

Not asking for upvotes, asking for honest feedback.

→ What does your mobile testing setup look like right now?

→ What would make switching to Drizz a no-brainer for your team?


Drop a comment. We read every one, and it directly shapes the roadmap.

👉 drizz.dev

Thanks to @rohanrecommends for hunting us, and to the entire Product Hunt community for showing up for builders. Days like today are why we build in public. 🚀

— Yash @yash_varyani , Asad @asad_abrar1 , Zaid @zaid_ahmed_ansari , Zaid @Zaid Abdul Bari, Sreetama @sreetama_chakraborty , Partha @partha_sarathi_mohanty, & the whole Drizz team

23

Thrilled to see this go live! All the best to team.

0

@rohanrecommends @yash_varyani @asad_abrar1 @zaid_ahmed_ansari @sreetama_chakraborty @partha_sarathi_mohanty I’m somewhere in the middle: agent-first for scaffolding and repetitive work, human-in-the-loop for architecture and risky edits. The sweet spot is when the agent moves fast but still gives me clean checkpoints to review instead of micromanaging every token.

0

I'm building a couples app on iOS, and my whole testing setup right now is XCUITest on simulator + manual checking, archived to TestFlight by hand. Probably the exact cohort that'd never write Appium tests but might genuinely try this. The question I'd love your take on: with Vision AI driving execution, how do you keep tests deterministic across CI runs? Selectors fail predictably...locator doesn't exist, you fix it. A vision model can fail by misreading a screen state on one run and reading it correctly the next, which feels harder to debug. And there's a scary version of "self-healing" where the AI silently re-interprets what success means. What's the rerun / confidence story, and where do you draw that line?

Congrats on the launch!

8

@ferdi_sigona - Really sharp question, and honestly the right one to ask before trusting any Vision AI system with CI.

The short answer: we've put a structured reliability layer underneath the vision execution specifically to prevent the "reads it differently on Tuesday" problem. It's not raw model inference on every step; there's a multi-layer determinism mechanism that makes repeat runs behave consistently, not probabilistically. All of our customers have hit 95%+ reliability on their tests in CI/CD.

On self-healing: this is the part most tools get wrong, and it's something we were very deliberate about. The healer doesn't reinterpret what success means. It only engages when the UI has structurally changed around a test that was written correctly. Classic example: you wrote a test when your app had no color picker. A later build adds one. The healer identifies the missing piece, executes it, and gets back on the path you defined; it doesn't decide your test meant something else. The intent you authored is treated as fixed; only the route to executing it can adapt. A single test can have multiple healer sessions, depending on how far your test cases lag behind your release version.
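
The "fixed intent, adaptable route" rule can be illustrated with a small data model. This is a sketch under assumptions: `Intent`, `AuthoredTest`, and `heal` are hypothetical names and do not reflect Drizz's internals; the color picker step mirrors the example in the reply.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intent:
    goal: str
    success_condition: str  # frozen: a healer cannot rewrite what "pass" means


@dataclass
class AuthoredTest:
    intent: Intent
    steps: List[str]


def heal(test: AuthoredTest, missing_step: str, before_index: int) -> AuthoredTest:
    """Return a new test with one extra step; the intent object is reused as-is."""
    new_steps = test.steps[:before_index] + [missing_step] + test.steps[before_index:]
    return AuthoredTest(intent=test.intent, steps=new_steps)
```

Because `Intent` is frozen and `heal` only touches the step list, a healing pass can add the "pick a color" step a new build requires without being able to change the success condition.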

Given you're on XCUITest + TestFlight today, this would likely slot in without much friction — worth a conversation if you want to see it on a real flow.

Try Out Drizz right here, we are giving free launch day credits: https://www.drizz.dev/download-desktop-app. Reach out to us in case you face any issues.

8

@ferdi_sigona Really thoughtful question! I'm not a testing expert, but from a builder's perspective — I think the key is having a confidence threshold where the AI flags ambiguity instead of silently re-interpreting. Let the human decide when confidence is below, say, 90%. Curious to hear how Drizz handles this too! 🔥
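
The threshold idea suggested in this comment could be sketched as below. It assumes the vision model exposes a confidence score per screen reading, which the thread does not confirm; `ScreenReading` and `next_action` are hypothetical names.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.90  # the "say, 90%" figure from the comment above


@dataclass
class ScreenReading:
    label: str         # what the model thinks the screen shows
    confidence: float  # model-reported score in [0, 1]; assumed to be available


def next_action(reading: ScreenReading, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Proceed only on confident readings; otherwise surface the step for review."""
    if reading.confidence >= threshold:
        return "proceed"
    return "flag_for_human"
```

The point of the design is that a low-confidence reading produces a visible flag in the run report instead of a silent guess.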

1

We have regulatory requirements around audit trails. Every test execution needs to be traceable back to who ran it, when, what version of the app, and what passed or failed. Do you store execution history and is it exportable?
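
The traceability fields listed in this question map naturally onto one record per execution. The sketch below is an assumed shape, not a Drizz export format: `ExecutionRecord` and `to_jsonl` are hypothetical, and JSON Lines is just one convenient exportable encoding.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass
class ExecutionRecord:
    run_by: str       # who ran it
    started_at: str   # when, as an ISO 8601 timestamp
    app_version: str  # what version of the app
    test_name: str
    passed: bool      # what passed or failed


def to_jsonl(records: Iterable[ExecutionRecord]) -> str:
    """Serialize records as JSON Lines, one execution per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)
```

One line per execution keeps the export append-only, which tends to suit audit logs better than a single mutable document.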

2

@aniruddhd We do record the agent's actions and reasoning, along with infra and app details. Tell me more about your compliance needs.

0

Noob vibe-developer here, currently testing the most annoying part of the building process lol. Excited to try this out!

2

Testing being the most annoying part is canonically true 😂 Drizz is literally built for vibe-developers, go have fun with it!


Tysm Nireka 💜

0

@nireka Interesting. Which test scenario are you testing?

0

I also liked that it runs on real devices because that’s where most testing issues actually show up 😅.

Congrats to the whole Drizz team on the launch 🚀!

2

@nausad_alam EXACTLY! Simulators just don't catch what real devices do 😅

That's why we made it core to how Drizz works. Thanks so much for the love, Nausad 💜

0

Can we test mobile apps on a phone/ mobile OS as well? Something like a TestFlight Log Interceptor or Android equivalent where we can test an app directly on the phone, or is it exclusively for the desktop form factor? Also, are VS-Code compatible IDEs supported?

Congrats on the launch!

2

@koishore 

Yes, real device testing is fully supported. You can connect a physical iOS or Android device locally and run tests directly on it, with no SDK or agent embedded in your app. Testers can plug in their actual phone, or use emulators/simulators, and validate flows on real hardware.

On the TestFlight / log-interceptor question - for network-level visibility, network logs from device farms like BrowserStack and LambdaTest are surfaced, so you get that coverage during cloud test runs.

On form factor - Drizz Desktop runs on your computer as the authoring and control surface, while the tests themselves execute on real mobile devices or emulators. So the workflow is: author on desktop, tests run on the phone. And it isn't limited to interactive use - Drizz runs headless in CI/CD, so the same tests execute automatically on every build or pull request, not just when someone's at the desktop app.


On IDE integration - there's no VS Code extension today, but a Drizz CLI is coming, which makes it editor-agnostic out of the box: it runs in any terminal, slots into any VS Code-compatible IDE, and drops straight into CI/CD scripts. So you won't be locked to a specific editor.

2

This looks like a game changer for mobile app testing. It's been a pain to update the test cases every time the UI changes. Writing the intent and letting Drizz take over test generation and execution is an amazing use of AI.
Can I modify and save the test case steps after they are generated from the intent? For example, if I want to change only one specific step out of the entire test, will it allow that and remember it the next time I write the same intent?

1

@sp_713 Hey Sunil,

Thanks for checking in. Yes, you can always modify the test case generated by the tool and refine it to your needs, and when you run a similar intent it will understand your preference.



0

Tried Drizz last month after a friend recommended it. Worked surprisingly well on our app — the onboarding tutorial inside the product could use a polish (got a bit lost on step 3) but once you're past that it's smooth.

 

The team is super responsive on Slack. That counts for a lot this early.

1

@borato_rohan1 Step 3 is officially on our list 👀 thanks for flagging. That kind of specific feedback is gold.


And really glad to hear the team's been helpful! That's something we genuinely care about.


Thanks for trying Drizz, Rohan 💜

0

This is awesome !

1

@srikrishna_swaminathan Appreciate it sir !

0

@srikrishna_swaminathan Glad you liked it!

0

"Testing was a tax, not a tool" is a great line, and the 12px button story is painfully relatable. As a fellow builder in the automation space, I love that you went vision-first. Brittle XPaths have wasted more of my life than I'd like to admit.

Congrats on the launch, Yash, rooting for the whole team 🚀

1

@varunrai the vision-first bet gets more interesting from here. Right now the heavy reasoning runs where the compute is, but the direction we're betting on is smaller, sharper models that run on the device itself — testing that's faster because it isn't waiting on a round trip, and more accurate because the model is specialized for screens rather than general-purpose.

Means a lot coming from someone building in the same space — rooting for your work too.

0

Try Out Drizz right here, we are giving free launch day credits: https://www.drizz.dev/download-desktop-app

1
I remember I had to buy one more test phone to run mobile tests in my previous org. I hope that is solved forever. Congrats, team Drizz 🚀✨
1

@manish_choudhary19  The "let me just buy one more phone" struggle is SO real 😅

You won't need to anymore, Drizz handles all of that for you!

Thanks so much, Manish 💜

0

Complexity of the world is growing and we need more than just one test solution. As a fellow builder of AI Testing tools and Tools to Test for AI - I like the approach of Drizz. Running tests on mobile was already harder in the pre-AI world and I wish Drizz makes it easier for all of us. Best wishes team Drizz.

1

@pradeepsoundararajan Means so much coming from a builder, Pradeep 💜
You're so right, one tool will never solve all of it. Mobile QA was already wild pre-AI, now there's a whole new layer. Rooting for Bugasura too!

0

Curious how it handles multi-step flows where the UI state depends on previous actions — like a checkout that changes based on what's in the cart. Does the intent description need to account for that, or does Vision AI figure it out contextually?

0

Congrats on the launch! How can someone with a Maestro + Claude Code pipeline benefit from Drizz?

0

Hey, spent some time on Drizz's page and the no-script mobile testing angle is what pulled me in. one thing I kept thinking about, how does the agent handle non-deterministic UI like loaders, ads, or A/B variants? that's usually where script-free testing falls apart in my experience.

0

can we try it?

0

Yes, @sudo_tallan!

Thanks for checking in!

You can download the app and start right away: https://www.drizz.dev/download-desktop-app

Happy testing!!

0

forgot to ask you... US and Polish Apple stores?

0

When you write a test intent for iOS, does Drizz reuse that same intent to run on Android too, or do you author separately per platform? Curious whether the vision layer handles the UI differences automatically or if there’s still some per-platform config involved.

0

Hey @sunnyallan ,

Very interesting question. From the intent, we generate the test steps, which run in automation alongside healer agents that step in to fix your test when it breaks due to UI diffs.

These generated test scripts are platform-agnostic. They can run on both Android and iOS, provided the app flows are similar on both platforms.

0

intent-based testing is the right abstraction here. one question: if the UI shifts but the intent doesn't (new onboarding screen, reordered step), does it auto-reconcile or do you redo the test? that's where most vision-ai tools fall apart for us.

0

Hey @haili_yuan

The problem you highlighted is the most common issue faced by QA and devs as of now. One small change in the UI leads to breakage of tests!

Our mission is to simplify and ease the testing journey. We have built a few agents that are triggered when we detect that the test steps no longer align with the UI. They identify the missing steps, or the steps needed to reconnect the test case, and voilà: no more test cases falling apart due to UI changes.

Would love if you could try out a few scenarios on the tool!

0

Do you expose test results as structured data? I want to pipe results into our data warehouse for trend analysis. Like test pass rates over time, most-failing flows, etc.
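
The trend metrics mentioned in this question can be computed from any flat list of results once they are available as structured data. In this sketch the `(date, flow_name, passed)` tuple shape is an assumption, as are the function names; it is not based on any Drizz export.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Each result is (date, flow_name, passed); the shape is an assumption.
Result = Tuple[str, str, bool]


def pass_rate_by_day(results: List[Result]) -> Dict[str, float]:
    """Fraction of passing runs per day."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for day, _flow, passed in results:
        totals[day] += 1
        passes[day] += int(passed)
    return {day: passes[day] / totals[day] for day in totals}


def most_failing_flows(results: List[Result], top: int = 3) -> List[Tuple[str, int]]:
    """Flows ranked by number of failed runs, highest first."""
    failures = Counter(flow for _day, flow, passed in results if not passed)
    return failures.most_common(top)
```

In a warehouse the same two aggregations would usually be SQL `GROUP BY` queries; the Python version just makes the required fields explicit.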

0

Hi @md_saquib2 ,

We haven't built much test result export functionality yet, but I would be happy to provide an MCP solution with which you can query your test results anytime, in any format, for your analysis.

0
回复

The "no flaky selectors" promise is honestly the most compelling part here. Mobile UI tests are notoriously painful to maintain at scale. Really curious to see how Drizz performs on fast-moving apps with frequent UI changes and complex flows.

0
回复

Hi @iisis__clements 

That’s exactly what we’re trying to solve: mobile UI tests shouldn’t become a maintenance overload!

Drizz is built to handle fast moving apps and UI changes without flaky selectors constantly breaking flows.

Check out the case studies on our website: https://www.drizz.dev/case-studies and our demo videos: https://www.youtube.com/@Drizzai/videos

Or better, try testing a flow yourself. You can download the app from here: https://www.drizz.dev/download-desktop-app

Happy testing!!

0
回复

This is basically what mobile QA has been missing for years. Moving from brittle selectors to intent is a big shift.
The real test will be CI reliability over time, but if it holds up, it could seriously reduce maintenance overhead for teams shipping fast.

0
回复

@hassan__shah 

Exactly Hassan, that shift from selectors to intent is the core idea behind Drizz.

And yes, long-term CI reliability is the real benchmark. The scripts generated/authored by you can be seamlessly added into CI pipelines and run across widely available cloud integrations.

0
回复

We just closed our seed round and mobile is our first platform. Haven't set up testing yet. Is it too early to start with Drizz or should we wait until we have more features built?

0
回复

Hi @ragib_ahsan3 ,

Congratulations on closing the seed round!

You can start testing right away with Drizz. If you encounter any scenarios or cases that we are not able to cover, give us a day or two and we will prototype and deploy the enhancement!

Looking forward!

0
回复

I prototype in SwiftUI and sometimes my prototypes become the actual production code. Having tests would be great but I don't have time to write Appium scripts. Could I literally test my prototype with Drizz before handing it to the dev team?

0
回复

Hey @ragib_ahsan,

The beauty of Drizz is that you can write test cases in no-code natural language. Yes, you can load the prototype and write the steps yourself or, even better, describe the test case and we will write the steps for you. You can then seamlessly run the tests locally or on your cloud device provider.

You can install Drizz from this link and give it a try: https://www.drizz.dev/download-desktop-app

0
回复

I'm on the customer support team, not engineering. When users report bugs, I can never reproduce them. Could I use Drizz to set up reproduction steps?

0
回复

Hey @saquib_ibrahim ,

With Drizz you can define the flow or write down the test steps in natural language and run them under different device conditions, such as reduced network connectivity or a low battery percentage. You can also run the tests in regression to see how often the test breaks and where the breaking point is.

You want to give it a try?

0
回复

this is a great app, and I want to test it out on my computer to see how it works. where's the link to download it?

0
回复

@daviddprtma hey,

thanks for checking in. you can download from here: https://www.drizz.dev/download-desktop-app

0
回复

@daviddprtma Hey, you can download it right here: https://www.drizz.dev/download-desktop-app

0
回复

How does one install it?

0
回复

@louislecat Hey, you can download it right here:  https://www.drizz.dev/download-desktop-app

0
回复
#3
Composer 2.5
Cursor’s most powerful model yet
340
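One-line summary: Composer 2.5 is a model deeply customized for the Cursor code editor. It markedly improves intelligence and behavioral consistency on long, multi-file, multi-step agentic coding tasks, addressing the way general-purpose models "drop out midway" and "lose context" in complex coding scenarios.
Artificial Intelligence Development
AI coding assistant, Code editor, Agentic coding, Long context, Model optimization, Cursor, Composer, Multi-file editing, Task-chain stability, Cloud IDE
User comment summary: Long-time users affirm the progress, especially in long-task stability and effort calibration. There are also blunt comparisons holding that it still falls short of Opus 4.7 and Sonnet. User attention centers on: handling semantic conflicts across files, state management over long task chains (20+ steps), runtime awareness through integrated logs and error traces, and the details of the training feedback mechanism for long trajectories.
AI Sharp Take

Composer 2.5 is not a show-off launch; it is an "insider" breakout beyond wrapping OpenAI/Anthropic models. The clever part is that it does not boast about base capability rivaling GPT-5. It honestly admits that it "continues fine-tuning on top of Composer 2" and grinds away at engineering details such as "effort calibration" and "textual feedback on long sequences". With AI coding tools now heavily homogenized, this is a sound way to step out of the arms race.

The strengths are clear: it hits the pain point of heavy Cursor users precisely. That pain point is not cleverness in writing one line of code but whether the model stays "clear-headed" while writing ten files and refactoring dozens of functions. Reinforcement training aimed at a specific workflow raises users' "effective coding output" more than a base model that simply piles on compute. Judging by user feedback, the "no more need for Opus" sentiment is a win for precise product positioning.

But the warning signs are equally clear. One commenter says outright that it does not compare well against Opus 4.7 or Sonnet, which suggests that a fine-tuned model still has a ceiling on single tasks and on the absolute upper bound of complex reasoning. The users' questions about "multi-file semantic conflicts" and "long-running task state management" are also pointed: they expose a soft spot in Cursor's current architecture. The model itself is decent, but environment awareness and conflict resolution between the IDE and the AI remain coarse. Unless these system-level experiences are strengthened, even a stronger Composer 2.5 risks being outclassed by more natively integrated agent products (such as Devin).

In one sentence: this is an excellent incremental optimization that lets Cursor hold its ground in the agentic coding race, but a long engineering road remains before it "ends the IDE wars".

View original info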
Composer 2.5
A substantial improvement in intelligence and behavior over Composer 2, particularly on long-horizon agentic tasks.

Hi everyone!

More training scale, still aggressive pricing, and a better model for long-running coding work.

That is Composer 2.5 in a nutshell.

It continues from the same base as Composer 2, and this version is trained to be better at sustained work, complex instructions, communication style, and effort calibration inside Cursor.

Targeted textual feedback helps the model improve specific mistakes inside long rollouts, while 25x more synthetic tasks push it into harder coding problems grounded in real codebases.

Looking forward to the next larger model trained from scratch with 10x more compute!

9
回复

@zaczuo  The improvement on long running coding tasks is really noticeable. Great to see this focus on sustained work quality.

0
回复

It continues from the same base as Composer 2.

@zaczuo S/O to Moonshot's Kimi K2.5!

1
回复

@zaczuo Incredible product momentum on the hunt today! 🚀

Watching so many teams push complex AI workflows and multi-file structures is fascinating. However, scaling these data-dense architectures usually hits a massive wall when it comes to front-end logic, leading to heavy UX bottlenecks and fragmented visual layout consistency.

As a Senior AI Product Designer, my focus is entirely on solving exactly that—bringing institutional-grade visual authority and seamless interfaces to high-stakes products right from scratch.

I operate completely asynchronously to protect deep work and deliver premium execution. If any founders or teams here have a stuck build or an overloaded design pipeline, my async desk is open. Drop me a line, let’s build! ⚡💎

0
回复

Love cursor, been a power user for a long time now, and happy to see the native model getting better and better.

2
回复

Nice to see the new models and I have been using Composer this morning - sorry for the reality check - but c'mon - Composer 2.5 does not really compare well against Opus 4.7 or Sonnet.

1
回复

That was a no-brainer upvote! Testing Composer 2.5 since yesterday, and I'm extremely impressed. 2.0 was already a good one, but with 2.5, I feel like there's no need for Opus or GPT-5.5. Great job!

1
回复

The focus on long-horizon agentic tasks is the right unlock for real coding workflows. We've run into this building RetainSure where the model drops context mid-task once the chain gets 20+ steps deep. How does Composer 2.5 handle state management across very long multi-file agentic sessions, and is there a hard limit on task length before it degrades?

1
回复

Has the team experimented with giving the agent more awareness of runtime behavior — like logs or error traces — so it can reason about what's actually happening vs just what the code says? Curious if that's on the roadmap.

0
回复

The effort calibration piece stands out to me. We've had agents lose coherence around step 6 or 7 in a long agentic chain, and it's tough to know if that's model drift or context decay. Curious how the targeted textual feedback is applied: at the step level within a rollout, or on the full trajectory?

0
回复

Specializing a model for code editing workflows rather than pure generation is the right call. Generic models fall apart on multi-file edits because they lose track of interdependencies. Building RetainSure, we've hit that quality cliff when an agent touches more than three or four tightly coupled files. How does Composer 2.5 handle semantic conflicts when simultaneous edits across files introduce inconsistencies?

0
回复
#4
Mantle Chat
Collaboration platform where teams work with AI together
291
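One-line summary: Mantle Chat is a collaboration platform that brings together real-time team messaging, multi-model AI (GPT, Claude, Gemini, and others), autonomous agents, and 30+ tool integrations. It aims to solve the "information silos", fragmented context, and duplicated work that teams run into when using AI, making AI part of team collaboration rather than a private individual tool.
Productivity Messaging
Team AI collaboration platform, Agents, Multi-model chat, Team knowledge base, Tool integrations, Enterprise AI, Productivity tools, Real-time messaging
User comment summary: Users broadly endorse the value of "shared AI context" in keeping knowledge from evaporating and work from being duplicated. The core questions center on: how long-term memory across sessions and context isolation between projects are achieved; whether sales and support integrations such as HubSpot and Zendesk are supported; and the details of context management when multiple agents work in parallel in one channel.
AI Sharp Take

Mantle Chat's premise targets a real pain point: AI tools are making team collaboration more atomized. When every member fights alone in their own ChatGPT, Claude, or Gemini chat window, the so-called "AI efficiency gain" is in fact building new information silos. Mantle tries to use a "channels + @mentions" model to pull AI back onto the main stage of team collaboration.

In product logic it is essentially an "AI-first" version of Slack/Discord, but the core difference is that it treats model selection, agent execution, and tool integration as first-class citizens rather than bots added after the fact. This solves two key problems. First, it lowers the barrier to AI adoption: non-technical people can summon an agent with @ without an IDE or complex configuration. Second, it lets AI output (analysis, code, plans) settle directly into the team's conversation context, forming a searchable, iterable "institutional memory".

However, the challenges are also plain. The first is competitive squeeze: Slack and Teams are embedding AI assistants deeply, so will users be willing to migrate to "yet another chat tool"? The second is the difficult balance between "context isolation" and "shared intelligence". Judging by users' questions, if context management across different projects and agents is not clear enough, the AI will get dumber from the mixed-up information. Mantle's current workspace isolation and thread mechanism earn a passing grade, but remain some distance from "long-term memory" and "precise context awareness". Finally, the 30+ integration list looks good, but the absence of key B2B tools such as HubSpot and Zendesk (still on the roadmap) reveals a current positioning that leans toward technical teams and startups.

Assessed objectively, Mantle Chat is not a disruptive innovation but a careful integration of the existing "chat + AI" pattern. Its real value lies in reducing the friction of using AI in team collaboration, turning "AI assistance" from an individual experience into a team asset. Whether it succeeds depends on whether, while keeping the experience lightweight, it can make its core selling point of "shared context" precise and reliable enough, rather than becoming yet another feature-stuffed chat room.

View original info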
Mantle Chat
Mantle Chat helps teams boost productivity and adopt AI faster by bringing real-time messaging, AI model chats, autonomous agents, and tool integrations into one shared workspace. Communicate in Discord/Slack-style channels, mention AI agents and models (GPT, Claude, Gemini, Grok, Deepseek) with @ whenever you need help directly in conversations with teammates, build agents, run autonomous tasks together, and connect 30+ tools (Notion, Linear, Gmail and more) your team already uses.

Hey Product Hunt community 🙌

I’m Katja, co-founder of Mantle Chat.

We built Mantle Chat because we think AI at work should feel collaborative, not isolated. When AI is part of the team workspace, it becomes more powerful, easier to adopt, and more natural for everyone to use.

The idea is simple: instead of everyone using AI alone, the whole team can learn, collaborate, and work with AI together.

How we got here?

We spent time talking to modern teams and noticed one thing: teams are already using AI, but not really together.

They have one tool for team messaging, another set of tools for AI, and a lot of work happening in between. Team communication happens in Slack or Teams, while AI work happens in personal conversations with ChatGPT, Claude, Gemini, and other tools. This creates a few painful problems:

  • Constant switching between team messenger, AI tools, docs, and tasks

  • Context gets lost every time people jump between apps

  • Slack + multiple AI subscriptions quickly become expensive

  • People use AI quietly, but don’t always share how they’re using it

  • AI conversations stay private, so teammates duplicate work

  • The AI has no real context on what the team is working on

  • Prompts and responses get copied around manually

  • Non-technical teammates often find AI harder to adopt

  • Companies want everyone to use AI, but most AI tools still feel individual or developer-focused

The result is strange: AI is everywhere, but teamwork around AI is still fragmented.
So we asked ourselves: what if AI was not a separate tool, but part of the team workspace itself?

What we built?

Mantle Chat is a collaborative AI workspace for teams.

It brings real-time team messaging, AI models, agents, tool integrations, and shared knowledge into one place.

What you can do in Mantle Chat:

  • Chat with your teammates in Slack/Discord style channels, DMs, and threads.

  • When you need help from AI, you can mention (@) any of 60+ models, including GPT, Claude, Gemini, Grok, and DeepSeek, directly inside the conversation.

  • The AI can respond in the same thread, where the whole team can see it, discuss it, reuse it, and build on top of it.

  • You can also keep private AI chats when you need to work individually.

  • Mantle Chat gives teams a shared knowledge base, so uploaded docs, workspace context, and project information can become part of the AI’s understanding. Instead of manually copy-pasting background into every prompt, your team can give AI the context it needs once and use it across conversations.

  • Teams can also build shared AI agents with custom instructions, knowledge, schedules, and integrations. These agents can run manually, on a schedule, or be triggered by connected tools. (For example, you can create agents for research, PR reviews, meeting notes, analytics reports, customer requests, Linear updates, Stripe events, and more.)

  • Mantle Chat connects with 30+ tools, including Linear, GitHub, Google Drive, Notion, Slack, and Stripe, so AI can help with real workflows, not just answer questions.

  • Give the whole team access to AI without IDEs or complicated setup

  • Reduce the need for separate team chat and AI subscriptions

And honestly, figuring out AI as a team is much more fun and effective than doing it alone.

Is it for you?

Mantle Chat is built for teams that already use AI, or want to start using it together.

  • For startup teams: Mantle Chat gives you one place to talk, think, plan, and work with AI. You can keep product discussions, customer insights, research, and AI-generated ideas visible to the whole team.

  • For product and design teams: Mantle Chat helps you brainstorm, summarize feedback, compare ideas, write specs, analyze research, and keep AI outputs connected to the conversations where decisions happen.

  • For engineering teams: Mantle Chat lets you bring AI into technical discussions, create PR review agents, summarize Linear issues, connect GitHub, and reduce context-switching between tools.

  • For operations and support teams: Mantle Chat can help automate recurring workflows, generate reports, summarize requests, and create agents that run on a schedule or react to events.

  • For non-technical teammates: Mantle Chat makes AI feel approachable. You do not need an IDE or developer workflow to use agents, models, and automations. You just work inside chat.

  • For cross-functional teams: Mantle Chat helps product, design, engineering, operations, support, and leadership stay aligned by keeping conversations, AI outputs, shared knowledge, and workflows in one place.

See it in action:
https://www.youtube.com/watch?v=GVoFZ2dxVx0

You can try the interactive demo in the hero section of our homepage:

https://mantle.chat/home

For the Product Hunt community:

You can use Mantle Chat for free, or try the Pro plan free for 7 days with higher limits and all features.

Contact us:

Website: https://mantle.chat
X / Twitter: https://x.com/MantleChat
LinkedIn: https://www.linkedin.com/company/mantlechat
Community: https://discord.com/invite/STzq94kdDC

Huge thanks to @fmerian for hunting us today, and to the Product Hunt team and community for the opportunity to share what we’ve been building.

Please support us today and drop a comment!

We’d love to read your feedback, any thoughts on how we can make shared AI work better for teams, and how your team is using AI today: together, separately, or somewhere in between.

11
回复

@fmerian  @katja_danilina awesome launch Katja! maybe add detailed usage analytics per seat so growth teams can map out which internal ICP are adopting AI fastest.

3
回复

@fmerian  @katja_danilina Hi Katja, Congrats on the launch. I'm interested in understanding more about shared team context.

1
回复

@fmerian  @katja_danilina This resonates so much! 🙌 As a solo indie maker, I feel this pain even more — juggling between Slack, ChatGPT, and my dev tools. The 'AI in isolation' problem is real, and solving it at the team level makes total sense. Love the collaborative approach. Upvoted! 🚀

P.S. — Launching iBGremove today too, love seeing fellow makers tackle real workflow problems!

1
回复

One underrated benefit here is institutional memory . Shared AI conversations are way more valuable than knowledge trapped in private chats.

2
回复

@gabriel_brooks1 Thank you! Definitely, private AI chats are useful for individual productivity, but shared AI threads turn that productivity into team knowledge.

0
回复

wonderful - good luck!

2
回复

@nikolas_dimitroulakis Thanks Nikolas! 🙌

0
回复

Congrats on the launch @katja_danilina !

Something on the roadmap: Mantle already has a pretty strong list of integrations. Are you planning to add HubSpot or Zendesk so agents can also help with sales and customer support workflows?

1
回复
@byalexai thank you! Yes, 100%! We were actually working on HubSpot integration, but had to temporarily park it due to other priorities. We are gonna add HubSpot and Zendesk support in the near future. We want to allow users to benefit from agentic assistance within the sales and customer support user journeys.
1
回复

Does Mantle keep a persistent shared context across sessions — so the AI understands decisions the team made in previous conversations — or does each new session start fresh? That long-term memory layer seems like the thing that would make it genuinely more useful than just adding AI to a Slack channel.

1
回复

@sunnyallan Thanks for your question!

Mantle Chat doesn’t treat every new session as completely fresh, if you add context to the workspace knowledge base.

Teams can add shared files, instructions, and context at the workspace level, and agents can use that across conversations. We also support agent-level knowledge bases, so each agent can have its own specialized context.

We’re also working on the desktop app and local file access, so the context layer can go beyond just what was said in a single channel or thread.


1
回复

Love the "@ the model in the same channel as your teammates" framing, that's the part Slack-with-a-bot-bolted-on always got wrong. Curious how you're handling context isolation when multiple agents are working on tasks in parallel in the same channel? Does each agent get its own thread, or do they share the channel history?

1
回复

@mahdi_nouri Thanks, that’s a really good question.

We support both patterns. In channels, when you "@ an agent or model", it can work with the shared channel context, so teammates and agents are aligned around the same discussion. And when a task needs more focus or isolation, you can turn any specific message into a thread from the channel. Threads have their own separate context/history, so multiple agents can work in parallel without mixing up task-specific details.
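The two patterns described in this reply can be pictured with a small data model. The Python sketch below is only an illustration of the idea (a shared channel history plus threads that each keep a separate history); the `Channel` and `Thread` classes are hypothetical and say nothing about how Mantle Chat is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """A thread started from one channel message, with its own history."""
    root: str
    messages: list = field(default_factory=list)

    def context(self):
        # An agent working in the thread sees the root message plus replies only.
        return [self.root] + self.messages


@dataclass
class Channel:
    """A channel whose top-level messages form the shared context."""
    messages: list = field(default_factory=list)
    threads: dict = field(default_factory=dict)

    def post(self, text):
        self.messages.append(text)

    def start_thread(self, text):
        # Turn an existing channel message into a thread.
        thread = Thread(root=text)
        self.threads[text] = thread
        return thread

    def context(self):
        # An agent mentioned in the channel sees the shared history, not thread replies.
        return list(self.messages)
```

For example, two agents given different threads would each receive `thread.context()`, so task-specific details never leak into the other thread or back into the channel history.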

0
回复

I’ve been jumping between Claude and ChatGPT tabs for months depending on what I’m working on. Never really found a smooth way to keep both in one workflow.

Mantle looks interesting for that. Curious to see if having one workspace actually solves it.

1
回复

@amraniyasser Thank you, feel free to reach out if you need more info.

1
回复

Hey Katja! You're awesome. Integrating AI and team chat is a game changer because it upgrades teamwork. I'm sure many founders are going to love this!

1
回复

@german_merlo1 Thank youuuuuu!

1
回复

Shared AI context across a team is way better than everyone running isolated prompts. Mantle Chat looks like it could cut down a lot of duplicated work in growing teams. At RetainSure we've been building in the customer success space for ops-heavy SaaS teams, and Mantle Chat touches on something we think about a lot. How does it handle context isolation between different projects?

1
回复

@shivam_jaiswal21 Thank you for your comment! We’ve thought about it: Mantle Chat is structured around organizations, where teams can create separate workspaces for different needs: for example, departments, projects, or teams. This helps keep shared context useful without mixing unrelated project knowledge. It’s also easy and intuitive to switch between workspaces using the bottom panel, which was specifically designed for this workflow. You can create as many workspaces as you want.

1
回复

Shared AI threads are the right move. The problem with private AI sessions is that your team's best prompts and outputs never get socialized. At RetainSure we've got CS reps doing similar AI-assisted tasks in silos, so we lose institutional knowledge constantly. Do you have any versioning or replay for how an AI-assisted decision was reached in a shared channel?

1
回复

@dhiraj_patel5 Thank you for your great question! Right now, we have replays in shared chats and we don’t have versioning, but we agree it’s an important direction for shared AI workflows.

1
回复

Congrats on launching! Wishing you a smooth start, happy users, and lots of success.

1
回复

@gayatri_sachdeva Thanks a lot for your support!

1
回复
looks promising team
1
回复

@abod_rehman Thank you!

0
回复

the positioning against private AI chats is the interesting angle here. the actual problem is that when everyone's prompting Claude or GPT separately, institutional knowledge just evaporates. nobody knows what their teammate already figured out. if Mantle actually solves that it's solving something real

1
回复

@ansari_adin Thank you so much for sharing your thoughts!

0
回复

Congrats on the launch! This looks really promising! Been looking for something like this! @katja_danilina

1
回复

@nafis_amiri Thank you very much for your support!❤️

0
回复

Congrats on the launch! The idea of combining chat, agents, and tools in one shared workspace feels very practical for real team workflows. You can also say that the @ plus agent approach inside channels is important in reducing context switching.

1
回复

@thamibenjelloun Thanks for your comment! Agree, the "@agent" approach inside channels is a great way to reduce context switching by bringing the right help directly into the conversation where the context already lives.

0
回复

Right now we're paying for 4 different AI tools across a 6 person team. No shared history, no shared context, and the bill keeps growing. This looks like the first real alternative. Want to give it a proper test.

1
回复

@andrzej_zarod Thanks for your comment! This is exactly what we’re trying to solve: one shared AI workspace for the whole team, with shared context and history.

Feel free to try it out, we have a free tier for teams.

0
回复

Does Mantle keep a persistent shared context across sessions — so the AI understands decisions the team made in previous conversations — or does each new session start fresh? That long-term memory layer feels like the key differentiator for async teams.

0
回复

@sunnyallan I think this comment is duplicated, I already replied here: https://www.producthunt.com/products/mantle-chat?comment=5382624

0
回复
#5
CtrlOps
Deploy, Debug & Manage Linux Servers with AI.
215
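One-line summary: CtrlOps is an AI-driven Linux server management tool for developers. Its core scenario is letting non-DevOps people use natural-language instructions, a visual file manager, and one-click deploys to stop depending on IP spreadsheets, chaotic SSH tabs, and memorized commands, cutting deployments that took 60 minutes down to 5.
Linux Developer Tools Artificial Intelligence
AI server management, DevOps simplification, Linux operations, Terminal assistant, Visual deployment, Local security, Server monitoring, Scripts library, Developer tools, Agentless
User comment summary: Users widely relate to the pain points of "chaotic SSH tabs" and "depending on a single ops expert". The core questions center on the safety mechanism (confirmation that every AI command requires human approval), multi-server management (100 servers supported), and context continuity in debugging sessions. An HR user also noted that SSH management simplified revoking access when employees leave.
AI Sharp Take

CtrlOps takes precise aim at developers who "can write code but don't want to do ops", which covers nearly the whole industry. Its core value is not inventing new technology but using an "AI generates, human confirms" loop to turn server management from a "dark art" into "reusable everyday operations". The clever part of the design is that it does not try to replace human ops. It positions the AI as a senior assistant that "never sleeps and never quits", and it requires human approval after a command is generated. That both avoids the collapse of trust that comes with black-box execution and preserves developers' control over production.

Judging by user feedback, the real killer feature is not the AI terminal itself but the "scripts library": the AI handles the helplessness of the "first time", and the scripts library handles the repetition of "every time". Together they form an efficient loop from trial and error to a settled routine. In addition, running 100% locally with credentials never leaving the machine speaks directly to enterprises' sensitivity about data security, a moat that many cloud-based AI ops tools find hard to cross.

However, the real challenge the product faces is this: once the scale exceeds 100 servers, will the AI's context management become a new bottleneck? The current "session-level memory", rather than persistent memory, is of limited use for continuity in complex troubleshooting. Also, once users get used to AI suggestions, will teams gradually lose their hands-on, low-level ops skills? That risk of "outsourcing the ops brain" is something CEOs need to weigh. All in all, CtrlOps is an excellent tool for cutting costs and raising efficiency, but it is still some distance from becoming an "intelligent ops hub".

View original info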
CtrlOps
Most devs manage servers from a spreadsheet of IPs and commands nobody remembers. CtrlOps gives you AI-powered server management without DevOps expertise. AI terminal that generates commands with your approval. Scripts library. One-click deploys from any GitHub repo. Visual file manager. Real-time server monitoring. Zero agents on servers. Deployments that took 60 minutes now take 5. 100% local. Your credentials never leave your machine. Mac. Windows. Linux.
Hey Product Hunt fam 👋

I ran a dev agency for 5 years. We managed servers for dozens of clients across different stacks, different regions, different everything. And every single time, the setup was the same. A spreadsheet with IP addresses nobody kept updated. SSH tabs open with names like "bash" and "bash (2)". Someone googling a command they had run 50 times before. A deployment that should have taken 10 minutes turned into an hour because one environment variable was wrong.

We had 2 DevOps guys on the team. But whenever something was urgent, they were never reachable. And the rest of us were left staring at a terminal, hoping we remembered the right command.

Every client had their own server. To check something as simple as "is the site running?" someone had to open a terminal, find the right IP, dig up the credentials, and log in. Separately. Every time. For every client.

We had no unified view. No quick way to know what was happening across our infrastructure without pulling in the one person who knew how to navigate it all. Everything ran through him. If he was unavailable, we were blind.

We got tired of that. So we built CtrlOps.

The idea was simple: what if managing a server felt as normal as using your laptop? Named servers instead of IPs. A file manager instead of SFTP. A terminal that understands plain English. Developers started doing things on servers they would never have attempted before, because they could see exactly what would happen first.

That is what CtrlOps is really about. Not replacing DevOps. Just making servers feel less like a minefield.

1 month free, no credit card needed. Try it and tell me what breaks. I read every reply.

What is the most stressful server situation you have ever been in?
32
回复

@parth_makwana07 The "bash (2)" tab hit too close to home 😭 We literally had a sticky note on the monitor saying which terminal was which. Absolute chaos.

What you built here is what everyone needed but nobody sat down to actually make. Congrats on shipping it — this is the one 🔥

5
回复

Some real stories from people using CtrlOps:

A developer noticed their server was running slow. Instead of guessing, they opened the AI terminal and typed "why is this server slow?" CtrlOps flagged an unfamiliar process consuming 90% CPU. Turns out it was a crypto miner. They identified it, killed it, and secured the server in under 10 minutes. Without CtrlOps, that miner could have run for weeks burning resources and money.


Another team spent 2 days manually debugging a production issue. Logs, SSH sessions, trial and error. Nothing. They connected the server in CtrlOps, asked the AI terminal what was wrong, and it pinpointed the issue in minutes. A misconfigured environment variable that was silently breaking things. Their own team could not find it in 48 hours. CtrlOps found it in one question.


This is what gets me excited. Not the features. The moments where someone solves a problem they thought required a DevOps expert. And they do it themselves.


Got a server horror story? I would love to hear it.

19
回复

@daxesh_italiya 48 hours vs one question. That's not a feature, that's a whole career moment for that dev 🤯

The crypto miner story is wild. Most teams wouldn't have found that for weeks. Genuinely scary how common that is.

This is the kind of product that makes you feel like you have a senior DevOps engineer on call 24/7. Love it 🚀

0
回复
deployments don't stress me out anymore, and that feels weird to say lol. paste repo, fill env, toggle SSL, done. Genuinely cannot remember the last time something broke mid-deploy since switching to this.
5
回复

@tocza I literally don't want to see that stress again that's why we built CtrlOps

0
回复

ok so the file manager sounds boring, I know. But I was doing everything through a separate SFTP client before this. separate login, separate window, separate headache every time.


now i just open it inside CtrlOps and edit configs directly. for someone managing multiple client servers, this is honestly the feature i use the most. more than the AI stuff even.

5
回复

@ga4p Thanks for giving it a try with CtrlOps and sharing your honest review!

0
回复

Managing Linux infrastructure through natural language sounds powerful but a bit terrifying from a security standpoint. Having a terminal assistant help debug server configurations could save hours of parsing logs. What kind of guardrails or confirmation steps are in place before destructive commands execute?

4
回复

@rivra_dev, totally fair concern and honestly the right question to ask.

Every command the AI generates has to be explicitly approved by you before it runs. It shows you the exact command, what it does, and waits for your click.

Nothing executes automatically, ever. On top of that, everything runs locally. Your credentials, SSH keys, and AI keys all stay on your machine, encrypted with AES-256. Nothing goes to any cloud or external server.

So the flow is always: AI suggests, you read it, you decide.

More like a very knowledgeable colleague showing you what to run than a bot that takes over your terminal.
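The "AI suggests, you read it, you decide" flow described above can be sketched as an approval gate in front of command execution. This is a minimal Python illustration under stated assumptions, not CtrlOps code: `run_with_approval` and its `approve` callback are hypothetical names, and the command runs on the local machine rather than over SSH.

```python
import shlex
import subprocess


def run_with_approval(command, explanation, approve):
    """Run `command` only if approve(command, explanation) returns True.

    `approve` is a hypothetical callback standing in for the user's click;
    it receives the exact command and a plain-language explanation.
    """
    if not approve(command, explanation):
        return None  # Nothing executes without explicit approval.
    # shlex.split avoids invoking a shell, so the approved text is what runs.
    return subprocess.run(shlex.split(command), capture_output=True, text=True)


def ask_in_terminal(command, explanation):
    """Example approver: show the command and wait for a typed 'y'."""
    print(f"Proposed command: {command}")
    print(f"What it does:     {explanation}")
    return input("Run it? [y/N] ").strip().lower() == "y"
```

A call such as `run_with_approval("df -h", "Shows disk usage per filesystem", ask_in_terminal)` would print the suggestion and execute it only after the user types `y`.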

1
回复

The “spreadsheet with random server IPs and commands” line is too real 😂

Really like what you’re building here.
Making server management simpler without needing full DevOps knowledge is a big win for a lot of developers.

Congrats on the launch @parth_makwana07, @hiren_kalariya & @daxesh_italiya 🚀

3
回复

@ajaypatel9016 Thanks a lot for supporting always :)

0
回复

finally something that replaces my mess of ssh tabs and random bash scripts. the playbook feature is underrated, set up my common fixes once and now its just one click. great launch guys..

3
回复

@prakash_vasani Love hearing this 🙌

That exact “too many SSH tabs + random scripts everywhere” pain is what pushed us to build CtrlOps in the first place.

Really glad the Playbooks feature is saving you time already. Appreciate the support and kind words a lot 🚀

0
回复

I have 100 servers in production, can I manage them using this application?

3
回复

@janaki_vasani Yes, you can easily manage them.

1
回复

This hits way too close to home. The "bash" and "bash (2)" terminal tabs alone gave me flashbacks 😅

The pain point you're solving is so real — server management has always felt like it was gatekept behind one person who "just knows" how everything works. The moment that person is unreachable, the whole team is paralyzed.

What really stands out to me is the plain-English terminal idea. Lowering that barrier means developers can actually own their environment instead of depending on a single DevOps hero. That's a huge shift in team dynamics, not just tooling.


The "named servers instead of IPs" detail is small but brilliant — it's the kind of UX decision that shows you built this from real pain, not from a whiteboard.


Congrats on the launch! Can't wait to see how teams adopt this. 🚀

2
回复

@anand_patel6 The "bash" and "bash (2)" situation is one of those things that is funny until it is 2 AM and prod is down, and you genuinely cannot tell which tab is which.

You put it better than we have in any of our own copy, honestly. The single DevOps hero problem is exactly what we kept coming back to while building this. It is not just a tooling problem; it is a team resilience problem. When one person holds all the server knowledge in their head, the whole team's ability to ship is tied to that person's availability.

The named-servers detail is one of those things that sounds too simple to matter until you use it every day. Every decision like that in CtrlOps came from something that actually happened to us, not something we designed on a whiteboard.

Really appreciate the thoughtful comment. Thank you for the support today, means a lot on launch day.

0
回复

genuinely curious how the web search feature works in the AI terminal. does it pull from the actual docs or just general search results? asking because we work with some niche tools and outdated answers are a real problem.

2
回复

@tejas_rangani Great question and a real one because most AI tools just hallucinate an answer for niche tools rather than admitting they don't know.

When web search is on, the AI searches the web in real time before generating any command, so it is pulling from actual current documentation, release notes, Stack Overflow, GitHub issues, whatever is most relevant, not from its training data.

For niche tools this makes a significant difference. If the tool released a breaking change 3 months ago, the AI will find that and give you the right command for the current version, not the one from 2 years ago.

Would love to hear which tools you are working with if you end up trying it.

1
回复

What’s the biggest time saver in practice, is it the scripts library or the AI generating the commands?

2
回复

@thamibenjelloun honestly depends on the workflow, but if I had to pick one, it is the AI terminal for the first few weeks, and then the scripts library takes over after that.

The AI terminal saves you in the moment. Something breaks, you do not know the exact command, you just ask, and it figures it out. That is the immediate relief.

But once you have solved the same problem a few times, the script library is where the real compounding happens. You save it once, add your variables, and next time it is one click across every server instead of going back to the AI or googling again.

So the AI gets you unstuck, and the script library makes sure you never get stuck on the same thing twice.
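
The save-once, fill-in-variables, run-everywhere flow described here can be sketched roughly as follows. This is only an illustration, not CtrlOps's actual implementation: the script text, the `$service` and `$lines` variables, the server names, and `render_for_servers` are all hypothetical.

```python
from string import Template

# Hypothetical saved script; $service and $lines are user-defined variables.
SAVED_SCRIPT = Template("journalctl -u $service -n $lines --no-pager")

# Hypothetical named servers (names instead of raw IPs).
SERVERS = {"web-1": "10.0.0.11", "web-2": "10.0.0.12"}


def render_for_servers(script, variables, servers):
    """Fill in the variables once and pair the command with every named server.

    Template.substitute raises KeyError when a variable is missing, so a
    half-filled script is rejected before it is shown for any server.
    """
    command = script.substitute(variables)
    return {name: command for name in servers}


commands = render_for_servers(SAVED_SCRIPT, {"service": "nginx", "lines": "50"}, SERVERS)
for name, command in commands.items():
    print(f"[{name}] {command}")
```

Rendering is kept separate from execution in this sketch so the exact command can still be reviewed per server before anything runs.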

2
回复

HR person commenting on a server tool, I know.

But whenever someone leaves the team, we need their server access gone immediately. Before this it was a whole back and forth with tech. Now I check SSH management myself and flag it in 2 minutes. Offboarding got so much easier, honestly.

2
回复

@chandni_hr, this is actually one of the most underrated use cases we heard while building it. The security risk of delayed offboarding is real, and it always falls through the cracks because it depends on someone from tech having bandwidth at exactly the right moment.

Glad SSH management is making that faster for you. That visibility should not require a tech person in the loop.

0
回复

The AI-assisted debug loop for Linux servers is something we've wanted at RetainSure for a while. Chasing down intermittent issues across multiple EC2 instances usually means a lot of context switching between logs, metrics, and SSH sessions. Does CtrlOps maintain state across a debugging session, so the AI can reason about what it already tried before suggesting the next fix?

1
回复

@anand_thakkar1 That is one of the more technically specific questions we have gotten today, and we appreciate you framing it that way.

Within a session, the AI terminal maintains full conversation context, so it knows what commands have already run, what the output was, and what has been tried. You can literally say "that did not fix it," and it will reason from there rather than suggesting the same thing again.

The multi-server piece is where it gets interesting for your EC2 use case. You can have separate AI terminal sessions open across instances, and the context is maintained per server. So if you are chasing an intermittent issue across three instances, you are not starting from scratch each time you switch.

Persistent memory across sessions is on the roadmap. Right now, if you close and reopen the terminal, you start fresh, but within an active debugging session, the context stays intact throughout.
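
A rough sketch of what per-server, in-session context could look like, assuming a simple in-memory store. `DebugSession`, `Attempt`, and the server names are hypothetical and not taken from CtrlOps; the point is only that history is keyed by server and disappears when the session object does.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Attempt:
    command: str
    output: str
    worked: Optional[bool] = None  # unknown until the user reports back


@dataclass
class DebugSession:
    """Hypothetical in-memory store; discarded when the terminal closes."""
    attempts: Dict[str, List[Attempt]] = field(default_factory=lambda: defaultdict(list))

    def record(self, server: str, command: str, output: str) -> None:
        self.attempts[server].append(Attempt(command, output))

    def mark_last_failed(self, server: str) -> None:
        # Models the user saying "that did not fix it".
        self.attempts[server][-1].worked = False

    def failed_commands(self, server: str) -> List[str]:
        # What a suggester would avoid repeating on this server.
        return [a.command for a in self.attempts[server] if a.worked is False]


session = DebugSession()
session.record("ec2-a", "systemctl restart app", "ok")
session.mark_last_failed("ec2-a")
session.record("ec2-b", "df -h", "/dev/xvda1 91%")
print(session.failed_commands("ec2-a"))
print(session.failed_commands("ec2-b"))
```

Creating a new `DebugSession` starts with empty history, which mirrors the "close and reopen, start fresh" behavior described above.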

Would love to hear how the intermittent issue debugging goes if you do try it on your EC2 setup.

0
回复

Making infra management conversational is clever but the hard problem is safe command scoping. An AI that can debug is also an AI that can accidentally drop a table. We've been careful at RetainSure about giving AI systems any write access to production infra. How do you scope what CtrlOps can actually execute? Restricted user, sandboxed session, or something else?

1
回复

@retain_dev, this is exactly the right question and the kind of pushback we genuinely welcome because it means you are thinking about it seriously.

CtrlOps does not execute anything autonomously. The AI generates the command and stops. You see the exact command, you read it, and you approve it manually. There is no background execution, no auto-run, no scheduled AI actions. The human is always the last step before anything touches the server.

So the scoping question becomes less about restricting the AI and more about who you give approval rights to inside your team. The AI can suggest a DROP TABLE, but it cannot run it. Your engineer still has to read it and click approve. That moment of human review is the actual guardrail.

For teams like yours that are careful about write access to production, the practical workflow is to use CtrlOps with a read-only SSH user on prod and a full access user on staging. The AI works the same way either way, but the blast radius if someone approves something they should not is contained.
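
The "generate, stop, wait for approval" flow can be sketched as a gate in front of the executor. This is a minimal illustration under stated assumptions, not CtrlOps code: `Proposal`, `run_with_approval`, and `fake_execute` are hypothetical names, and the executor is a stand-in rather than a real SSH call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Proposal:
    server: str   # shown to the reviewer so the target is never ambiguous
    command: str  # the exact text that would run


def run_with_approval(
    proposal: Proposal,
    approve: Callable[[Proposal], bool],
    execute: Callable[[Proposal], str],
) -> Optional[str]:
    """Call `execute` only after `approve` returns True; otherwise do nothing."""
    if not approve(proposal):
        return None
    return execute(proposal)


executed: List[str] = []


def fake_execute(p: Proposal) -> str:
    # Stand-in for an SSH call; a real executor could log in as a read-only
    # user on prod so that even an approved mistake has a small blast radius.
    executed.append(f"{p.server}: {p.command}")
    return "done"


rejected = run_with_approval(Proposal("prod-db", "DROP TABLE users;"), lambda p: False, fake_execute)
accepted = run_with_approval(Proposal("staging-web", "systemctl status nginx"), lambda p: True, fake_execute)
print(rejected, accepted, executed)
```

Passing the approver in as a function keeps the gate independent of how the human is asked, whether a button click or a terminal prompt.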

Would be curious what your current setup looks like at RetainSure if you are open to sharing.

0
回复

The approve-before-execute thing is what sold me. Every other AI tool just runs stuff, and you find out what happened after.

1
回复

@bhautik_kapadiya AI should not run blind; a human must be in the loop!

0
回复

Yeah I've done this exact thing. Wrong tab, wrong server, restarted nginx thinking I was on staging. Took down prod for an hour. The part where it shows you exactly which server you're on before anything runs is what got me.

1
回复

@abhikatrodiya that hour is something every developer remembers exactly. The sick feeling when you realize which server you are actually on is a very specific kind of panic.

That is honestly why we built the server context so visibly into every screen. You should never have to wonder which server you are on. It should be impossible to miss.

Glad that part landed for you.

0
回复

The preview step is the whole game when AI touches live infra. CtrlOps gets it right: ask in plain English, see the exact command before it runs, approve. Been running it alongside ClawMetry and the fit is natural. Congrats Hiren and team 🚀

1
回复

@vivek_chand exactly this. The moment you remove that approval step you have a tool that is impressive in demos and terrifying in production. We were never going to ship it any other way.

0
回复

This is gonna be the best experience for someone like me who doesn't like tinkering around with CLIs 🫠

1
回复

@ayush_pandey15 Even without deep DevOps knowledge, you can manage Linux servers, bro!

0
回复

Hey, went through CtrlOps's site and the AI-on-the-Linux-shell angle is what stuck with me. one thing on my mind, how do you handle destructive commands, is there a confirm gate before things like service restarts or migrations or does the agent just send it? blast radius is the part I'd want to understand here.

0
回复

this is exactly what i wanted last week debugging my railway worker at 11pm honestly 😭

q> does it work with hosted platforms that hide raw ssh access (railway, render, fly)? feels like that's where most indie devs are landing now instead of raw VPS. TY

0
回复

Congrats on the launch. The "AI suggests, you approve" framing is strong, especially for something as sensitive as servers.

Curious what users ask for first once they trust the AI terminal: safer debugging, deploy checklists, reusable scripts, or monitoring?

0
回复

the fact that i dont need to install any agent on my servers sold me immediately. got it running on our staging env and already caught 2 issues before they became outages. will be moving prod over soon

0
回复

@srushti_vasani That is the best kind of validation, catching things before they become incidents rather than debugging them at midnight.

The no-agent decision was non-negotiable for us from the start. Anything that requires you to touch every server before you can even use the tool creates friction and a security surface you did not ask for. Standard SSH and nothing else.

Really glad staging is working well. Would love to hear how prod goes when you make the move.

0
回复

Whenever I read about AI systems managing servers, it always scares me :) Is there no risk it could accidentally delete something?

0
回复

@natalia_iankovych Totally understand the hesitation, and honestly, it is the right instinct to have.

The short answer is that CtrlOps cannot delete anything on its own. The AI generates a command and then stops completely. You see exactly what it wants to run, you read it, and you decide whether to approve it or not. Nothing executes without your explicit click.

So even if the AI suggested something destructive like a delete command, it is sitting there waiting for you to approve it. That moment of human review is the entire point. We built it this way specifically because nobody should trust an AI with unilateral access to their servers.

Think of it less like an AI that manages your servers and more like a very knowledgeable colleague who tells you exactly what to type and waits for you to type it yourself.

0
回复
Congratulations on the launch!
0
回复

@makadiaharsh thanks for the support today.

0
回复

Yeah I've done this exact thing. Wrong tab, wrong server, restarted nginx thinking I was on staging. Took down prod for an hour. The part where it shows you exactly which server you're on before anything runs is what got me.

0
回复

@rutvik_vaghela That hour is something every developer remembers exactly. The sick feeling when you realize which server you are actually on is a very specific kind of panic.

That is honestly why we built the server context so visibly into every screen. You should never have to wonder which server you are on. It should be impossible to miss.

Glad that part landed for you.

0
回复

as a solo founder wearing the devops hat, this fills a gap i didnt know i needed filled. one dashboard to rule them all. solid launch

0
回复

@ruchita_italiya  Glad it landed. One dashboard was always the goal, not another tool to add to the pile.

Congrats on everything you are building, and thanks for the support today.

0
回复

the file manager feature is the one nobody talks about but everyone needs. separate SFTP client is such a pain when you just want to edit one config file.

0
回复

@abhishek_akbari exactly this. the SFTP client situation is one of those things where everyone has just accepted the pain for so long that they stopped noticing it.

0
回复
#6
Motion
A video agent for tasteful motion design
170
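One-line summary: Motion is an AI video agent focused on "tasteful" motion design. Through automated research, storyboarding, and an element-level editable workflow, it addresses the pain users hit when regenerating AI video over and over: results that are "almost but not quite good enough", costly revisions, and a lack of professional aesthetics.
Artificial Intelligence Animation Video
AI video generation, motion design, agentic AI, editable video, storyboarding, marketing video production, video editing, AI creative tools, product launch videos, design taste
User comment summary: Users broadly praise the element-level editing, saying they finally don't have to regenerate from scratch. Key questions: can a single element be swapped within one frame, and where does the "taste" actually come from (is there a style library)? One user also reported that the invite code did not work. The team replied that style can be controlled through the prompt, a link, or an uploaded design document.
AI Hot Take

Motion's appeal lies not in generation speed but in its pursuit of a finished, polished feel. It has spotted a fatal flaw in today's AI video tools: generation is fast and cheap, but revision is nightmarishly slow, and results are as random as a slot machine. The team's line in the comments, that "almost" good enough is not good enough, sums up the industry.

Its killer feature is post-generation editability. Turning AI video from a black-box output into an element-level project that can be taken apart and manipulated essentially replicates, for AI video, the paradigm shift from one-shot static image generation to an editable canvas. This capability is what promotes it from "generator" to "tool", something creators dare to use and want to use rather than just run demos with.

The risk, however, also lies in "taste". The team claims to have taste, but taste is inherently subjective and expensive: it depends on precise data, style libraries, and fine-grained CMF (color, material, finish) control. If style comes only from prompts or from scraping a website link, it can easily slide toward "passable but mediocre" in real multi-person collaboration and brand-consistency scenarios. Another hidden danger: once users edit everything by hand, does the underlying logic degrade into a traditional editor? If the balance between control and intelligence is off, Motion could end up as just another Adobe-style suite with AI features bolted on.

Verdict in one sentence: Motion has picked the right niche (motion design taste, ultra-fine-grained editing), but if it cannot build a reproducible "taste engine", it will eventually burn through its first-mover advantage, caught between "handing control to the AI" and "demanding manual polish".

View original post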
Motion
Motion is a frontier video agent for tasteful motion design. Give it a prompt with links, X threads, videos, assets, or references. Motion researches, storyboards, and creates explainers, launch videos, logo animations, or motion design for existing videos. Then edit everything directly: resize, drag and drop, modify elements, or iterate with chat.

Hey everyone, my name is Adish and I'm one of the founders of Mosaic, the company behind Motion.

Motion is our new frontier video agent made for tasteful motion design. Here's how it works:

1. Give Motion a prompt — include any context like product links, X threads, YouTube videos, personal assets, and more. Motion will reference styles, incorporate research, and storyboard scenes.

2. Motion creates the video — it can orchestrate entire explainers, launch videos, logo animations, or even add motion design to your own talking head videos. Motion has taste in its visual animation.

3. Edit everything — resize, drag and drop, modify elements, or iterate with chat. Tell Motion to iterate on selections or add new elements until you’re satisfied. No need to re-generate entire scenes or deal with hallucinated artifacts.
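
One way to picture the "edit an element instead of regenerating the scene" idea from step 3 is a scene stored as a list of addressable elements. The sketch below is purely illustrative: Motion's internal representation is not described in the post, and `Element`, `Scene`, and `edit` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Element:
    id: str
    kind: str      # e.g. "text", "logo", "shape"
    x: int
    y: int
    scale: float = 1.0


@dataclass(frozen=True)
class Scene:
    elements: Tuple[Element, ...]

    def edit(self, element_id: str, **changes) -> "Scene":
        """Return a new scene where only the matching element is changed."""
        if not any(e.id == element_id for e in self.elements):
            raise KeyError(element_id)
        return Scene(tuple(
            replace(e, **changes) if e.id == element_id else e
            for e in self.elements
        ))


scene = Scene((Element("title", "text", 100, 80), Element("logo", "logo", 40, 40)))
edited = scene.edit("logo", x=300, scale=1.5)
print(edited.elements[1])
```

Because untouched elements are carried over as-is, a drag or resize changes one record and leaves the rest of the scene exactly as generated.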

To see what Motion can do, check out our launch video, which was made entirely with Motion.

Motion is in early access but we are giving everyone that finds us on ProductHunt free invite codes. Sign up for Motion today with access code phmotion.

For support, please jump into our growing Discord community and if you're interested in API access or end-to-end produced launch videos, please schedule a quick call.

7
回复

@adishj great work but make the flow smooth

1
回复

@adishj This feels like one of the first AI video tools actually focused on taste instead of just generation speed.
The editable workflow after generation is the most interesting part; that's where most AI video tools still break today.

0
回复

@adishj I've been looking for an AI tool that generates motion designs to help ease my content creation workload, and this is exactly what I needed. Thanks, guys!

0
回复

Congratulations, Adish! Another awesome product by you!

2
回复

@zeng Thank you Zeng for being a supporter since day 1!

0
回复

every other video tool makes you start over if one frame is off. how granular is the editing.. like can you swap out a single element mid-scene or is it section level

2
回复

@tina_chhabra editability makes such a big difference — with something as visual as video, "almost" good enough is not good enough and the painful process of re-generating videos again and again just to run into more AI hallucinations burns a lot of time and credits. Motion solves that by giving you full editability at the element-level for every frame of your generated video!

0
回复

🔥🔥

2
回复

I have used this to edit my podcasts, trailers, and short videos. Perfect!! Saves a lot of time and is very good at motion design!! Highly recommended.

2
回复

@gaurav_mahajan2 Thank you for giving us a go already and becoming an early adopter!

0
回复

Thrilled that Motion is finally out! We've seen some stunning videos being created from our beta testers, such as launch videos, product demos, and more. Join our Discord community for the latest updates.

1
回复

Motion is here for anyone who wants their videos to actually look good, not just “done.” 
I’m part of the team behind Motion. We built it around the real taste and workflow of motion designers and video editors, instead of just making another generic AI agent that edits videos.

Our goal is to capture that crafted, human motion-design feel in an AI-powered tool, so the results look like something a professional would actually be proud to publish.

We’re also very excited about what’s coming next. Soon, you’ll be able to choose exactly how much control you want: dial in every detail like a motion designer, or hand more over to Motion so you can ship high‑quality videos in bulk with minimal effort.

1
回复

The edit individual elements instead of regenerating the whole scene part is the unlock everyone's been waiting for.

Every other AI video tool turns into a slot machine...pull the lever, hope, repeat.

Curious about the taste claim though. That's the hardest thing to get right and the easiest to fake in a launch reel. Is there a style library it pulls from, or is it composing from scratch per prompt?

Grabbing an invite. If it can turn a product link plus a few bullets into a real explainer, it saves me a week.

1
回复

@midori_verity agreed — the editability post-generation is a fundamental difference between AI video generation and our internal and modifiable representation of video.

With something as visual as video, "almost" good enough is not good enough and the painful process of re-generating videos again and again just to run into more AI hallucinations burns a lot of time and credits. Motion solves that by giving you full editability at the element-level for every frame of your generated video!

Styling can be included directly in the prompt through specific instructions around brand guidelines or simply by dropping in your own website link and having it pull the design system from there. We also allow you to upload your own Design MD or provide other videos as style references.

Try it out at https://motion.so. Let me know how it goes for you!

0
回复

Looks super helpful! Love the builder and the ability to post on social in the same flow.

1
回复

@olivier_roth Thanks Olivier! Give it a try at https://motion.so. Access code for our ProductHunt launch: `phmotion`.

0
回复

So exciting to see Motion going live 🚀

Being part of the team behind it, the fast edits, huge amount of undiscovered use cases, and professional grade outputs are all things we're incredibly proud to share. Definitely give it a shot and be sure to hop into our Discord!

1
回复

Team GitHits was looking for a product exactly like Motion. We were lucky to meet the Mosaic team at SaaStr. We love the product and will do our launch video with it.

1
回复

@jack_githits_com Thank you Jack for trusting us with your launch video. Can't wait to add it to the featured list here: https://motion.so/studio

0
回复

I used this to make my product videos, they turned out so well.
I saw an increase in engagement.

1
回复

@kennydop Love to hear it, thanks for being an early adopter!

0
回复

love the little video!

1
回复

@louislecat Thanks! Made entirely with Motion!

0
回复

Been testing the tool for a while… really nice output!

1
回复

@kabilan_g Thanks for being an early adopter Kabilan!

0
回复

Why does the redirect from Product Hunt show invalid access? The invite code is failing.

0
回复

Amazing! As a content creator lacking the creative bone, if you will, this is perfect for me. I can create the artistic, high-quality motion graphics that other tools lack, and I don't have to pay $1000s to have a graphic designer make them.

0
回复

The element-level editing is the right place to be opinionated. For product/launch videos, “tasteful” usually depends on constraints that are easy to lose: brand pacing, how much product proof appears before flourish, which animations feel off-brand, and which references are actually approved.

One thing I’d love to see is a small style-memory / “why this direction” panel: pulled from website, these references, these uploaded brand notes, then user corrections like “too kinetic” or “not enough product detail.” That would make the agent feel more like a reusable creative director than a one-off generator.

0
回复
#7
Chert
Build AI agents that text customers in iMessage
169
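One-line summary: Chert is a platform that lets businesses build and deploy AI agents on iMessage without complex development. It focuses on customer service, lead capture, and proactive outreach, addressing the difficulty of using a high-trust messaging channel at scale when Apple offers no official business API.
Messaging API Artificial Intelligence
AI agents, iMessage customer service, conversational commerce, Apple Business Chat alternative, CRM integration, outbound automation, intelligent customer support, messaging channel infrastructure, lead capture, Mac relay
User comment summary: Users strongly agree that iMessage has high open rates and trust. Their main concerns are: 1. technical feasibility (how to guarantee stable sending and handle rate limits without an official Apple API); 2. multi-platform support (for example WhatsApp); 3. feature scope (inbound only or also outbound, and whether lead pre-qualification is supported); 4. channel positioning (a replacement for or a complement to SMS/RCS). The founders' replies emphasize a Mac relay combined with health checks and distribution across devices for reliability, plus support for two-way conversations.
AI Hot Take

Chert cuts precisely into a "golden gap" most AI customer-service vendors have overlooked: iMessage. While everyone else crowds into web chat widgets, WhatsApp, and email, founder Gary has bet on the business formula of "high trust plus high open rates". Judging by the comment that iMessage is "not yet flooded with business messaging slop", the channel, before commercial messaging saturates it, has reach and conversion potential well above SMS.

The core product value is not "AI conversation ability": a ChatGPT wrapper is easy to build, and the hard part is the underlying infrastructure. What Chert really solves is the hard constraint that Apple does not open a commercial iMessage API. Its Mac relay approach (routing messages through real Apple devices) is not new, but it adds enterprise-grade features such as health checks, distribution across accounts and devices, and state-machine tracking, which builds a moat at the infrastructure layer. DTC brands and home-service companies that rely on HubSpot or Close CRM can fold iMessage into their existing lead-tracking workflow without building their own messaging layer.

The caveat is that the technical route carries single-point risk and is replicable. The stability of a Mac relay depends on Apple's software and hardware updates and on the risk of accounts being blocked. If Apple eventually ships an iMessage for Business API, the current relay model could lose its value overnight. Chert's window is therefore to "become the biggest player in this ecosystem before it goes official". For now, its first-mover advantage and its positioning as B2B messaging infrastructure look more durable than building a straight AI SaaS product. But whether it can evolve from "plumber of the blue bubbles" into an "intelligent conversation platform" depends on whether the team keeps investing in deep CRM integration and scenario-specific BPO solutions rather than stopping at a thin prompt-interface wrapper.

View original post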
Chert
Build and deploy conversational iMessage agents for customer service, inbound lead capture, and more. Simply configure the system prompt and tone, and you can create your own conversational iMessage agent for inbound handling, outbound follow-up, or whatever workflow you want to test. You can also integrate with CRMs like HubSpot, Close, or GoHighLevel to write back conversation histories.

Hi everyone, I'm Gary, co-founder of Chert!

We've spent the last six months building projects on iMessage for leasing companies, DTC startups, and home service agencies. Across these use cases, we kept seeing the same problem. Founders wanted to deploy conversational iMessage agents for customer service, lead capture, and outbound follow-up, but the underlying infrastructure was difficult to set up and hard to scale reliably.


This is what inspired us to build Chert, a platform for teams to create and deploy iMessage agents, powered by infrastructure that can send, receive, and automate conversations over iMessage at scale. There are a few main features that we believe will make Chert stable, reliable, and scalable:

- Chert provides comprehensive line health checks, making it safer and more robust for outbound use cases.
- Additionally, Chert offers a scalable pricing structure that lets teams scale to hundreds of lines and thousands of messages.
- Finally, Chert integrates with CRMs like Salesforce, HubSpot, and Close and tools like Vapi and Slack, so teams can easily add iMessage into their existing stacks at scale.

Feel free to try building and deploying your own iMessage agent through the Agents page in our website. No credit card required.

Would love any feedback or thoughts!

9
回复

@garygao this is actually a very smart angle.

everyone is building AI agents for email/web/chatbots, but iMessage is still weirdly underexplored despite insanely high engagement rates. feels like there’s a real infra opportunity here.

5
回复

@garygao Hi Gary, congrats on the launch. Any plans to add other messaging platforms (WA) later on?

4
回复

@garygao Congrats Gary, I really like how you approached this. It sounds like you kept seeing the same gap across different customer projects and turned that into infrastructure others can build on. I also like that Chert covers both the management layer and the API side, so it feels less like a one-off iMessage tool and more like a real channel teams can plug into their workflows.

0
回复

Congrats! This is cool. Are your customers typically coming from other iMessage tools and switching, or are they mostly introducing iMessage as a brand-new channel?

4
回复

@charlenechen_123 Mostly as a brand-new channel! Most of them actually use SMS/RCS before switching to iMessage!

0
回复

Do people usually use Chert to replace sms or have you seen circumstances where people use both sms/rcs and blue-bubble messaging?

3
回复

@kumar_ritesh21 I've seen circumstances of people using both. A lot of times, people use sms/rcs for auth and transactional messaging and iMessage for 2-way conversational messaging

0
回复

@kumar_ritesh21 We usually see Chert used alongside SMS/RCS, especially when teams already have an existing SMS flow. The main reason people add blue-bubble messaging is that the response rate and perceived trust are meaningfully different, especially in consumer-facing conversations.

0
回复

Getting iMessage delivery working for AI agents is a non-trivial integration challenge since Apple doesn't expose a public API. We've wrestled with customer communication channel tradeoffs at RetainSure, and iMessage reach in B2C contexts is real. How are you handling message delivery guarantees and rate limits at scale when the underlying transport doesn't give you standard webhook callbacks?

3
回复

@anand_thakkar1 We have a lot of robust health checks in place that prevent rate limiting at scale. We also have custom systems built out that support webhooks and expose APIs!
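
As a rough picture of how health checks might keep sending under rate limits, the sketch below skips unhealthy lines and picks the least-used healthy line under a conservative cap. Everything here is an assumption for illustration: `Line`, `pick_line`, and the `HOURLY_CAP` value are hypothetical and do not come from Chert or Apple.

```python
from dataclasses import dataclass
from typing import List, Optional

HOURLY_CAP = 20  # assumed conservative per-line limit, not a Chert or Apple number


@dataclass
class Line:
    name: str
    healthy: bool = True   # result of the most recent health check
    sent_this_hour: int = 0


def pick_line(lines: List[Line], cap: int = HOURLY_CAP) -> Optional[Line]:
    """Pick the healthy line with the fewest sends, or None if none is available."""
    candidates = [ln for ln in lines if ln.healthy and ln.sent_this_hour < cap]
    if not candidates:
        return None  # back off instead of forcing volume through
    chosen = min(candidates, key=lambda ln: ln.sent_this_hour)
    chosen.sent_this_hour += 1
    return chosen


lines = [Line("line-a", sent_this_hour=3), Line("line-b", healthy=False), Line("line-c", sent_this_hour=1)]
first = pick_line(lines)
print(first.name)
```

Returning `None` when every line is unhealthy or capped makes the caller queue the message rather than push past the limit.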

2
回复

@anand_thakkar1 Yeah exactly, that’s the core challenge. Since there’s no clean public iMessage API or standard webhook layer, we treat delivery more like an observed state machine than a simple callback system.

On rate limits, we stay conservative and distribute sending across accounts/devices instead of trying to brute force volume. For delivery guarantees, we track message state from the transport layer, retries, failures, response behavior, and operator-visible logs. It’s definitely less deterministic than email or SMS, but that’s also why the channel works so well when handled carefully.
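
Treating delivery as an observed state machine, rather than trusting a single callback, might look roughly like this. The states and the transition table are hypothetical, chosen only to illustrate the idea; they are not Chert's actual model.

```python
from enum import Enum
from typing import List


class State(Enum):
    QUEUED = "queued"
    SENT = "sent"
    DELIVERED = "delivered"
    FAILED = "failed"
    REPLIED = "replied"


# Hypothetical transition table; FAILED -> QUEUED models a retry.
ALLOWED = {
    State.QUEUED: {State.SENT, State.FAILED},
    State.SENT: {State.DELIVERED, State.FAILED},
    State.DELIVERED: {State.REPLIED},
    State.FAILED: {State.QUEUED},
    State.REPLIED: set(),
}


class TrackedMessage:
    def __init__(self, message_id: str) -> None:
        self.message_id = message_id
        self.state = State.QUEUED
        self.log: List[State] = [State.QUEUED]  # operator-visible history

    def observe(self, new_state: State) -> None:
        """Apply an observed state change, rejecting impossible jumps."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state
        self.log.append(new_state)


msg = TrackedMessage("m-1")
for seen in (State.SENT, State.FAILED, State.QUEUED, State.SENT, State.DELIVERED):
    msg.observe(seen)
print([s.value for s in msg.log])
```

Keeping the full log, not just the latest state, is what lets an operator see the retry that happened before the message was finally delivered.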

0
回复

@garygao Congrats on the launch! Really interesting product. Is there a way to try building an agent before talking to sales? I’d love to send a few test messages and see the developer flow end to end.

3
回复

@suyash_kr Yes, try out our agent builder, it's completely free and very easy to get started!

0
回复

Really cool. What are ur thoughts on trust and user experience? iMessage feels valuable rn bcs it's not yet flooded with business messaging slop.

2
回复

@sawan_kumar13 iMessage is a really high trust channel, and as long as people use it for 2-way conversational messaging and not spam, it'll remain as such!

0
回复

Do you support outbound or only inbound messaging?

2
回复

@vishalmehta8340 We support both warm outbound and inbound messaging!

0
回复

Big congrats on the launch. What does the integration actually look like? Is it mostly configuring the agent prompt and workflow in the dashboard, or do teams usually connect it into their existing stack too?

2
回复

@vishalmehta8340 Teams can connect it to their existing stack! We already support integrations in Salesforce, Hubspot, Slack, Attio, Close CRM, and more!

1
回复

@vishalmehta8340 Thanks! For most teams, the first version is just configuring the agent, workflow, and guardrails in the dashboard.

As they scale, they usually connect Chert into their existing stack so replies, leads, and status changes sync back cleanly.

0
回复

Could see this being really useful for businesses that get a lot of inbound interest. Are people using Chert to capture new demand right now or to mainly convert old demand they already missed?

2
回复

@jocky Both! There are companies using Chert to do lead reactivation and also others using Chert to do warm outbound!

1
回复
For someone like a home services company, could the agent qualify the lead as well?
2
回复

@ann_y1 Yes! The agent can conversationally ask questions and qualify leads based on that!

0
回复

Hey everyone, thank you all for trying it out!

1
回复

Congrats on this launch!

1
回复

@peng_wood Thank you!!

0
回复

What is the difference between Chert and SMS/RCS other than blue-bubble messaging?

1
回复

@sololizard iMessage open rates are a lot higher and the channel is more conversational! Also, iMessage sends attachments for free, so there's no split like the one between SMS and MMS.

1
回复

Love this! AI agents in a channel people already trust feels much more compelling than yet another app download. Can the agent proactively message users too, or is the product more focused on responding once a conversation begins?

1
回复

@ankit_rajput821 The agent can proactively message users as well!

0
回复

iMessage open rates are hard to beat. How does the underlying send mechanism work exactly? Apple doesn't have a public business API for iMessage - is this through Apple Business Chat, a Mac relay, or something else?

1
回复

@christian_knaut We're doing it through a stable Mac relay!

0
回复

Why only iMessage? We recently had a client asking us to build a similar service, but connected to all popular messengers and social networks for handling customer requests, which would then also be sent to the CRM. We looked for a ready-made solution, but couldn’t find anything that supported all the platforms they needed.

1
回复

@natalia_iankovych We're going to also expand to other messaging platforms like WhatsApp very soon as well!

1
回复

Can I bring my own numbers?

1
回复

@mia_qiao Unfortunately, not right now

0
回复

Do you support running different prompts for different stages, like qualification vs scheduling?

1
回复

@thamibenjelloun Yes, we can support agents that handle both qualification and scheduling!

0
回复

Are the numbers automatically assigned or do I get to choose the number that I’m using?

1
回复

@min_zhou Numbers are usually automatically assigned, but we can definitely help if you have a preferred area code!

0
回复

Is the main product like a rest api, webhook, sdk, or integration?

1
回复

@alexia_li Our main product is currently an API for sending and receiving iMessages, but we also support integrations into popular CRMs and tools like Salesforce, Hubspot, and Slack!

0
回复

@alexia_li We support all of them!

1
回复

Interesting launch.

I saw that Chert targets businesses like DTC brands and SMBs on the webpage.

Is Chert more of a no-code agent builder or something built for developers?

1
回复

@bsy0221 It can be both! We have a no-code agent builder that only requires prompting and an API for developers who want to host their own agents!

0
回复

Hey, was reading through Chert's site and the iMessage-as-support-channel framing is honestly wild. one thing I wanted to understand, what's the path for non-iPhone customers, is there an SMS fallback or is the product fully iOS-only by design? that fork basically defines whether this is niche or omnichannel.

0
回复

@axlerodd Yes, we have an SMS fallback in place for non-iphone users! We're also planning on expanding to additional channels such as Whatsapp!
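
An SMS fallback of this kind can be sketched as a small routing rule. This is an illustration under stated assumptions, not Chert's API: `choose_channel`, `send_with_fallback`, and the fake transport are hypothetical, and how iMessage support is detected is left out entirely.

```python
from typing import Callable, Optional


def choose_channel(supports_imessage: Optional[bool]) -> str:
    """Use iMessage only when support is confirmed; unknown counts as unsupported."""
    return "imessage" if supports_imessage is True else "sms"


def send_with_fallback(
    text: str,
    supports_imessage: Optional[bool],
    send: Callable[[str, str], bool],
) -> str:
    """Try the chosen channel and fall back to SMS if an iMessage send fails."""
    channel = choose_channel(supports_imessage)
    if send(channel, text):
        return channel
    if channel == "imessage" and send("sms", text):
        return "sms"
    return "failed"


# Fake transport for illustration: iMessage always fails, SMS always succeeds.
def fake_send(channel: str, text: str) -> bool:
    return channel == "sms"


print(send_with_fallback("Your appointment is confirmed", True, fake_send))
```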

0
回复

Congrats on the launch! Would love to understand more about how the inbound/outbound mechanism works. Is it basically an agent for each use case? Is the skill included out of the box?

0
回复

@hai_ta1 Agents can do both inbound and outbound! Also, inbound and outbound capabilities come directly out of the box via API endpoints!

0
回复

Building on iMessage means dealing with Apple's undocumented protocol quirks, and wrapping that behind a clean API is the real product. We've been building customer-facing AI agents at RetainSure and delivery reliability across channels is a constant headache. How do you handle delivery confirmations in iMessage? Does the API surface read receipts or are you inferring from response patterns?

0
回复

@retain_dev Yes, we can ensure high deliverability and know when a message is sent and delivered!

0
回复

The iMessage delivery angle is smart. B2B tools rarely nail async customer touchpoints, but buyers actually respond to texts. We're running customer success workflows at RetainSure and the biggest gap is getting responses to renewal nudges. CRM write-back via HubSpot is a nice touch. Does the agent handle multi-turn conversations well, or does it reset context between sessions?

0
回复

@dhiraj_patel5 Yes, the agent can handle multi-turn conversations! Also, your use case at RetainSure sounds like a perfect one for iMessage! Would love to talk more!

1
回复

Are you routing through registered businesses with iMessage for Business or solving it some other way? The leasing companies angle is sharp positioning. SMS open rates collapsed in the last couple years for those verticals and iMessage actually moves the needle here

0
回复

@artstavenka1 We're not routing through iMessage for Business right now

0
回复

Congrats! Amazing product!!

0
回复

@gideon_ge Thank you!

0
回复

pretty cool! Congrats for the launch!
Is there any way for the customer to talk to a human when the AI is not effective enough?

0
回复

@fberrez1 Yes! Human intervention and handoff is supported

0
回复

Are your customers typically coming from other iMessage tools and switching, or are they mostly introducing iMessage as a brand-new channel?

0
回复

@hanzhizhang0405 They're mostly coming to iMessage as a brand new channel. In fact, many of them switched from SMS/RCS to imsg!

0
回复
imessage agent is lit
0
回复

@hehe6z agree haha!

0
回复
#8
Monocle 3.5 for macOS
Noise-cancelling for your screen
149
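One-line summary: Monocle 3.5 is a macOS focus tool that blurs everything on screen except the active window, helping users cut visual distraction in multi-window setups and stay immersed in work and thought.
Productivity User Experience
macOS, focus tool, window management, screen noise reduction, app grouping, multi-monitor support, blur effect, productivity tool, Stage Manager, system-native experience
User comment summary: Users broadly endorse the concept and execution, and especially praise App Groups for turning a single-window focus tool into a daily driver for multi-window workflows. One user asked about resource usage; the developer replied that with blur on and idle it uses about 1.8% CPU on an M3. Another user is waiting to see whether the "feels like it came with your Mac" claim holds up.
AI hot take

Monocle 3.5's value lies not in "blur" but in "precise masking." What it solves is not "too many windows" as such, but the cognitive overload the human visual system suffers when switching between tasks, a psychological pain point that many productivity tools (plain window managers, split-screen apps) overlook. In going from 3.0 to 3.5, the developer clearly realized that single-window focus is only a "demo-grade" feature, and that only App Groups can bring it into real multi-window workflows. It is a leap from linear thinking to networked thinking.

But note the technical cost: as the developer admits in the comments, real-time blur rendering means higher CPU overhead than a plain dimmer, especially at the moment of window switching. The 1.8% idle usage measured on the developer's M3 is acceptable, but on Intel Macs or under heavy load the experience may degrade. The "feels native" claim is a double-edged sword: a seamless experience makes the app sticky, but any minor UI stutter will stand out immediately against the expectations it sets.

Overall, Monocle is not a macOS essential but a contact lens for "information-sensitive" heavy users (writers, designers, developers). It offers no features, only a state. In today's flood of tools, that "subtractive thinking" is a rare kind of clarity.

View original info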
Monocle 3.5 for macOS
Monocle 3.5 is a crucial follow-up to 3.0, and the one most asked for. With those long-awaited features finally in, the app just clicks. It feels complete.

Hey!

I'm Dominik, creator of Monocle, and I'm excited to share a new 3.5 update with you!

For those who don't know Monocle:

Think noise-cancelling, but for your screen.
Designed to actually feel like it came with your Mac.

You know that feeling when you sit down to work and your screen is a wall of tabs, open apps, and windows you'll come back to later? Monocle softly dims all of it except the window you're using with a simple cursor shake.*

Nothing closes, nothing changes. The noise goes quiet enough to work, write, browse, or think again.

…just wiggle your mouse.

(*or a shortcut, menu bar click, whatever's easiest for you)

Some context:

3.0 was Monocle rebuilt from the ground up. It was the new foundation. 3.5 is what makes it actually click. Most of the features here have been on the community's wishlist since 1.0 launched, and getting them all in finally feels like the version I always imagined.

What's new in 3.5:

App Groups: Group two or more apps so they share focus. The whole group stays clear when any of them is active.

Stage Manager Support: Monocle now detects Stage Manager automatically and keeps your window strip visible alongside the focused app. No toggle, no setup.

Multi-monitor Blur, Finally Fixed: The long-standing bug where only one display would blur is gone. Multi-display setups are now fully supported, even with "Displays have Separate Spaces" disabled.

Corner Peek: Flick your cursor to any screen corner for a quick peek under the overlay.

Cursor Reveal Effect: A new way the overlay appears and disappears, fanning out from your cursor's position. Especially cool across multiple displays.

Blur-Free Mode: Turn blur off completely and keep just tint, grain, and monochrome. For those who prefer the classic dimmer look.

Plus a loooong list of polish: smoother auto-hide for the Dock and menu bar, license activation across user profiles on the same Mac, new scripting commands, settings UI/UX improvements, and lots of bug fixes throughout.


50% OFF discount:

As always, to celebrate the new release, I've hidden a generous discount as a new easter egg on the Monocle website. The sale is running until May 31, 2026. Good luck!


Latest mentions:

"I tried the Monocle app for Mac and it completely changed how I handle digital distractions by blurring out everything except the window I am currently using" by Tom's Guide (5/4/26)

"Your Mac Is Missing All of These" by Snazzy Labs (4/11/26)

"10 Mac Apps That Will Change How You Use macOS in 2026" by MacRumors (12/24/25)


Oh, and make sure you read Monocle's Wall of Love, it's beautiful!
https://heyiam.dk/monocle/testimonials


Let me know your thoughts :)

Dominik

3
Reply

Using this for ages now. Love the concept and the implementation. Looks so beautiful that i am tempted to not keep app windows in full size 😅

0
Reply

the app groups feature is the one that makes this actually usable for multi-window workflows - single window focus tools always broke down the moment you needed a browser and a doc open side by side. solving that is what takes it from a neat demo to a daily driver

0
Reply

Why didn’t macOS have something like this from the start? :)

0
Reply

@idahansen Right? Hopefully someday they take the hint :) ...or rather not? :D

0
Reply

I hate when too many windows are visible :) I keep every window maximized fullscreen so the others aren’t visible at all - they distract me. Though I have a MacBook Air, so the screen is small. With large monitors, that’s probably not as convenient.

0
Reply

@natalia_iankovych Makes sense on a smaller screen - fullscreen is probably the way to go there. On a larger display fullscreen just doesn't work for me. And honestly the blur is so freaking calming and satisfying, I just prefer this 😄

1
Reply

the 'designed to feel like it came with your Mac' line is either accurate or it isn't and you find out in the first five minutes. that's the bar they've set for themselves which is either confident or reckless depending on the execution

0
Reply

@ansari_adin 7-day trial, no card required - let me know :)

0
Reply

I've seen your project at 1st launch I think. I find it very cool, congrats for the idea! I'm going to give it a try for sure

Question: I want something lightweight, do you have any stats on how much resources it takes when it is running? Or in the background?

cheers

0
Reply

@fberrez1 Hey, thanks so much - really glad you've been following since the early days!

Monocle 3.5 is the most optimized version so far, but I still have one or two more rounds of performance work planned for the coming weeks.

That said, to set expectations: compared to traditional black overlay dimmers, the CPU impact is of course higher - especially during window switching, since Monocle is doing actual blur rendering.

When blur is ON and idle, I'm seeing around 1.8% CPU on my M3. When Monocle is running in the background while not active, it sits at 0%.

Hope that helps.

Dom

1
Reply
#9
Voker
The Agent Analytics Platform for AI Product Teams
142
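One-line summary: Voker is an agent analytics platform for AI product teams. A lightweight SDK automatically captures behavioral data such as user intents, corrections, and resolutions, addressing the black-box performance and missing monitoring of agents in production.
Analytics Developer Tools Artificial Intelligence
AI agent analytics, user behavior tracking, agent performance monitoring, conversation analytics, automatic intent detection, production monitoring, agent optimization, SaaS tool, developer tool, data-driven
User comment summary: Users broadly see value in automatic intent and correction detection. Their questions center on handling multi-intent sessions and handoffs between agents, custom metrics and support for semantic variation, tracking decision branches and performance regressions, and whether the analysis can directly trigger fixes. Some users suspect that "correction" labels will be thrown off by semantic ambiguity and redirect scenarios, which would skew correction rates.
AI hot take

Voker hits the most hidden pain point in deploying AI agents: once LLM-generated conversation flows become the new "black box," traditional logging and tracing tools offer only fragmented call stacks and cannot answer the core question, "did the user actually get a good outcome?" Its three-layer automatic annotation (intent, correction, resolution) essentially builds a quantifiable good/bad yardstick within a chaotic semantic stream, a monitoring dimension closer to the business than token counts and latency statistics.

But the product is currently far less sharp than its insight. Judging by the many follow-up questions in the comments about multi-agent handoffs, mid-conversation intent drift, and domain vocabulary for correction detection, Voker's general-purpose model can easily end up with annotations that are "correct but meaningless to the business" in highly customized production scenarios. In particular, the founder admits that Voker today has no good way to tell a genuine resolution from "passing the buck" in a handoff, which directly undermines its credibility in complex business flows: if even the core "resolution" metric can be contaminated, subsequent optimization decisions risk being built on sand.

The real moat lies not in how many raw conversations are captured, but in whether data interpretation aligns with business logic. The "Amplitude for agents" pitch is right, but Amplitude's value rests on user events being accurate and structured. Voker today looks more like a powerful event tagger driven by LLMs plus hierarchical classification. What to watch next: first, whether users can cheaply correct wrong automatic annotations (a feedback loop); second, whether agents can use the analysis through an API to iterate on themselves. If it only offers a plausible-looking dashboard, it is just another form of "AI chicken soup": it looks nourishing, but may not cure anything.

View original info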
Voker
Voker is the Agent Analytics Platform for AI product teams. It gives you the usage behavior and agent performance insights you need to monitor and optimize your production agents at scale. Install the lightweight, provider-agnostic SDK and Voker handles the rest: automatic intent, correction, and resolution detection on your user-to-agent interactions, conversation reconstructions, queryable timelines, and agent performance tracking, so you can build the best agents possible.

I’m Tyler - CoFounder of Voker, and I’m so tired of being disappointed by AI hype claims. I bet you are too.

I studied physics in college, and worked in data science, ML, and analytics until founding Voker. I’m a skeptical person by nature (I think it's the scientist in me) and my gut reaction to any technology hype is to be cautiously optimistic until I see things proven out in data.

I felt this way about LLMs when they first hit mainstream. I knew they had real potential applications, but was also worried about the lofty marketing buzz they were getting.

AI as an industry has written checks that individual builders are left to cash. Promising full automation, PhD-level intelligence, and perfect results. As someone who's skeptical of that narrative, I still believe agents can genuinely deliver, but only if teams are rigorous about measuring performance in production. Every website or product has Amplitude or PostHog for click and pageview analytics; a standard way to understand who's using it and how. Agents have no equivalent, so we built Voker.

We are the Agent Analytics Platform where you can:

- Monitor your agents
- Measure their performance
- See what users are asking
- Know for certain agents are delivering for your users
- Optimize based on real data

You install our SDK, and Voker collects your agent conversation data, automatically detecting:

- User intents (Book me a hotel in Vegas for next Saturday with a poolside view)
- Corrections (No, that room doesn’t have a poolside view!! TRY AGAIN)
- Agent resolutions (Tool Result: Room Booked... Success!)

These automated annotations are the foundation for building a holistic view of agent performance and user behavior in one analytics platform.
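As a rough illustration of how annotations like these could roll up into per-intent metrics, here is a minimal Python sketch. The `Annotation` record, its field names, and the `intent_metrics` helper are hypothetical and chosen for this example; they are not Voker's SDK or data model.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record for one annotated conversation; the field names are
# illustrative assumptions, not Voker's schema.
@dataclass
class Annotation:
    intent: str        # e.g. "book_hotel"
    corrections: int   # number of user corrections detected
    resolved: bool     # whether the agent reached a resolution


def intent_metrics(annotations):
    """Return {intent: (resolution_rate, corrections_per_conversation)}."""
    grouped = defaultdict(list)
    for a in annotations:
        grouped[a.intent].append(a)
    metrics = {}
    for intent, items in grouped.items():
        n = len(items)
        resolution_rate = sum(a.resolved for a in items) / n
        corrections_avg = sum(a.corrections for a in items) / n
        metrics[intent] = (resolution_rate, corrections_avg)
    return metrics


sample = [
    Annotation("book_hotel", corrections=0, resolved=True),
    Annotation("book_hotel", corrections=2, resolved=False),
    Annotation("stock_price", corrections=1, resolved=True),
]
print(intent_metrics(sample))
```

Computing the same table before and after a prompt or model change would be one simple way to spot a regression on a specific intent.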

We asked 100+ AI founders, product managers, and agent engineers how they monitor their agents in production and the answer was resounding: by combing through individual traces (with the occasional evals sprinkled in). They all reported that they depend on customer complaints to tell them when agents are messing up. We feel strongly that there is a third leg of the agent monitoring stool missing - Agent Analytics.

You shouldn’t have to wait for users to complain to learn that a recent prompt change is breaking your hotel booking agent, or that the AI finance advisor you built is calling the wrong tool to look up realtime stock prices.

Turns out the antidote to AI hype is simple: measure your agents diligently, then iterate until you get it right.

Your users deserve better AI experiences (we all do)!

Install the Voker SDK on our free tier (up to 2,000 events/mo), and start building better agents today:
https://voker.ai/

21
Reply

@tyler_postle Hey Tyler — congrats on the launch 👋

The "third leg of the agent monitoring stool" framing really resonates. I'm running a few agents in production myself (Telegram + VK Teams bots fronting an OpenClaw agent), and the gap I keep hitting isn't detecting that something went wrong - it's reconstructing why. Logs show the tool calls, but the model's reasoning between turns is gone unless I instrument it manually.

Quick question: does Voker capture the reasoning/thinking blocks, or just the user-facing turns and tool I/O? That's basically the line between "agent monitoring" and "agent debugging" for me.

Either way - good to see someone taking the analytics angle seriously instead of just shipping another eval framework. Will give the SDK a spin this week.

7
Reply

@tyler_postle Can Voker track performance regressions after a prompt, model, or tool change, and show whether success rates dropped for specific intents?

5
Reply

amazing team building something that’s really needed! Congrats on the launch 💗

8
Reply

@ay_ush Appreciate you guys! Autumn has made billing seamless - and for a small but mighty team like ours that's a huge time saver and value add. Esp love that you're purpose built for AI products <3

4
Reply

Really cool that we can get an idea what people are using our agent for. The downside of having a powerful agent is that you don't always understand what people use it for and where it is not meeting expectations.

6
Reply

@veskost I guess being great at building agent products is a double-edged sword! Appreciate all the feedback you and the team have given to us to help make Voker better. We see Lightfield as the north-star for a great Agent product experience!

4
Reply

Oh this looks really interesting. How much of the setup is out of the box vs customizable?

6
Reply

@mejackreed great question! for launch we focused on making it super easy to get started, so we invested lots of time in great out-of-the-box automated annotations and analytics. (We kept hearing that obs tools took TOO much time investment to get insights, so we wanted to solve that problem!)

Next up is custom metrics - so as you get more advanced with your analysis, you can go beyond our out of the box detections and analytics!

6
Reply

Most observability tools treat agent calls as black boxes, logging tokens but missing the decision loop entirely. Building RetainSure's AI workflows, we struggled to attribute downstream outcomes back to specific agent choices. Our logging was ad hoc and we ended up rebuilding it multiple times. Does Voker capture branching decisions when an agent picks between tool calls, or is it focused on input/output tracing?

6
Reply

@retain_dev Yes! Voker automatically tracks all the information your agent is provided to make its decisions, so you can see both the tools available and the tools used. This has helped our customers notice that their tools may need new descriptions when the agent has what it needs but isn't calling the right tool.

5
Reply

Automatic intent and resolution detection is the right abstraction. Most agent monitoring tools just log tokens or latency, but you actually need to know if the user got what they came for. We're building AI-driven customer success at RetainSure and agent quality drift between deployments is a real headache. How does Voker handle cases where the user's intent shifts mid-conversation?

5
Reply

@dhiraj_patel5 We're actually purpose built for complex, long running, multi-intent conversations! When our SDK detects multiple intents within a conversation, they get categorized into "Session Paths" that show up in our session timeline. This way you can easily navigate to different parts of the conversation without scrolling through the whole session. You can also analyze the accuracy of the agent on these separate intents across other surfaces in our product.

6
Reply
What’s the feedback cycle? Can we launch other agents to fix issues?
5
Reply

@lakshminath_dondeti Today most teams link the dashboards or screenshots to their coding agents to implement fixes. We're working on releasing Analytics APIs so your agents can directly query Voker, make changes to prompts/harnesses/code/tools and ship fixes on its own!

6
Reply

Love the brutal honesty here. AI has definitely written checks that devs are stuck cashing in production. Quick question on the SDK: how does it handle semantic variations for corrections? Will it catch things like "actually, scratch that" versus "no, that's wrong" out of the box, or do we need to train it on our own domain vocabulary?

5
Reply

@vikramp7470 Good question - Voker will detect those kinds of phrases, even with semantic variation. That being said, if you have super specific domain vocabulary, where two words might mean the same thing to a layperson but not to you as a domain expert - then you will need to pass Voker some context in the form of either knowledge docs or feedback on our annotations (APIs for these are in the works!)

thanks Vikram!

8
Reply

Congrats on the launch!
How does Voker handle intent attribution when the agent proactively redirects the user, say, a billing agent that detects the user is actually in the wrong product area and routes them elsewhere? The intent the user arrived with and the intent the agent resolved can diverge legitimately, and in those cases it's not clear whether that should register as a correction event or a successful resolution. Curious how the analytics model handles that distinction, since getting it wrong would skew correction rates significantly for agents designed to reroute.

4
Reply

@binu_george Love this question. Today we don't have an explicit way to tie two agents together. We know this is critical because most scaled agent products have multi-agent handoff systems like you mentioned.

What we have customers do today is treat the handoff as a successful resolution.
Of course, sometimes this is truly the resolution (in the case of an orchestrator agent, for example), but sometimes it's actually just passing the buck. We don't have a good way to differentiate these situations today, other than decoding the name and description of the agent it's handed off to, in addition to any other information you send to us through our events SDK.

We absolutely intend to build direct features to support this pattern better because it's very common.

Thanks for the question!

2
Reply
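One way to see how much the handoff question in the reply above can move the numbers is to report the resolution rate twice, once counting handoffs as resolutions and once excluding them. The Python sketch below is hypothetical: the outcome labels and the `resolution_bounds` helper are assumptions for illustration, not part of Voker.

```python
def resolution_bounds(outcomes):
    """Return (rate_with_handoffs, rate_without_handoffs).

    `outcomes` is a list of labels, each one of the assumed values
    "resolved", "handoff", or "unresolved".
    """
    total = len(outcomes)
    if total == 0:
        return (0.0, 0.0)
    resolved = outcomes.count("resolved")
    handoffs = outcomes.count("handoff")
    # Upper bound: every handoff is treated as a successful resolution.
    upper = (resolved + handoffs) / total
    # Lower bound: every handoff is treated as unresolved ("passing the buck").
    lower = resolved / total
    return (upper, lower)


outcomes = ["resolved", "handoff", "handoff", "unresolved"]
print(resolution_bounds(outcomes))
```

A wide gap between the two numbers suggests that the headline resolution rate depends heavily on how handoffs are classified.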

Prompting Vibes definitely don't scale when agents start failing silently in production. Being able to catch tool errors before a client screams at us is a huge lifesaver. great job @tyler_postle

4
Reply

@priya_kushwaha1 thanks Priya, you're not the first to tell us this, glad to see it resonates!

5
Reply

How do you determine the quality of answers? I have an AI service with its own vector database. For almost any user question, we know the answer, provide tourist attractions, and we have more of them than ChatGPT. Will you be able to understand whether these are top-tier attractions or not?

4
Reply

@natalia_iankovych When you send the information from your vector DB to your agent, Voker will also track that context. We'll use the information from your own RAG data to make our assessment on the quality of the response to the user! Essentially any information that your agent has to make its decision - Voker will also track and assess.

6
Reply

Hey Tyler, went through Voker's site and the "Amplitude for agents" framing is honestly the cleanest take I've read on this gap. one thing I wanted to ask, how do you detect a "correction" automatically, is it sentiment delta on the next user message or something pattern-based? that label seems to do a lot of work in the product.

2
Reply

@axlerodd good to know that "Amplitude for agents" resonated.

We detect corrections by processing user messages across multiple turns, and evaluate them within the context of the conversation and the original user intents that were detected. We use LLMs for language processing, and then we have a technique for hierarchical classification to categorize atomic annotations like intents and corrections into more general and insightful categories (you don't want to have to read a list of 1000s of corrections, you want a theme of "the agent is too happy" or "the agent claims it has tools it doesn't").

Does that help? Maybe we should add better examples on our homepage?

1
Reply

Do you also handle multi-agent, multi-turn orchestrations?

2
Reply

@raj_peko Yes! Our SDK is architected to ingest multi-agent, multi-turn orchestrations! That being said, we still have many more dashboards we want to add to make good use of that data. For example we're working on user journey visualizations so you can see how users get handed off to multiple different agents.

We also need to do more optimizing of our automated annotations (intents, corrections, resolutions) to make them even better for multi-agent conversations (especially those where it's not just a simple handoff, but a multi-player conversation).

1
Reply
#10
Insights by Omnia
Step-by-step action plans to improve your AI visibility.
142
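One-line summary: Insights by Omnia turns AI-engine visibility monitoring into actionable plans, letting content and marketing teams skip tedious manual citation analysis and go straight to step-by-step plans for improving their visibility in AI search results.
Marketing Artificial Intelligence
AI visibility monitoring, SEO action plans, content optimization, citation gap analysis, marketing automation, brand monitoring, multilingual support, per-engine customization, collaborative sharing, data analytics
User comment summary: Users ask about sharing action plans across teams (for example with owner tags), whether plans are tailored per engine (ChatGPT vs Perplexity), how citation volatility is handled for multi-language blogs, the visibility measurement methodology (sampled vs deterministic), and whether plans clearly separate "update owned content" tasks from "earn external citations" tasks.
AI hot take

Insights by Omnia addresses an industry-wide problem: the "last mile" of AI visibility monitoring. Most tools stop at telling users where they are mentioned, leaving the most time-consuming work, citation analysis and action planning, to the customer. Omnia differentiates by upgrading "data display" to "decision output": it breaks tasks down by impact level, implementation steps, and citation gaps, trying to bridge insight and execution.

Judging by user feedback, the core value has been tentatively validated: action plans generated per AI engine (such as ChatGPT or Perplexity) and tags that distinguish "owned content" tasks from "external citation" tasks speak directly to multi-team collaboration. In particular, per-country prompt tracking and insights address a real cost for global marketing teams, which takes the product beyond a simple dashboard tool.

But the problems are just as clear. The product relies on long-running monitoring data to be effective, yet the weekly churn in cited sources raised in the comments could mean that Monday's high-priority plan is stale by Friday. If action plans cannot adapt to such high-frequency change, their value as "plans" decays quickly. In addition, while surfacing citations from external sources in the RAG layer is a highlight (the Nike example), the specificity of this data and the difficulty of obtaining it remain unknown: whether users can really see "which specific pages should add information about the brand" determines whether the "action" is real.

The real test: as users move from monitoring to execution, can Omnia show that its action plans (especially the technical SEO and third-party outreach parts) produce measurable gains in actual SEO rankings or AI citation rates? If it merely repackages manual analysis in a new UI, it is at best an efficiency tool, not a growth engine. For now the direction is right, but it remains fragile.

View original info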
Insights by Omnia
Omnia Insights turns any prompt you track into a prioritized action plan to improve your visibility in AI engines. Each plan includes impact level, why it matters, how to implement, and the citation gaps to close.
Hey Product Hunt 👋 Daniel here, cofounder of Omnia.

We're back a few months after our last launch, when this community pushed us to #3 Product of the Day. Thank you for that, seriously.

The feedback we kept hearing afterwards was the same: "Omnia shows me where I'm invisible in ChatGPT, cool. But what do I do about it?" So we're launching the answer: Insights.

What it does? Pick any prompt you're tracking. Pick the engine where you want to improve. Omnia generates a prioritized action plan, built from your real citation data.

Each action includes:

❓ Action type: owned content, third-party outreach, technical SEO, social, and more.
🎯 Impact level: how much this task is likely to move your visibility.
🧠 Why it matters: the reasoning, pulled from the real citation data behind the prompt.
🛠️ How to implement: concrete steps you can hand to a writer or dev.
🔍 Citation gaps: the specific sources citing competitors but not you, including the ones shaping sentiment around your brand.

Why now? Monitoring AI visibility has become table stakes. Most tools, including ours until today, stop there. The next step is acting on it without spending 6 hours doing manual citation analysis per prompt. That's the gap Insights closes.

Who it's for? Content and marketing teams, SEO leads, in-house marketers, and agencies running AI visibility for multiple brands. Anyone who's tired of staring at a dashboard wondering what to ship next.

Huge thanks to everyone who gave us feedback after the last launch. Insights exists because of it. Drop questions in the comments, I'll be here all day 🙌
9
Reply
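The "citation gaps" described in the launch post above (sources citing competitors but not you) amount to a set difference. A minimal Python sketch follows; the `citation_gaps` function and the example domains are made up for illustration and do not represent Omnia's implementation or data.

```python
def citation_gaps(competitor_citations, own_citations):
    """Return sources that cite at least one competitor but not your brand.

    competitor_citations: dict mapping competitor name -> set of source URLs
    own_citations: set of source URLs that cite your brand
    """
    cited_by_competitors = set()
    for sources in competitor_citations.values():
        cited_by_competitors |= sources
    # Sources the competitors have and you lack, sorted for stable output.
    return sorted(cited_by_competitors - own_citations)


# Hypothetical example data.
competitors = {
    "brand_a": {"reviews.example/top-shoes", "blog.example/running"},
    "brand_b": {"blog.example/running", "news.example/gear"},
}
ours = {"blog.example/running"}
print(citation_gaps(competitors, ours))
```

Because cited sources can change between monitoring runs, the inputs would presumably need to be recomputed per run rather than cached.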

@danimirror congrats on the launch! it looks like an interesting tool!

0
Reply

@danimirror Congrats on the launch Daniel. Nice tool

1
Reply

@danimirror yeah, the 6h manual citation analysis is the actual cost. running this for a multi-language blog ourselves, the diff between english and non-english citation sets is where the hours go, not the visibility numbers. the thing i can't quite picture from the screenshots is how impact ranking handles citation volatility between weekly runs. cited sources drop in and out faster than the actions targeting them can be shipped, so a "high impact" plan from monday can read stale by friday. it would help to explain how prioritization holds up week to week.

0
Reply

Love this, guys!! Btw, does it break visibility down by country/market too, or is it focused on global signals for now?

1
Reply

Hey @munis_abbas!

Yes, you can track prompts and generate per-country insights. Citations in the UK rarely match the ones in Spain or the US, so with Omnia you monitor and act on the specific market you select.

1
Reply

Great work guys!

1
Reply

This is cool!! How can I share these actions to other teams?

1
Reply

Hey @matiszz! That's a great one!
Every insight has a share button so you can send it to your other marketing teams: Content, SEO, etc.

0
Reply

Let's go! Congrats on the launch 🚀

How long do I need to monitor a prompt before I can get an insight?

1
Reply

Hey @gulipad! Thanks for the comment!

You can generate insights as soon as you start monitoring the prompt, but it's not the best approach. The more monitoring time you have, the more citation data Omnia will have to work with, and the more powerful the insight will be.

0
Reply

Love this feature!

What great work from the whole team building it!

1
Reply

What works for ChatGPT visibility doesn't necessarily work for Perplexity. Are the action plans tailored per engine? Congrats on the launch!

0
Reply

Hey, went through Insights by Omnia and the brand-in-LLM-answers tracking is a smart wedge. one thing I wanted to ask, given LLM outputs vary call to call, how do you measure visibility, are you running sampled prompts on a schedule or do you have something more deterministic under the hood?

0
Reply

That’s helpful — the tagging is exactly the piece I was hoping existed.

The next thing I’d watch is whether the insight carries enough “why this owner” context when it gets shared. A content person, SEO lead, and partnerships/PR person may all see the same citation gap, but each needs a different next action and confidence level. If the share/export preserves that owner-specific reasoning, it becomes much easier to hand off instead of just discuss.

0
Reply

Strong direction — monitoring without a next action is where teams tend to stall.

One thing I’d look for in the action plan is a clean split between “create or update owned content” and “earn/repair external citations.” They usually have different owners, timelines, and success signals. Are you exposing that split clearly in the plan/export so content, SEO, and partnerships/PR can each pick up their part?

0
Reply

Hey @jim_jeffers!
Yes! We "tag" each insight depending on its nature. Attaching a few examples here to show the differences.

0
Reply

I’ve tested more than 10 similar services. Everyone focuses on internal website optimization, and very few show which data ChatGPT was trained on, even though external sources are the most important once the site itself is already optimized. Can you show external websites and specific pages on them that were included in the model’s training and where information about us should be added?

0
Reply

Hello @natalia_iankovych!
It's not only the training data, but also the data in the RAG framework.

For instance, I'm attaching an insight for a Nike prompt, based on a source (citation) from an external website. You can see it here: https://share.useomnia.com/shares/b38b5d07-5ee5-41ac-b125-e6529decccf6.html

0
Reply
#11
imgproxy v4
A fast and secure self-hosted image processing server
135
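One-line summary: imgproxy v4 is a self-hosted image processing server. It targets companies whose third-party image services have led to runaway costs, limited performance, and vendor lock-in, or whose home-grown solutions are time-consuming and error-prone, and it processes images on demand as a drop-in replacement for in-app image-processing code.
Software Engineering Developer Tools GitHub Tech
image processing server, self-hosted, open source, image resizing, performance optimization, real-time image processing, Docker deployment, enterprise-grade, API alternative, cost control
User comment summary: Users endorse imgproxy's idea of processing on demand instead of repeatedly re-preparing assets. The CEO stresses that self-hosting avoids third-party costs and lock-in. A developer asked how the new internal cache in v4 handles invalidation when a source image changes (whether it checks ETag or Last-Modified headers); the reply said imgproxy does not handle this case and advised changing the filename whenever the content changes.
AI hot take

imgproxy v4 is smartly positioned: it is not competing with in-house systems on flexibility, but with the "SaaS money pit" for market share. Its core value is not technical novelty but a restructured cost model: a one-off ops investment replaces an ever-growing API bill. For mid-size and large teams processing millions of images a day, this is a classic case of economics beating technology.

However, the product has a clear blind spot in its engineering thinking: it sidesteps the "dirty work" of cache invalidation and bluntly tells users to change the filename. In dynamic-content or CMS systems this adds considerable operational burden, and it exposes a technical character that is "fast but not smart enough." New v4 features such as RAW format support and image classification are decent increments, but the value of the internal cache was questioned directly in the comments, suggesting the technical details have not kept up with the marketing.

Overall, imgproxy is an excellent cost-cutting tool, but still some distance from the "enterprise-grade solution" it claims to be. Truly enterprise-grade means handling not only extreme traffic but also data consistency under complex business logic. If a team can accept the hard constraint that changing an image means renaming it, imgproxy is indeed a cost-effective open-source alternative; otherwise, extra compensating work at the front end or CDN layer will still be needed, which may outweigh the savings.

View original info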
imgproxy v4
imgproxy is a fast and secure standalone server for resizing, processing, and converting images. The guiding principles behind imgproxy are speed, security, and simplicity. It is well equipped to handle huge volumes of image-processing requests with ease. imgproxy is a drop-in replacement for all the image-processing code in your web application. With imgproxy, you don’t need to repeatedly re-prepare images to fit your design every time it changes, as imgproxy does this on demand.

Hey Product Hunt! 👋

Sergei here, the author and CTO of imgproxy.

Almost every product I have been involved in has dealt with images in some manner. So I know the pain of making the right choice for image processing. Building your own image processing solution is a huge time sink, and it can easily be done wrong. On the other hand, using a third-party service quickly becomes a money pit as your traffic grows.

That's why we created imgproxy – a self-hosted image processing server that just works. It's packed in a single Docker image, deployable anywhere in minutes, with no third-party dependencies. Combine it with a caching layer of your choice, and you have an enterprise-grade image processing solution that can handle any load.

Since its first release back in 2017, imgproxy has been adopted by thousands of developers and companies worldwide. You probably see images processed by imgproxy every day without even realizing it: Medium, Dribbble, Substack, and many more use imgproxy to serve their images.

imgproxy comes in two flavors: a forever-free open-source version and a paid Pro version with additional features and support. The Pro version is available as a subscription and includes features such as video preview generation, object detection, SVG minification, and more. Both versions are self-hosted, so you have full control over your data and infrastructure.

Today, we are happy to announce the new major release of imgproxy, version 4.0. This release includes a lot of new features and improvements, such as:

  • Better performance thanks to parallel source image downloading and processing

  • Internal cache for processed images

  • Support for digital cameras' RAW formats

  • Image classification

  • Improved SVG minification

  • ... and much more!

Check out the full release notes for all the details: https://github.com/imgproxy/imgp....

Drop us a line in the comments if you have any questions or feedback. We are always happy to hear from the community!

8
Reply

@darth_sim Hey, really liked the direction behind imgproxy.

The idea of handling image processing on demand instead of forcing teams to constantly re-prepare assets makes a lot more sense compared to how most applications still manage media pipelines.


Products like this can get infrastructure-heavy pretty quickly once request volume scales, especially around performance, caching, and distributed processing.


What’s been the most challenging part so far while building imgproxy?

0
Reply

Hi PH!


Marina here, co-founder & CEO of imgproxy.


One thing we’ve noticed over the years is that more and more teams want to own their infrastructure again.
Not because self-hosting is trendy, but because image and video processing becomes surprisingly expensive and limiting at scale when it’s fully outsourced to SaaS platforms.


A lot of companies start with third-party image APIs because they’re convenient early on. But later they run into:

  • unpredictable costs

  • vendor lock-in

  • limited control over caching and performance

imgproxy exists because we wanted a different tradeoff: a fast self-hosted system developers can run anywhere and fully control themselves. And honestly, seeing companies process billions of images through imgproxy every day is still surreal to us.


Huge thanks to everyone who tested v4 early and helped shape this release.

8
Reply

Hi PH! Viktor here, developer of imgproxy!

One thing we cared about from the beginning was building imgproxy as a sustainable long-term open-source project.

Why? Because the tools developers rely on most should be the ones they can actually own.

Not "own" as in pay for a license. Own as in run it yourself, on your own infrastructure, and trust it to do exactly what it claims. No hidden calls home, no lock-in, no surprises in your bill at the end of the month.

v4 is probably our most significant release in that spirit. We didn't just add features — we practically reworked the whole codebase. Cleaner architecture, better code quality, better performance. The kind of changes that don't make great screenshots but make a real difference when you're processing millions of images a day.

If imgproxy solves a problem for you, come say hi on GitHub. And if it doesn't yet, tell us why. That's how the best features get built.

7
Reply

With the new internal cache for processed images in v4, how does cache invalidation work when the source image changes? Does imgproxy check source ETags or Last-Modified headers, or is expiry purely time-based?

0
Reply

@sunnyallan imgproxy doesn't handle this. Changing content without changing the filename is generally a bad idea. Even if you manage to make imgproxy and the CDNs handle this, there's also cache in the user's browser that you can't invalidate.

1
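The advice in the reply above (change the filename whenever the content changes) is commonly implemented by embedding a content hash in the object name, so every cache layer sees a new URL. A minimal sketch, assuming a hypothetical helper named `versioned_name`; it is not part of imgproxy and only illustrates the naming idea:

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_len: int = 12) -> str:
    """Return a filename that embeds a short hash of the content.

    Hypothetical helper for illustration; it is not part of imgproxy.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    # "photos/cat.jpg" becomes "photos/cat.<digest>.jpg"
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old = versioned_name("photos/cat.jpg", b"first version of the image")
new = versioned_name("photos/cat.jpg", b"edited version of the image")
print(old)
print(new)
```

Because the two names differ, browsers, CDNs, and any processed-image cache treat the edited image as a new object, and the stale entry simply ages out.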
#12
Haystack
Review the pull requests that actually need human attention
115
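One-line summary: Haystack is an AI-powered routing tool for code review. It analyzes each GitHub PR's diff, context, agent trace, and verification evidence, then automatically triages the PR as "safe to merge", "needs fixes", or "needs human review", addressing the review overload and cognitive fatigue caused by the surge in AI-generated PRs.
Developer Tools Artificial Intelligence GitHub
Code review tool, AI code review, PR routing, developer tools, engineering efficiency, GitHub integration, agent code management, merge automation, DevOps, review queue management
Comment summary: Users broadly agree that AI-generated PRs create a signal-to-noise problem, but their concerns center on three points: the risk of rules depending on the AI's "good judgment"; whether different review rules are supported across multiple repos and for different services in a monorepo (e.g. payments vs. internal tooling); and which key signals decide the review level (safe / needs fixes / needs human review). The team replied that rules are configurable and multiple repos are supported, and gave examples of judging by change sensitivity and adequacy of verification.
AI Hot Take

Haystack hits a core pain point of the AI coding era: when every engineer can produce 20+ PRs a day, traditional line-by-line diff review turns into pure grunt work. The product shifts the review logic from "what did the code change" to "verify that the code achieved its intent, with evidence". In essence it puts a price on scarce human attention: let machines do the compliance checks machines are good at, and let people make the value judgments people are good at.

However, the key to the product's success is extremely fragile: it requires teams to define a highly explicit ruleset up front ("which changes touch a security red line", "what kind of testing counts as sufficient evidence"), which is itself high-cognitive-cost work. If the rules are too loose, the AI will learn to game them (for example, taking a UI screenshot to pass as manual verification); if they are too strict, everything falls back to the old "every PR needs a human" path. A deeper problem: the product assumes that "intent and verification evidence can be parsed automatically", but the traces produced by today's AI agents are often long and messily structured. Whether Haystack can accurately extract the chain of design decisions determines whether it is truly intelligent routing or a fancy sorting bin.

The fatal risk is that once a team gets used to relying on Haystack to auto-approve "safe" PRs, human review skills will degrade faster; over the long run this weakens the team's overall intuition for code quality. In addition, the product is essentially a patch on the GitHub PR system rather than a rethink of the collaboration flow. If GitHub or GitLab ships a similar feature natively, Haystack's room to survive will narrow quickly. In the short term it is indeed a firefighter; in the long term it must guard against becoming a "recycling bin for the plastic bottle caps of AI code": on the surface it solves the volume problem, but in practice it only crushes the plastic bottles (low-quality PRs) before sorting them.

View original info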
Haystack
Haystack helps engineering teams manage the growing volume of AI-generated pull requests. It sits on top of GitHub, analyzes each PR’s diff, codebase context, agent trace, intent, and verification evidence, then routes it: safe to move forward, needs fixes, or needs human review. Teams use Haystack to keep code moving without rubber-stamping, focusing human attention where judgment actually matters.

Hey PH! We’re building Haystack to help teams deal with the explosion in the number of pull requests that need to be reviewed due to the rise of coding agents.

Haystack replaces the GitHub PR review system with a queue that triages each PR before a human has to read any diffs. It looks at the diffs, the codebase, and the coding-agent conversation that produced the PR. Haystack then routes it into one of three buckets:

1. Safe to merge. This means the PR has enough evidence behind it that the team can merge it without another human’s review.

Some examples:

  • A small UI copy change that includes a screenshot showing the final state

  • A backend change where the author clearly tested the important paths and ran the changes in a real environment

2. Needs fixes. This means that the PR has bugs or violates a rule in your codebase and therefore the PR needs to be fixed by the author.

Some examples:

  • The agent was asked to make loading a large table faster by adding pagination, but the PR still loads every result at once and “implements” pagination in the UI

  • The PR silently catches an error instead of logging, surfacing, or handling it. This violates the team’s “no silent error swallowing” rule

3. Needs human review. This means that the PR could not be sufficiently verified by the author or is touching a sensitive part of the codebase (determined by user-input guidelines) and thus requires human review.

Some examples:

  • The PR changes a significant amount of logic in billing

  • The PR changes an important user flow like onboarding, but the author only ran unit tests and never opened the app to check the flow end-to-end. That violates the team’s rule that high-impact user-facing changes need manual verification.

Instead of starting with line-by-line diffs, Haystack immediately tells the reviewer the goal behind the PR, what design decisions the author made (informed by their coding-agent conversation), and how much the author did to verify that the pull request works (e.g. run scripts, checked the frontend, etc.).

In this way, review shifts from “what changed?” to “is this the right behavior and is there evidence that it works?”.
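The three buckets described above can be sketched as a small routing function. This is a toy illustration only: the field names and the precedence between checks are assumptions, and Haystack's real analysis (diffs, codebase context, agent trace) is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SAFE_TO_MERGE = "safe to merge"
    NEEDS_FIXES = "needs fixes"
    NEEDS_HUMAN_REVIEW = "needs human review"


@dataclass
class PRSignals:
    # Hypothetical signals; a real system would derive these from analysis.
    has_bug_or_rule_violation: bool
    touches_sensitive_area: bool
    sufficiently_verified: bool


def route(pr: PRSignals) -> Route:
    # Assumed precedence: a known defect goes back to the author first.
    if pr.has_bug_or_rule_violation:
        return Route.NEEDS_FIXES
    # Sensitive code or thin evidence needs a human reviewer.
    if pr.touches_sensitive_area or not pr.sufficiently_verified:
        return Route.NEEDS_HUMAN_REVIEW
    return Route.SAFE_TO_MERGE


# Examples mirroring the post: UI copy change with a screenshot,
# silent error swallowing, and a large change to billing logic.
copy_change = PRSignals(False, False, True)
swallowed_error = PRSignals(True, False, True)
billing_change = PRSignals(False, True, True)
print(route(copy_change).value)
print(route(swallowed_error).value)
print(route(billing_change).value)
```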

Here’s a quick demo: https://www.tella.tv/video/strea...

We previously launched Haystack as a tool for understanding large PRs (https://news.ycombinator.com/ite...). As many of you can probably relate to, the release of Opus 4.5 completely shattered our conception of how fast an engineer could craft a PR.

And as coding agents got even better from 4.5, we realized that pull requests did not scale along with our coding velocity. With each member of our team being able to pump out more than 20 pull requests a day, code review quickly became cognitively exhausting and less helpful.

After talking with other folks, we learned many feel similarly, and currently face the binary option of either not doing review at all or trying to keep up with a fire hose of pull requests.

Haystack is our attempt at a third path. We still believe in code review, but as coding agents produce more code, human reviewer attention becomes more valuable and more expensive.

Haystack helps teams spend that attention on the PRs where a human can meaningfully change the outcome of that PR. And for such PRs, Haystack shows the reviewer what the PR intended to do, whether the author showed that it works, and what design decisions need a second pair of eyes.

We’re still quite early and are figuring out whether Haystack truly makes code review better. We would love any and all feedback!

3

@akshay_subramaniam Congrats on the launch, Akshay. Very cool, but to my thinking this requires either very good judgment (not an AI strong point) or very explicit rulesets. How do you deal with that?

1

Does Haystack handle monorepo setups where different services need different review rules — e.g. stricter verification requirements for a payments service vs. internal tooling? Or is the ruleset currently global across all repos in a workspace?

0

@sunnyallan The ruleset is per repo currently. If you have a monorepo with very different levels of verification needed, you can always specify "for payments, I want the author to prove that they did X" and Haystack will not apply this bar to internal tooling.

Alternatively, you can always ask for a human review for payments, and the reviewer can easily see "oh author did X, so I'm more confident this can merge, although I might want to take a closer look at exactly what they did".

0
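The per-repo ruleset with a stricter bar for payments, as described in the reply above, could be modeled as path patterns mapped to required evidence. A minimal sketch under stated assumptions: the patterns, the evidence labels, and the function names are all hypothetical, and Haystack's actual rules are written as natural-language guidelines rather than a table like this.

```python
from fnmatch import fnmatch

# Hypothetical per-repo ruleset: (path pattern, evidence the author must show).
RULES = [
    ("services/payments/*", {"end_to_end_test"}),
    ("tools/internal/*", set()),
]
DEFAULT_REQUIRED = {"unit_tests"}  # assumed fallback for all other paths


def required_evidence(changed_path: str) -> set:
    # The first matching pattern wins; unmatched paths use the default bar.
    for pattern, required in RULES:
        if fnmatch(changed_path, pattern):
            return required
    return DEFAULT_REQUIRED


def missing_evidence(changed_paths: list, provided: set) -> set:
    # Union of what every touched path requires, minus what was shown.
    needed = set()
    for path in changed_paths:
        needed |= required_evidence(path)
    return needed - provided


print(missing_evidence(["services/payments/api/charge.py"], {"unit_tests"}))
print(missing_evidence(["tools/internal/cli.py"], set()))
```

A non-empty result would be a reason to route the PR to a human; an empty one means the stated bar for those paths was met.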

Hi. This is a real problem now - AI makes it easy to create PRs, but review time is still human.
I like the idea of routing PRs instead of treating every diff the same. What signals matter most for deciding “safe to move forward” vs “needs human review”?

0

@ihorperkovskyi That's configurable by users! But, to give some examples:
1. The PR changes a significant amount of logic in billing. This might need a senior engineer who's familiar with the billing stack to get things right.
2. The PR changes an important user flow like onboarding, but the author only ran unit tests and never opened the app to check the flow end-to-end. That violates the team’s rule that high-impact user-facing changes need manual verification.
3. The author changes an agentic workflow, but does not do any A/B testing (e.g. replaying a situation or running an eval) to prove that their changes improve the product

0

The signal-to-noise problem with PRs is real. Does it work across multiple repos or is it scoped to one at a time?

0

@rich_nashawaty This works across multiple repos!

0

Any reason as to your inspiration for naming it Haystack?

0

@othman_katim The original product was meant to help users understand the relationship of different parts of their codebase e.g. how different functions worked together to allow for user authentication. This was because pre LLMs finding code relating to a semantic concept was like finding a needle in a haystack.

I guess you could retrofit the current product with the name by saying teams' PR queues look like haystacks and the needles are the PRs they need to review!

0
#13
ShioriCode
Open-source alternative to Codex & Claude Code
109
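One-line summary: ShioriCode is a desktop app that gives coding-agent CLIs such as Claude Code and Codex project-aware thread management, an activity timeline, and a diff review interface, addressing the pain of long coding tasks whose terminal sessions break easily and whose history is hard to trace.
Open Source Developer Tools Artificial Intelligence GitHub
Comment summary: Users ask whether context and the diff stack are shared across agents (is it a launcher or an abstraction layer) and whether tool-call formats are normalized across agents. Others ask about the reasoning behind the Elastic License and how continuity is handled when an agent hits its token limit. The maker replied that continuity is maintained through the agent's native compaction.
AI Hot Take

ShioriCode addresses a real but narrow pain point: coding-agent CLIs offer almost no session management for long, multi-file tasks. It does not try to build yet another code-generation engine; it sticks to being a "glue layer" that organizes the fragmented output of multiple CLIs into a browsable, traceable interface. This positioning is smart but also risky: its value depends entirely on the performance and interface stability of the underlying agents. Once Claude Code, Codex, and the like add their own session management and visual diff views (which is almost certain), ShioriCode will be left with only one weak rationale, "integrating multiple agents". The most critical question in the comments, "launcher or abstraction layer", cuts to the heart of it: if it is just a UI shell around isolated sessions, so that switching agents means switching working memory, the collaborative value is negligible; if it can unify context and the diff stack, it has to standardize tool-call formats, which is harder than writing its own agent. The choice of the Elastic License also hints at wavering between openness and commercialization: the source is available but not free, and the tension between community contributions and commercial competitors will pull at the project repeatedly. Right now ShioriCode is more like a polished front-end terminal multiplexer; to become a true "coding collaboration layer" it needs technical depth in context persistence, knowledge transfer between agents, and diff merging that goes far beyond UI polish.

View original info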
ShioriCode
A desktop interface for the coding-agent CLIs you already use — Codex, Claude Code, Cursor, Gemini, Kimi, and a hosted Shiori provider. Run long-running sessions with project-aware threads, stream agent activity into a readable timeline, and review generated diffs without leaving the app. Source-available, built for work that doesn't fit in one prompt.
Hey Product Hunt, I'm Sami, maker of Shiori (the AI chat app). ShioriCode is a different project — a desktop interface for running coding-agent CLIs in real projects. The core problem: the agent CLIs I rely on (Codex, Claude Code, Cursor, Gemini, Kimi) are great, but my coding work doesn't fit in a single prompt. Long sessions die in the terminal, threads get crowded, diffs vanish into scrollback. ShioriCode keeps each agent run as a project-aware thread tied to a branch and workspace, streams activity into a readable timeline, and surfaces generated diffs without leaving the app. Pick whichever CLI you have authed (or use the hosted Shiori provider) and go. It's still early and source-available — sharp edges around setup, packaging, and provider support are the part your feedback would help most. github.com/shiorihq/shioricode — happy to answer anything.
2

@samihindi Hey Sami, congrats on shipping 👋

The "desktop UI over agent CLIs you already use" angle is the right framing - agent CLIs are great at execution but terrible at session continuity once a task spans more than one coffee.

One question on the multi-provider story: when I switch between Claude Code and Codex inside the same ShioriCode project thread, what's shared and what isn't? Do they see the same diff stack and conversation history (re-injected as context), or is each provider its own isolated session that just happens to live in the same UI shell? Asking because the answer determines whether ShioriCode is a "launcher" or a real abstraction layer - very different products with very different moats.

1

Hey Sami, went through ShioriCode's page and the "long sessions die in the terminal" line basically described my last week. one thing I wanted to ask, when you unify Claude Code, Codex, and Gemini in one workspace, are you normalizing the tool-call format under the hood or just rendering each one's native output? that abstraction call feels like make-or-break.

0
The Elastic License is an interesting choice: https://github.com/shiorihq/Shio... Can you explain your thinking there?
0

The project-aware threads plus streaming diff review solves a real problem. Agent CLIs have terrible UX for work that spans hours across multiple files. Building RetainSure, we kept hitting context window limits in long-running sessions and had to externalize state manually. How does ShioriCode handle continuity when an underlying agent reaches its token limit mid-session?

0

@retain_dev We trigger the agent-native compaction once the token limit is reached :)

0
#14
Starchild-1 by Odyssey
The first real-time multimodal world model
101
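One-line summary: Starchild-1 is the world's first real-time multimodal world model. It generates audio and video in sync and responds to user input in real time, aiming to provide immersive interactive experiences for gaming, robotics, education, and similar scenarios, and addressing the inability of traditional AI models to simulate real-world dynamics in real time across modalities.
Robots Education Artificial Intelligence
Real-time multimodal, world model, synchronized audio-video generation, streaming interaction, interactive AI, game engine, robot simulation, edtech, immersive simulation, cognitive computing
Comment summary: Users mainly praise the breakthrough of extending world models from offline visuals to real-time audio-video interaction, and point to continuous streaming input and causal audio-video rollout as the key value. A designer suggests pairing it with a more intuitive interface strategy to lower the interaction barrier, reflecting concern for how the user experience lands.
AI Hot Take

Starchild-1's positioning as "the first real-time multimodal world model" does hit a key blind spot of today's large models: the vast majority still respond with text or static images, and even those that generate video mostly do so offline in one shot. What it does is use causal joint audio-video generation to move AI from "describing a picture" to "listening, watching, and conversing at the same time", which provides underlying infrastructure for intelligent game NPC interaction, embodied robot simulation, and even dynamic educational content generation.

But some cold water is needed: a cold start of 101 votes shows this is still a very early product, and the technical report will likely reveal that its real-time performance depends on very high compute or scenario-specific pruning. Three mountains still stand between it and "commercial-grade usability": GPU memory, latency, and content consistency. The "interface strategy" raised by the designer in the comments exposes another risk: overemphasizing technical innovation while neglecting a redesign of the interaction paradigm easily leaves users in the awkward position of "it's impressive, but I don't know how to use it".

Its real value lies not in being "more realistic" but in "simulating causality at lower cost", for example testing audio-visual feedback loops in robot training without a physical robot. But if it cannot quickly lower the integration barrier and open APIs to game engines and robotics middleware, it may well end up as yet another lab exhibit that is "technically dazzling, rarely deployed". The Odyssey team needs to prove with concrete industry partnerships (such as integrating with Unity or connecting to ROS2) that this is not just a demo but a tool.

View original info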
Starchild-1 by Odyssey
Starchild-1 is the first real-time multimodal world model that generates synchronized audio + video while responding live to user input. Built for interactive AI, gaming, robotics, education, and beyond, bringing us closer to truly immersive world intelligence.

Really exciting to see Starchild-1 pushing world models beyond just visuals into real-time synchronized audio + video generation.

What stands out:

  • Real-time multimodal interaction instead of fixed offline clips

  • Continuous response to streaming user inputs

  • Audio-video causal rollout for more immersive simulations

  • Potential across gaming, robotics, education, healthcare & more

A big step toward more natural, interactive AI systems grounded in how the real world evolves.

Read more here: https://odyssey.ml/introducing-starchild-1

Technical report: https://starchild.odyssey.ml/starchild-1.pdf

P.S. I hunt the latest and greatest launches in tech, SaaS and AI, follow to be notified @rohanrecommends

2

@rohanrecommends Real-time multimodal AI systems like Starchild-1 need an elite interface strategy to make continuous streaming user inputs feel completely intuitive.

As a Senior AI Product Designer, I love untangling these complex visual frameworks. Let's connect right here to talk shop! ⚡🚀

0
#15
LearnHouse
The modern way to teach what you build
98
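One-line summary: LearnHouse is an open-source learning platform built for developers. It helps you quickly create interactive courses and embed them inside your product, addressing the problems of users struggling to understand the product and lacking motivation to learn, so they can "learn while using".
Open Source Education GitHub Online Learning
Open-source learning platform, in-product education, code execution, AI teaching assistant, interactive simulations, community discussions, self-hosting, SaaS alternative, developer tools, user activation
Comment summary: Users broadly endorse the "self-host plus cloud" model for protecting data ownership. Core concerns center on: 1) how well the AI handles formulas and domain-specific content; 2) the cost and architecture of sandboxed multi-language code execution at scale (unified container vs. independent sandboxes); 3) a wish to tie courses dynamically to product version changes, enabling "living documentation" style continuous education and keeping course content from going stale.
AI Hot Take

LearnHouse neatly seizes the gap of "software as teaching": most teams either rely on rigid knowledge bases or are held hostage by bloated SaaS LMSs. The highlight of v1.0 is not the number of features but that it integrates code execution, AI Q&A, and interactive simulations, capabilities that originally belonged to developer tools, into teaching scenarios in a way developers know (open source, self-hosting, API). That feels more modern than traditional projects like Moodle and more controllable than commercial platforms like Teachable.

But the problems are also obvious. First, the AI's "context awareness" is currently mostly retrieval within course content; only by extending it to product API changes, version differences, and user behavior traces can learning truly come "alive", and that requires very strong engineering integration. Second, multi-language code execution is a double-edged sword: it can be a "killer feature", but the complexity of sandbox isolation and the cost of serving it could also drag down the experience at scale. Someone in the comments has already questioned the design trade-off between a unified container and independent sandboxes; this is foundational technical bookkeeping that cannot be avoided. Third, the product is evolving very fast (19 languages, 200+ fixes), but there is currently no data showing whether its core teaching outcomes, user completion and knowledge retention rates, beat the combination of video plus docs. This may be the blind spot early teams most easily overlook: shipping many features while lacking focus on "which features actually drive the learning loop".

Overall, LearnHouse is a "weapons-grade" tool aimed at developers, but it has not yet moved beyond being a tool, and it is still some way from a "learning platform with its own ecosystem" (course marketplace, instructor certification, and so on). Its real breakout opportunity lies in going deep on product-led growth scenarios and becoming the infrastructure that dynamically binds product actions such as API releases and version releases to learning content, rather than another static course editor.

View original info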
LearnHouse
LearnHouse is an open-source learning platform to teach your users everything about your product. Built for builders who ship great things but need their users to actually understand them. v1.0 is our first stable release - packed with AI, code execution, discussions, analytics, and everything you'd expect from modern developer tooling. Self-host it or use our cloud. Either way, your content, your data, your community.
Hey Product Hunt! 👋 We're the team behind LearnHouse - and today we're shipping v1.0, our first stable release.

Most LMS platforms are either locked-down SaaS that owns your data, or open-source projects that look like they were built in 2010. We wanted something that felt like modern developer tooling - beautiful, fast, and actually yours.

LearnHouse is built for two kinds of people: builders who need to teach their users about what they've made, and educators who want a platform that doesn't get in the way.

What's new in v1.0:

  • 🧠 Context-aware AI for learners and educators

  • 💻 Code execution across 30+ languages with auto-grading

  • 🧊 Playgrounds - AI-generated interactive simulations

  • 💬 Discussions built into every course

  • 🎙️ Podcasts and audio learning

  • 📊 Real engagement analytics

  • 📋 Collaborative real-time Boards

  • 🔒 SSO, SAML, Multi-tenancy, Audit Logs

  • 🌍 19+ languages and 200+ fixes

Self-host for free under AGPL, or use our cloud.

We're a small team that's been heads down on this for a long time. We'd love to know - what would make this the LMS you've always wanted? Drop a comment 👇
6

This feels especially useful for product-led teams where education is part of activation, not just “support docs.”

One feature I’d love in an LMS like this: tying lessons to product moments/version changes. Example: when a feature changes, show which lessons mention it, which users completed the old lesson, and what needs a small update vs. a full rewrite. That would make courses feel like living enablement instead of another stale content surface.

1

Congrats on the v1.0 launch! The "self-host or use cloud, your content stays yours" stance is exactly what makes this kind of platform actually usable for serious course creators. I teach Excel for financial modeling on Udemy (https://www.udemy.com/course/exc...) and the biggest pain point with most LMS tools is exactly what you're solving — the ability to embed code execution and AI Q&A grounded in actual course material. That's the gap between "video library" and "students actually finish and apply this." Curious how your AI handles formula-heavy or domain-specific content vs. general explanations.

0

Hey Badr, was reading through LearnHouse and shipping v1.0 after a long heads-down build is a respect-earning ship. one thing I wanted to ask, on the in-browser code execution across 30+ languages, is the runtime per-language sandboxed or you've got a unified container approach? that piece is usually the most expensive thing to scale on a learning platform.

0
#16
AutoShelf
Auto-organize files on your Mac
95
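One-line summary: AutoShelf is a macOS menu bar tool for automatic file organization. Using preset rules, it cleans up cluttered areas such as the Downloads folder in one click, sparing users the tedium of sorting files by hand.
Mac Productivity Menu Bar Apps
File organization, macOS tool, automation, menu bar app, rules engine, image optimization, one-time purchase, productivity tool, file management
Comment summary: Users praise its menu bar monitoring and template-based rule design. Core doubts include: how conflicts between multiple rules are resolved (by creation order or by priority), whether separate rulesets per folder are supported, and whether nested subfolders can be handled (currently top-level files only). Some users also ask whether cloud drives such as Dropbox and iCloud are supported.
AI Hot Take

AutoShelf is a precisely positioned but still somewhat "green" productivity tool. Its core value is turning file organization, a high-frequency, low-value chore, into an automated "set the rule once, then leave it running" flow, which hits the common digital-era pain point that "the Downloads folder is a bottomless pit". The one-time purchase price of $19.99 looks restrained and sincere next to comparable SaaS subscription products, and lowers the user's decision barrier.

However, judging from the comments, the product's current feature completeness has clear gaps: no intuitive mechanism for resolving rule conflicts, no way to customize separate rules per folder, and no support for nested directories. These are not nice-to-haves but basic necessities for file organization. The developer shows a willingness to iterate in the replies, but if these key features are not filled in soon, AutoShelf may quickly go from "tool worth trying" to "abandoned half-finished product".

The product's real moat is not the concept of "auto-organizing" but the widening of its boundaries: support for multi-condition chained actions, automatic file format conversion, and even integration with cloud storage and RSS downloads. If this planned roadmap lands smoothly, AutoShelf will evolve from a mere "file janitor" into a "full-lifecycle file steward". But if it stops at the shallow needs of "top-level directories" and "single rules", it will ultimately struggle to avoid being replaced by established tools such as Hazel and Dropzone or by macOS's built-in Shortcuts. In a word: the direction is right, but it needs to run faster.

View original info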
AutoShelf
AutoShelf is a macOS app that watches your folders and auto-organizes files. Set a rule once and never think about it again. Free to try, unlock unlimited for $19.99.
Hey Product Hunt! 👋 I'm Orçun, the solo developer behind AutoShelf.

I built AutoShelf because I got tired of my Downloads folder being a graveyard of DMGs, screenshots, and random files I'd never clean up. Existing solutions were either too complex, too expensive, or felt like they were built in another decade. I wanted something I could set up in 30 seconds and never think about again.

AutoShelf is that thing. It lives in your menu bar, watches your folders, and follows your rules. Templates get you started in one click. Multi-condition rules handle the complex stuff. And it's a one-time purchase, no subscriptions.

Why AutoShelf:

  • Radically simple onboarding: templates get you running in seconds

  • Modern native interface built for today's macOS

  • Image optimization built-in

  • One-time $19.99, no subscription

What's coming next:

  • Multi-actions: chain move + rename + tag + archive in a single rule

  • Rule creation wizard: step-by-step guided flow, no guesswork

  • File conversions: auto-convert HEIC→JPG, MOV→MP4, WAV→MP3, and more

  • Cloud storage uploads: route files to iCloud, Dropbox, Google Drive

  • Auto downloads: fetch files from RSS feeds, URLs, or cloud storage on a schedule

  • Torrent downloads: fetch torrent files from RSS feeds and auto download & organize

Have an idea for a feature? I'd love to hear it → https://useautoshelf.com/support...

Happy to answer any questions! Would love your feedback.
0

Hey Orçun, was on AutoShelf's page just now and the menu-bar-watcher approach to file org is what pulled me in honestly. one thing on my mind, when two rules conflict on the same file, what wins, is it order of rule creation or a priority system? in my Downloads folder almost every file matches three rules at once.

0

Always think if I'm bad with Mac files or if it's a disaster haha. This is a great idea to manage everything as you want.

Does it work with cloud drives like Dropbox/iCloud too? BTW, congrats on the launch.

0

Does AutoShelf support different rules per watched folder, or is it one global ruleset? My use case is Figma exports landing in Downloads alongside PDF invoices — ideally I’d route each file type to a completely different destination without them conflicting.

0

Folder chaos on my Mac has been a constant battle. Does it handle nested folder structures or just top-level organization?

0

@rich_nashawaty Good question! Right now AutoShelf only handles top-level files within watched folders, so files inside subfolders aren't picked up. Nested folder support is definitely something I'm tracking though. I'm actively collecting feature requests and will be working on them for the upcoming version in the coming weeks. I'll share updates as things progress on my X account. LMK if you have any other feature requests. Thanks for asking!

0
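The difference between top-level and nested handling mentioned in the reply above comes down to how the watched folder is enumerated. A minimal sketch using Python's standard `pathlib`; the function names are illustrative and say nothing about how AutoShelf itself is implemented:

```python
import tempfile
from pathlib import Path


def top_level_files(folder: Path) -> list:
    # Only files directly inside the watched folder.
    return sorted(p for p in folder.iterdir() if p.is_file())


def all_files(folder: Path) -> list:
    # Files at any depth, which is what nested-folder support would need.
    return sorted(p for p in folder.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "invoice.pdf").write_text("x")
    (root / "exports").mkdir()
    (root / "exports" / "logo.png").write_text("x")

    top = [p.name for p in top_level_files(root)]
    nested = [p.relative_to(root).as_posix() for p in all_files(root)]

print(top)
print(nested)
```

With the top-level approach, `exports/logo.png` is never seen by any rule, which matches the behavior the developer describes.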

Congrats man, been needing something like this lol my downloads folder is cooked

0

@sezerufukyavuz Lol I know the pain. That was literally the reason I built this haha. Hope it saves your downloads folder! Let me know if you run into anything or have any feature ideas.

0
#17
Hanami
A daily meditation with Japanese art
94
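One-line summary: Hanami is a minimalist, meditation-style app that delivers one masterwork of Japanese art each morning (with voice narration and written context). It helps users build a habit of brief, focused, unhurried cultural immersion amid a busy day, filling the experience gap between "fragmented-attention anxiety" and "the high barrier to fine art".
iOS Art Education
Meditative art, Japanese art, daily practice, minimalist design, cultural education, public-domain artworks, human-voiced narration, no AI generation, iOS only, solo developer
Comment summary: Users focus on the "no AI-generated art" stance and the hard work of establishing attribution for authentic works. The developer replied that the works come from open-access collections such as The Met and the Art Institute of Chicago, with cultural context cross-researched by hand. Another user suggested offering text as an alternative to audio; the developer confirmed that the app already includes written introductions and vocabulary notes.
AI Hot Take

Hanami's real value is not that it is an "art delivery app", but that it redefines the logic of spiritual consumption in the mobile era in an extremely restrained way.

In a red ocean of apps fighting for user attention, Jun goes the other way: give the user just one minute, then "leave you alone". This "intentional scarcity" is a design philosophy deeper than the content itself. It suggests that a high-quality aesthetic experience does not need infinite scroll; it needs the essence distilled into a limited time. The move shows both psychological acuity (a morning ritual helps habits form) and a clear-eyed consumer ethic: no ads, no tracking, no AI generation, turning "quiet" itself into a quantifiable differentiator.

But viewed soberly, the product still has real weaknesses. First, content depth relies on the developer's personal research, so it is doubtful whether professionalism and scholarly rigor can be sustained long term. Second, it is currently iOS only with no social or community mechanism, so user growth depends heavily on word of mouth and App Store placement, which puts commercial sustainability at risk. Audio plus written notes meet most needs, but for users who really want to study a particular school or technique in depth, the current "one work a day" sampling feels thin.

If Hanami can use the "daily art practice" as an entry point into a light art-learning subscription, online curator talks, or an open curator submission mechanism, it could transform from a "minimalist app" into "aesthetic education infrastructure". For now it is more like an attractive, opinionated minimum viable product, and it still has an unavoidable stretch of making a living to get through before it "changes how people approach art".

View original info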
Hanami
Hanami pairs you with one Japanese masterwork every morning from a thousand years of Japanese art history. The collection grows weekly. The whole experience is built to feel like a private morning visit to a museum — slow, intentional, no clutter. Each work comes with curator-voiced audio narration and quiet editorial context. Curated journeys move through movements: The Floating World (ukiyo-e), The Rinpa School, Kabuki Theater, Flowers Birds Stillness. iOS only. Built solo from Toronto.
Hi Product Hunt 👋 I'm Jun. I built Hanami solo over the past year from Toronto.

You open it in the morning, spend a quiet minute with one Japanese painting (a Hokusai woodblock, a Hiroshige rain scene, a Yoshitoshi moon, a Jakuchū rooster), listen to a narrated note that opens up the work, its vocabulary, its history, the world it came from, and close it.

The collection runs from Muromachi ink masters through Edo ukiyo-e into the Rinpa school and the Meiji-era prints of Yoshitoshi. I source from public museum archives and write the editorial notes myself, with help from art historians where I can get it. New works added weekly.

A few things that mattered to me while building it:

  • No ads, no tracking, no sold data

  • No AI-generated art: every work is real, by a real artist, attributed

  • The audio narration is human-written, not generated copy

  • The whole thing is designed to be quiet: to take a minute of your morning, then leave you alone

Hanami went live in the App Store this week and I'd love feedback. Happy to answer questions about the curation, the artists, the design, or why I think slow apps deserve more space.
3

Hey Jun, spent a bit of time on Hanami's page and the "no AI-generated art" stance is what made me look closer. one thing I wanted to ask, how do you source attribution for thousand-year-old works, is it museum partnerships or public-domain curation with manual research? attribution at that depth feels like the hidden labor in the whole product.

0

@axlerodd Hey, appreciate you checking it out. The artworks come from open-access museum collections (The Met, Art Institute of Chicago etc) where the scholarly attribution has been done by curators.

For each piece, I research the cultural context, historical notes, and Japanese vocabulary — cross-referencing museum records and art-history sources. Slow work, but the whole point is that someone actually thought about each piece.

This started as a way to reconnect with my Japanese heritage so building it has been as much for me as anyone using it.

0

Really nice! It would be great to also have text instead of the descriptions being audio-only :)

0

@jozzire_lyngdoh Thanks! There's actually text too — each artwork has historical context, cultural background, and Japanese vocabulary notes alongside the audio narration. The text sits below the artwork in the detail view. Audio is just one way in. Appreciate the feedback!

1
#18
Papr Graph
Upgrade to graph-native vector embeddings
93
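One-line summary: Papr Graph turns semantic embeddings into graph-native embeddings with a single API call, addressing the pain point that vector retrieval for AI agents weighs only semantic similarity and ignores correctness in complex scenarios such as multi-hop queries and relational data.
API Developer Tools Artificial Intelligence
Graph-native vector embeddings, AI agent retrieval, vector database, semantic search, graph database, multi-hop reasoning, knowledge graph, retrieval-augmented generation, word embedding optimization, MTEB
Comment summary: Founder Amir says the model is not the problem, retrieval is the weak link, and graph-native embeddings can encode signals such as time and topic to improve correctness. A user questions whether this is really just poor semantic structuring, and asks how understanding the context of different verticals can be automated.
AI Hot Take

At its core, Papr Graph builds a graph-structured bridge between vector retrieval's "semantically close" and "semantically correct". It does not try to overturn existing embedding models or vector databases; it inserts "graph structure signals" into the retrieval pipeline as a lightweight, model-agnostic plugin. This pragmatic design logic deserves credit.

However, the "aspirin" example in the product demo reveals a fundamental tension: when semantic similarity conflicts with factual correctness, Papr Graph's so-called "graph-native embedding" relies on the user manually defining and encoding signals such as "topic, time, intent". This essentially shifts part of the responsibility for the "correct answer" in retrieval onto the user: you first have to say what structure you want before you get structured correctness. As for the question in the comments, "how do you automate understanding the context of each vertical", the founder has not yet responded, and this is the biggest gap between demo and production.

Also, the 5-20% MTEB improvement looks good, but note that it applies only to specific tasks (coding, scifact, finance). For general retrieval scenarios, especially unstructured, noisy long-tail data, injecting these graph signals may amount to overfitting. Meanwhile, the flip side of "model-agnostic" is that Papr Graph needs your existing embeddings to already be of good quality in semantic space: if the underlying vectors are garbage, no amount of added graph signal will make them more than decoration.

Overall, Papr Graph solves a real but narrow problem: helping AI agents, on top of existing vector retrieval, understand more precisely "when, where, and under what relationship" a given result should be returned. But its ceiling is also obvious: it is not a standalone knowledge graph engine but a "reinforcing patch" for vector retrieval. Whether it can become the next piece of infrastructure depends on how cheaply and how automatically it can let users forget the premise of "which signals do I have to define by hand".

View original info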
Papr Graph
Papr Graph transforms semantic embeddings into graph-native embeddings with one API call. It encodes temporal, topical, and other dimensions within any embedding, helping agents retrieve answers based on correctness, not just semantic closeness.

Hello everyone. I’m Amir, founder of Papr.

We built Papr Graph after seeing AI agents fail in production. The model wasn't the problem — retrieval was. Multi-hop questions, versioned policies, relational data — flat vector search breaks on all of it.

Vector search ranks by semantic closeness. But closeness ≠ correctness. A doc saying "aspirin reduces heart attack risk" and one saying "aspirin causes stomach bleeding" rank nearly identical — they're both about aspirin. For an agent making a recommendation, that's the difference between helpful and harmful.

Papr Graph is a graph-native embedding that sits between your existing embeddings and your agent. It encodes structured signals — topic, time, intent, entities, anything you define — directly into your embedding, so ranking reflects meaning in context, not just surface similarity. It's model-agnostic, works with whatever embeddings you're already using.
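To make the "closeness ≠ correctness" point concrete, here is a toy sketch of one way a structured signal can be folded into a vector so that ranking reflects it. Everything here is an assumption for illustration: the three-dimensional vectors are made up, the `INTENTS` labels and `augment` helper are hypothetical, and appending a weighted one-hot block is a generic technique, not Papr Graph's actual encoding.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


INTENTS = ["benefit", "risk"]  # hypothetical structured signal


def augment(embedding, intent, weight=1.0):
    # Append a weighted one-hot block for the intent label.
    one_hot = [weight if intent == name else 0.0 for name in INTENTS]
    return list(embedding) + one_hot


# Made-up semantic vectors: both documents are "about aspirin".
query = [0.9, 0.1, 0.3]        # user asks about benefits
doc_benefit = [0.88, 0.12, 0.31]
doc_risk = [0.9, 0.1, 0.29]

plain_benefit = cosine(query, doc_benefit)
plain_risk = cosine(query, doc_risk)

aug_query = augment(query, "benefit")
aug_benefit = cosine(aug_query, augment(doc_benefit, "benefit"))
aug_risk = cosine(aug_query, augment(doc_risk, "risk"))

print(round(plain_benefit, 3), round(plain_risk, 3))
print(round(aug_benefit, 3), round(aug_risk, 3))
```

On the plain vectors the two documents are nearly indistinguishable; once the intent block is appended, the document whose signal matches the query scores clearly higher. The `weight` parameter controls how strongly the signal overrides raw semantic similarity.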

We saw Papr Graph improve existing embeddings on MTEB (coding, scifact, finance tasks) by 5-20%. On Stanford STaRK (MAG synthesized 10% dataset), Papr Graph leads retrieval models with 92% hit@5 accuracy.
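For readers unfamiliar with the metric, hit@5 is usually defined as the fraction of queries for which at least one relevant item appears in the top five results. A minimal sketch of that standard definition follows; it assumes the benchmark uses the same convention, and the document ids are invented for the example.

```python
def hit_at_k(ranked_ids, relevant_ids, k=5):
    # True if any relevant id appears among the top-k ranked results.
    return any(doc_id in relevant_ids for doc_id in ranked_ids[:k])


def mean_hit_at_k(results, k=5):
    # results: list of (ranked_ids, relevant_ids) pairs, one per query.
    hits = sum(hit_at_k(ranked, relevant, k) for ranked, relevant in results)
    return hits / len(results)


# Invented example: the first query hits within the top 5, the second does not.
results = [
    (["d3", "d7", "d1", "d9", "d2", "d4"], {"d9"}),
    (["d5", "d6", "d8", "d0", "d3", "d1"], {"d1"}),
]
score = mean_hit_at_k(results, k=5)
print(score)
```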

Getting started is free. Keep your existing stack. Add our plugin. Drop graph-native ranking into your current retrieval flow with one API call.

3

@amirkabbara Hi Amir, congrats on the launch. Isn't this more an issue of poor semantic structuring at the embed stage? Also, how do you automate the process of understanding the context of each vertical?

2
#19
Thinnest AI
Build Voice AI Agents in 100+ languages for ₹1.5/min
89
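One-line summary: Thinnest AI is programmable voice AI infrastructure designed for businesses that need to build per-minute-billed AI phone agents for support and sales in 100+ local languages, addressing the pain points of existing platforms: poor language support, inflexible billing currency, and vendor lock-in.
SaaS Artificial Intelligence Virtual Assistants
Voice AI infrastructure, multilingual support, phone agents, BYOK (bring your own key), low-code/no-code, RAG knowledge base, MCP integration, Indian market, low-cost billing, Twilio/SIP integration
Comment summary: Users focus on two points: whether the product is limited to the Indian market or can be localized to other countries; and whether the MCP server makes it a "voice product" or "voice infrastructure" (that is, whether it exposes flows to external clients or acts as a client consuming external tools). In addition, a user asked about real-time performance in Hindi-English code-mixing scenarios, and the founder responded with a specific technology choice (Sarvam Saaras v3). Overall, the comments revolve around localization capability and how open the underlying architecture is.
AI Hot Take

Thinnest AI is positioned very precisely: it is not competing with US players (such as Retell and Vapi) for the global API market, but targeting a long-neglected segment with hard demand, enterprise voice agents localized for India. Its core value is not "handling 100 languages" (many products claim that) but "opening up the infrastructure layer completely". BYOK, bring-your-own SIP, bring-your-own STT/TTS: for large banks, BPOs, and insurers this is a powerful draw, since they do not want to be locked into a single model vendor and also want to keep their own compliant billing, phone numbers, and API quotas. The pricing of ₹1.5/min (about RMB 0.12) is smart: it turns "voice AI" from an expensive black box priced in US dollars into a cost line that can be easily signed off at local Indian rates. In effect this is educating a price-sensitive long-tail market.

But the challenges are equally clear. Comments so far concentrate on the "MCP server boundary" and "language localization", which reflects developers' worry about whether Thinnest is a "platform" or a "tool". If it is just a "box" wrapping a low-code editor and RAG, then it is essentially using customization to take business from traditional IVR integrators rather than truly becoming the infrastructure of a new paradigm. The key is whether, through MCP and open SDKs, it lets users not only "build" voice agents but also "deconstruct" and "embed" voice capabilities into their own systems (such as CRM, ticketing systems, and payment callbacks). In addition, compliance in the Indian market, DLT registration, and the fragmentation of SIP interconnect protocols across carriers are landmines trickier than model capability. Thinnest's current trump card is using Sarvam Saaras v3 to handle mixed Hinglish speech, but whether it can guarantee the latency and stability of different TTS/STT configurations on real calls as the user base grows still needs the test of time.

In one sentence: Thinnest AI is not a simple "voice API" but a "localized infrastructure integration" push into India's enterprise voice AI market. The setup is right, but being able to get through on every carrier's phone network is the real moat.

View original info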
Thinnest AI
ThinnestAI is programmable voice AI infrastructure for building human-like AI phone agents at scale. → 100+ languages with seamless multilingual conversations. → Bring your own LLM, STT, TTS, and telephony providers. → Native SIP, Twilio, and workflow integrations. → No-code flow builder with RAG knowledge bases and MCP support. → Real-time voice agents for support, sales, onboarding, and operations. Start building AI voice agents in minutes — no credit card required.
Hey Hunters 👋 I'm Ashutosh, founder of ThinnestAI.

Six months ago I tried building a Hindi voice agent for a friend's NBFC. The platforms I tried either spoke broken Hindi, billed in USD with FX markup, or locked me into one model provider with no way to use the API keys I already paid for. So we built the missing layer.

ThinnestAI is voice AI infrastructure:

  • Flat ₹1.5/min platform fee, INR billing

  • 100+ Indian languages: native STT + TTS pipelines per language, not English-with-translation

  • Optional BYOK on OpenAI / Anthropic / Google / Deepgram / Sarvam / Cartesia / ElevenLabs / AssemblyAI: your keys, your provider invoice

  • Native Twilio + Vobiz SIP: your trunk, your number, your DLT registration

  • No-code flow editor, RAG knowledge bases, MCP server, REST + Python + JS SDKs

  • 100+ Tools and Integrations

Target: product, support, and growth teams at Indian banks, NBFCs, insurers, BPOs, edtech, healthtech, D2C, and logistics.

Free trial: 50 voice minutes + 200 chat messages, no card.

Three things I'd love feedback on:
1. Pricing: is ₹1.5/min the right shape, or would teams prefer credit packs / annual prepay?
2. Language gaps: comment your language + use case and I'll prioritize the roadmap.
3. Missing integrations: we ship Razorpay, Slack, Notion, Google Sheets, n8n today. What would unblock you?

Happy to extend trial credits for Hunters who share a use case. Reply with what you're building.

Ashutosh
5

@ashutosh_thinnest Congrats on the launch Ashutosh. Interesting tool. Could this be localized into different languages/countries or is your Tam just India?

1

@ashutosh_thinnest 

Hey Ashutosh, congrats on launching 👋

The BYOK + MCP combo is the right call - most voice infra locks you into one provider. Curious: does the MCP server expose flows to external clients, or is the agent acting as MCP client consuming external tools? That's the line between "voice product" and "voice infra" for me.

2

Will be watching this. We use a voice agent and will need to have it in other languages. Congrats on your launch!

0

@midori_verity Thanks Midori, really appreciate it!

0

Hey Ashutosh, went through Thinnest's site and the broken-Hindi-voice-agent story honestly hit home. One thing I wanted to ask: with BYO STT/TTS across 100+ languages, are there benchmarks on which Hindi-English code-mix setups hold up in production? Indian users switch mid-sentence and that's usually where these break.

0

@axlerodd 
Good question — and you're testing the right thing. Mid-sentence switching is exactly where code-mix lives or dies, and it's a far better stress test than monolingual accuracy numbers.

The way we've set this up: for Hindi-English agents we run Sarvam Saaras v3, and that's a deliberate choice, not a default. Saaras v3 is trained code-mix-first - a million hours of Hinglish in the training set and built to hold word boundaries across the switch instead of forcing an utterance into a single language. That's the part that matters for the failure mode you named. A language-locked Deepgram or Whisper config, however carefully tuned, tends to break right at the switch points, because the model is being asked to work outside the distribution it was trained on.
On published code-mix benchmarks specifically — that's a genuinely under-measured corner of the space, happy to go deeper if useful.
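
The routing idea in this reply could be sketched roughly as follows. This is not ThinnestAI's API: `SttChoice`, `pick_stt`, and every provider and model string below are hypothetical placeholders. Only the decision to use a code-mix-trained model for Hindi-English agents, rather than a language-locked configuration, comes from the reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttChoice:
    provider: str          # placeholder label, not a real SDK identifier
    model: str             # placeholder label, not a real model ID
    language_locked: bool  # True if the recognizer is pinned to one language


def pick_stt(agent_languages: set[str]) -> SttChoice:
    """Pick a speech-to-text setup from an agent's language codes."""
    langs = {code.lower() for code in agent_languages}

    # Hindi + English together implies mid-sentence switching, so prefer a
    # model trained on code-mixed speech and do not pin a single language.
    if {"hi", "en"} <= langs:
        return SttChoice("sarvam", "saaras-v3", language_locked=False)

    # A single language can safely use a language-locked configuration.
    if len(langs) == 1:
        (lang,) = langs
        return SttChoice("generic", f"mono-{lang}", language_locked=True)

    # Any other mix falls back to an unpinned multilingual recognizer.
    return SttChoice("generic", "multilingual", language_locked=False)


if __name__ == "__main__":
    print(pick_stt({"hi", "en"}))
    print(pick_stt({"en"}))
```

The point of the sketch is the branch order: the code-mix case is checked before any single-language shortcut, so a Hinglish agent never ends up on a locked configuration.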

0
#20
calog.cc
Chat-based calorie tracker that actually knows desi food
87
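One-line summary: calog.cc is an AI chat-based calorie tracker built to solve the problem that mainstream apps cannot accurately log South Asian food (such as roti and qeema). Users just type what they ate or take a photo to get accurate calorie and macronutrient data.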
Health & Fitness Productivity Artificial Intelligence
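AI calorie tracking, South Asian food, food logging, health management, chat-based input, photo recognition, macronutrients, progressive web app, fat loss, free tool
User comment summary: Users praise it for solving the pain point that South Asian food is hard to track. One user asked how to download it; the developer replied that it is a PWA that installs from the browser rather than through a separate download. Another user suggested adding low-fat cooking tips; the developer said advice is already available by asking in chat and that proactive tips will go on the roadmap.
AI Sharp Take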

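The value of calog.cc lies not in being "yet another calorie counter" but in precisely targeting a niche that mainstream fitness apps have long ignored: South Asian food culture. Its "AI chat-based logging" essentially lowers the usage barrier created by cultural differences: users do not need to learn "one naan = how many grams of carbs"; they just say "I ate two rotis" and the AI parses it automatically. This "say it and get it" experience is far more efficient than manually fishing through a database.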

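However, the product is still at an early stage (only about 33 real users), and its core competitive edge, "the accuracy of the AI's understanding of South Asian food", has not been validated at scale. If a user enters "Chicken Karahi" and "Daal Chawal" in succession and the AI's estimates deviate too far from reality, trust will collapse quickly. In addition, while the PWA form lowers customer acquisition cost, it also means compromises on native phone features (such as health data sync and push notifications).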

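The real "edge" is that calog.cc has not tried to build a big, all-encompassing global database; instead it uses an AI model to understand the food of one specific cultural sphere. If this "small entry point, deep vertical" approach can keep improving model accuracy through user feedback, it could grow from "an interesting little tool" into "a must-have product for a specific group". But if the AI capability merely calls a general-purpose large model to do keyword mapping, with no optimization for the large variation in oil and spices in South Asian cooking, it remains just a toy that "looks like a fit".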

View original information
calog.cc
Most calorie apps fail with desi food. Search "roti" — wrong results. Qeema doesn't exist. Chai calories are way off. calog.cc fixes this. Type what you ate or snap a photo — AI estimates calories, protein and macros instantly. No food database. No forms. Just chat. Pre-loaded with Pakistani food. Tracks workouts too. Weekly fat loss chart to stay consistent. Free. No credit card. Works on mobile. Try without signing up https://calog.cc/try
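
How a macro estimate maps to calories can be sketched with the standard Atwater general factors (roughly 4 kcal per gram of protein, 4 per gram of carbohydrate, 9 per gram of fat). This is not calog.cc's implementation: the `MacroEstimate` type and `day_total` helper are hypothetical, and the gram values in the usage lines are made-up placeholders, not real nutrition data for any dish.

```python
from dataclasses import dataclass

# Atwater general factors, in kcal per gram.
KCAL_PER_G_PROTEIN = 4
KCAL_PER_G_CARBS = 4
KCAL_PER_G_FAT = 9


@dataclass
class MacroEstimate:
    """One logged entry, e.g. the estimate returned for a chat message."""
    protein_g: float
    carbs_g: float
    fat_g: float

    def kcal(self) -> float:
        # Energy is the factor-weighted sum of the three macronutrients.
        return (
            self.protein_g * KCAL_PER_G_PROTEIN
            + self.carbs_g * KCAL_PER_G_CARBS
            + self.fat_g * KCAL_PER_G_FAT
        )


def day_total(entries: list[MacroEstimate]) -> float:
    """Sum the estimated calories across a day's logged entries."""
    return sum(entry.kcal() for entry in entries)


if __name__ == "__main__":
    # Placeholder gram values for illustration only.
    log = [MacroEstimate(10, 20, 5), MacroEstimate(0, 0, 10)]
    print(day_total(log))
```

Because fat carries more than twice the energy per gram of the other macros, a small error in the estimated oil or ghee content moves the total more than the same error in carbs or protein.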
Hey PH! 👋 Zair here, maker of calog.cc. I built this for myself. I'm Pakistani, working from home, trying to lose fat — and every calorie app I tried had no idea what I was eating. Roti, qeema, daal, chai — none of it tracked accurately. So I built my own. Chat-based, AI-powered, pre-tuned for desi food. You just type what you ate and get instant macros. No hunting through a food database. It's been live for a few months with ~33 real users. Still early, still improving. Today felt like the right time to share it more widely. Would love your honest feedback — especially from anyone who eats South Asian food and has struggled with calorie tracking. Ask me anything! 🙏
1

Congrats on the launch! 🚀 Quick question: I saw a pop-up to download the app on my PC earlier, but it disappeared and I can't find the install option anymore. How can I download it now?

1

@shivanshumishra Hey Shivanshu, thanks for checking it out! Since calog.cc is a Progressive Web App (PWA), that pop-up was just your browser asking if you wanted to install it locally.

If you are on Chrome or Edge on your PC, you can easily pull that option back up. Just look at the right side of your URL address bar. You should see a little monitor icon with an arrow in it next to the bookmark star. If you don't see that icon, click the three dots menu in the top right corner of your browser window and look for the 'Install calog.cc' option. Let me know if you manage to find it!

0

@zairabbas Really cool seeing you build something from a personal problem first. Does the app also suggest healthier ways to prepare desi meals like using less oil or lighter ingredient alternatives or is the focus mainly on tracking calories and macros right now?

1

@marc_du_plessis Hey Marc, thanks for the question! Right now, the core focus is purely on making tracking as frictionless as possible through chat, so it just logs whatever you tell it you ate.

However, you can actually use the chat to ask for advice. If you ask something like 'how can I make my chicken karahi lower calorie?' or 'what should I have for a healthy snack?', the built-in coach will give you suggestions based on your dietary settings and your remaining calories for the day. That advice mode doesn't log anything to your timeline, it just acts as a guide.

I really like the idea of proactive prep tips though, especially since desi cooking can hide a lot of sneaky calories in oils and ghee. I will definitely add that to the roadmap for future updates.

0

Congrats on the launch! Love this idea. I’m not originally from the UK and tracking food from my country has always been surprisingly difficult. Most apps just don’t get it. Really nice to see someone building for this.

1

@nathalia_colling Thank you so much for the support, Nathalia! I completely get your frustration. Most major fitness apps are heavily catered to standard Western diets. The moment you eat a traditional or cultural meal, the entire tracking experience breaks down because you have to guess individual ingredients or look through a broken database.

While I pre-tuned this version specifically for South Asian and Pakistani food, the underlying AI chat model is actually pretty flexible with global cuisines. It is really encouraging to hear that this problem resonates outside the South Asian community too. Really appreciate you stopping by to share that!

0

If you all don't want to sign up and just want to try it first, you can try it from this link:
https://calog.cc/try

0