Smart Assistant Network
Tag aggregation: Assistant

/tag/Assistant

linux.do · 2026-04-18 15:12:59+08:00 · tech

Finally got it working; writing it down here.

After installing the xiaomi-miot integration in Home Assistant (HA): Settings → Devices & Services → search for Xiaomi Miot, then find the speaker you want to control in the device list. I have two Xiao AI speakers:
- 小米小爱音箱 Pro, called Mi AI Speaker Pro in Miot, model xiaomi.wifispeaker.lx06
- 小米智能音箱 Pro, called Xiaomi Smart Speaker Pro in Miot, model xiaomi.wifispeaker.oh2p

Open the device control page and you'll find quite a few controls, for example "Play Text" and "Execute Text Command". Click the icon in front of the "Play Text" feature, then the settings icon in the top right, and you can see that feature's entity identifier, which I understand to be a kind of function id.

With that identifier, plus a long-lived access token generated in HA, you can write a script that makes the Xiao AI speaker talk. This is the code 龙虾 ("Lobster") wrote for me; it passed my tests.

1 post - 1 participant
Read the full topic
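The post's actual script is not included in this digest. As a sketch of what such a script can look like, here is a minimal Python example that calls HA's REST API with a long-lived token. It assumes the "Play Text" control is exposed as a HA `text` entity (so the standard `text.set_value` service applies); the host, token, and `ENTITY_ID` are placeholders you would replace with your own values from the Miot device page.

```python
# Minimal sketch: drive the speaker's "Play Text" entity via Home
# Assistant's REST API. Host, token, and entity id are placeholders.
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"   # assumed HA address
TOKEN = "YOUR_LONG_LIVED_TOKEN"              # created under your HA profile
ENTITY_ID = "text.xiaomi_lx06_play_text"     # hypothetical entity identifier

def build_request(text: str) -> urllib.request.Request:
    """Build the POST that asks HA's text.set_value service to speak `text`."""
    payload = json.dumps({"entity_id": ENTITY_ID, "value": text}).encode()
    return urllib.request.Request(
        f"{HA_URL}/api/services/text/set_value",
        data=payload,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually speak:
#   urllib.request.urlopen(build_request("晚饭好了"))
```

If the integration exposes the feature through a different service, the same request shape applies; only the `/api/services/<domain>/<service>` path changes.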

linux.do · 2026-04-17 21:44:45+08:00 · tech

Preface: I learned that this works via the ACP protocol, which lets a local Claude Code talk to IDEA. I'm not sure how its capabilities differ from the CC GUI a fellow forum member built ([open-source self-promotion] IDEA Claude Code GUI plugin (v0.2)); as far as I can tell both cover the main features, and that CC GUI can additionally track token usage and do one-click commits. If you've compared them, please leave a comment.

Reference environment:
- Windows 11 (macOS/Linux etc. also work);
- IDEA 2026.1 (other versions untested);
- Claude Code installed locally;
- CC Switch installed locally;
- an API key for an LLM subscription.

Install steps:
1. Run pnpm install -g @zed-industries/claude-code-acp. If you don't have pnpm, first run npm install -g pnpm, then pnpm setup.
2. Open a new terminal and run pnpm bin -g to find the directory containing claude-code-acp, usually under C:\Users\<your username>\AppData\Local\pnpm.
3. In IDEA, open the AI Assistant plugin (it only needs to be installed, not activated) and click "Add custom agent".
4. In acp.json, select all and paste the following:

   {
     "default_mcp_settings": {
       "use_idea_mcp": true,
       "use_custom_mcp": true
     },
     "agent_servers": {
       "Claude Code": {
         "command": "C://Users/<your username>/AppData/Local/pnpm/claude-code-acp.cmd"/*,
         "env": {
           "CLAUDE_CODE_GIT_BASH_PATH": "D:\\Git\\bin\\bash.exe"
         }*/ // The commented-out block is the path to my Git bash.exe; I had to add it to run without errors, reason unknown. Use it as a reference.
       }
     }
   }

5. Configure your LLM API key in CC Switch (GLM, for example).
6. Done.

Troubleshooting: if local Claude Code keeps jumping to the login flow, add "hasCompletedOnboarding": true, inside the outermost braces of .claude.json; that file usually lives under C:\Users\Administrator.

1 post - 1 participant
Read the full topic
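The troubleshooting step above (setting "hasCompletedOnboarding": true at the top level of .claude.json) can be done by hand in any editor, but a small script makes the edit repeatable. This is a sketch under the post's assumptions; the file path varies by machine, so it is taken as a parameter.

```python
# Sketch of the "skip login" fix: set hasCompletedOnboarding=true at the
# top level of .claude.json (on the poster's machine the file lived
# under C:\Users\Administrator; pass your own path).
import json
from pathlib import Path

def mark_onboarded(claude_json: Path) -> dict:
    """Add hasCompletedOnboarding=true to the given .claude.json, preserving other keys."""
    if claude_json.exists():
        config = json.loads(claude_json.read_text(encoding="utf-8"))
    else:
        config = {}
    config["hasCompletedOnboarding"] = True   # top-level key, per the post
    claude_json.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config

# Usage (path is an assumption):
#   mark_onboarded(Path.home() / ".claude.json")
```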

hnrss.org · 2026-04-16 16:31:06+08:00 · tech

Hey HN, I built Emailbottle because I wanted AI help with my email without giving an app full access to my inbox.

You get a personal email address (like [email protected]). Forward an email to it with an instruction like "summarize this" or "pull out the action items," and it replies. You can also ask it to create calendar events (it sends back .ics files), set reminders (it emails you at the scheduled time), or draft replies.

Since it's just email, it works with Gmail, Outlook, Apple Mail, or whatever you use. No plugins, no extensions, no OAuth permissions. And because you forward individual emails rather than connecting your inbox, Emailbottle only ever sees what you explicitly send it.

It also supports conversation threads: you can reply to Emailbottle's response to ask follow-up questions or refine what it gave you, and it remembers the context.

Would love any feedback on the product or the approach. (Disclaimer: this is a relaunch with improvements based on earlier feedback.)

Comments URL: https://news.ycombinator.com/item?id=47790270
Points: 3
# Comments: 0
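For readers unfamiliar with the .ics files mentioned above: an iCalendar event is just structured text that any mail client can import. This is not Emailbottle's code, only an illustrative sketch of the kind of single-event payload such a service could send back, built by hand against the RFC 5545 shape.

```python
# Illustrative sketch: a minimal single-event iCalendar (.ics) document
# of the kind a "create a calendar event" reply could attach.
from datetime import datetime, timedelta

def make_ics(summary: str, start: datetime, minutes: int = 60) -> str:
    """Return a one-event VCALENDAR document as a string (CRLF line endings)."""
    fmt = "%Y%m%dT%H%M%S"
    end = start + timedelta(minutes=minutes)
    return "\r\n".join([
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//ics-sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{start.strftime(fmt)}@example.invalid",  # placeholder UID
        f"DTSTART:{start.strftime(fmt)}",
        f"DTEND:{end.strftime(fmt)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ])
```

Saved with a .ics extension, this opens directly in Gmail, Outlook, or Apple Calendar, which is why plain email is a sufficient delivery channel.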

hnrss.org · 2026-04-14 01:05:09+08:00 · tech

I am working on Hitoku Draft, an open-source, voice-first AI assistant that runs entirely locally. No cloud models; nothing leaves your machine. You press a hotkey and talk. It's context-aware: it reads your screen, documents, and active app to understand what you're working on. You can ask about PDFs, reply to emails, create calendar events, use web search, all by voice. It supports Gemma 4 and Qwen 3.5 for text generation, plus multiple STT backends (Parakeet, Whisper, Qwen3-ASR).

Examples:
- Gemma4 in action: https://www.youtube.com/watch?v=OgfI-3YjEVU
- query a PDF document: https://www.youtube.com/watch?v=ggaDhut7FnU
- reply to email: https://www.youtube.com/watch?v=QFnHXMBp1gA
- and the usual voice dictation (with optional polishing)

I currently use it a lot with Claude Code, Obsidian, and Apple Notes, or just to read papers.

Code: https://github.com/Saladino93/hitokudraft/tree/litert
Binary download: https://hitoku.me/draft/ (free with code HITOKUHN2026)

I am looking for feedback. My goal is to do AI research with clients interfacing, and I thought this was a nice little experiment to iterate/fail quickly on.

P.S. (if anyone has tips about this) The current Gemma4 implementation (with small models) has some problems:
- It hallucinates easily on long contexts, so I have to reset it often. I tuned some parameters, but still need to find a sweet spot.
- Gemma4 with LiteRT is currently fast compared to the MLX implementation of Qwen3.5 (roughly 3x faster on my machine when dealing with images), but at the price of memory spikes. I believe this is because LiteRT's WebGPU backend can allocate significantly more GPU memory than the model weights alone (I saw 38GB of memory taken for the ~4GB E4B model!). I guess we need to wait for Google on this.
- App size: because there is no official Swift package from Google yet, I have to bundle some files (LiteRT dylibs), which adds ~98 MB over the previous MLX-only version (the app goes from ~50 MB to ~150 MB total).

If any of this bothers you: use Qwen 3.5 instead (pure MLX), or wait for the upstream fixes from Google :) Otherwise, in the mid-term I plan to switch to a potentially slower but safer MLX version of Gemma4 (hopefully this weekend).

Comments URL: https://news.ycombinator.com/item?id=47755000
Points: 2
# Comments: 0
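The hotkey → STT → screen-context → text-model flow described above can be sketched as a small pipeline with swappable backends, mirroring how the app swaps Parakeet/Whisper/Qwen3-ASR and Gemma/Qwen models. This is not Hitoku Draft's actual code (which is Swift/MLX/LiteRT); all names here are illustrative, and only the control flow is the point.

```python
# Architectural sketch only: one assistant turn with pluggable STT and
# text-model backends, as in a local voice-first assistant.
from typing import Protocol

class STTBackend(Protocol):
    def transcribe(self, audio: bytes) -> str: ...

class TextModel(Protocol):
    def generate(self, prompt: str) -> str: ...

def handle_hotkey(audio: bytes, screen_context: str,
                  stt: STTBackend, llm: TextModel) -> str:
    """Transcribe the spoken request, prepend on-screen context, then generate."""
    question = stt.transcribe(audio)
    prompt = (
        f"Context from the active app:\n{screen_context}\n\n"
        f"User said: {question}"
    )
    return llm.generate(prompt)
```

Keeping the backends behind small interfaces is what makes "use Qwen 3.5 instead" a one-line swap rather than a rewrite.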

hnrss.org · 2026-04-12 19:48:16+08:00 · tech

Hi HN, we built Sova AI (https://ayconic.io/sova), an Android assistant agent that actually controls and operates your apps. It's not a chat and not another LLM wrapper.

We were incredibly frustrated with the current state of mobile AI. Built-in assistants like Gemini are deeply integrated into the OS, yet if you ask them to "Order an Uber to the airport", they mostly just give you web search results or a button to open the app yourself. They don't do the work. (The Perplexity "assistant" is just a browser agent :/ ) So we built an agent that does operate your phone. (NO root, NO adb, NO PC, NO appium/whatever, NO usb, NO browser)

How it works: you give Sova a prompt, either voice or text, and you can make it the default assistant if you like. Instead of relying on non-existent official app APIs, Sova acts as a virtual human: it clicks, scrolls, types, etc. It uses the Android Accessibility API to read the screen's UI node tree.

About AI models: we currently support the main AI cloud providers (OpenAI, Gemini, Anthropic, Deepseek, etc.) and are working toward support for local AI models on your host (Ollama, LM Studio, etc.).

Pricing: 100% free / Bring Your Own Key (BYOK). We aren't charging for the Sova engine right now. We built a BYOK system: you plug in your own API key (OpenAI, Claude, whatever you prefer), and you only pay the provider for the tokens you use.

We figured out how to do this entirely on-device as a standard Kotlin app. No tethering to a PC, no Appium, no root, and no Shizuku/ADB workarounds. Just an app even your granny can use.

The Google Play ban: because we use the Accessibility API for "universal automation" (literally mapping and clicking other apps), Google Play rejected our submission. It's ironic: they banned us for building the exact agentic behavior that Gemini promises but fails to deliver. So we are hosting the APK ourselves: https://sova.ayconic.io

We'd love for you to download the APK, plug in your key, and try to break it. What apps completely confuse the agent?

Roadmap: support for local models with Ollama, LM Studio, or other tools; predefined rules and personas for your tasks; detailed statistics; support for OpenRouter, enterprise Amazon Bedrock, Google Vertex, and Azure Foundry models; support for iOS. What would you like to see more of?

We'd be happy to hear your feedback, and your success and failure stories. Video demo: https://www.youtube.com/watch?v=r-x6hRmtBy0 and APK: https://ayconic.io/sova. We are here to answer your questions and listen to feedback on Telegram and Discord. It's not perfect yet, but it does its work.

Comments URL: https://news.ycombinator.com/item?id=47738583
Points: 2
# Comments: 1
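The "virtual human" loop the post describes (read the UI node tree, ask a model for one action, perform it, repeat) is a standard perceive-decide-act agent loop. Sova does this on-device in Kotlin against the Android Accessibility API; the sketch below is only an illustration of that control flow in Python, with hypothetical types and a pluggable decide() step standing in for the BYOK model call.

```python
# Control-flow sketch of a perceive->decide->act UI agent loop.
# Not Sova's code; all types and names here are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    kind: str        # "click", "type", "scroll", or "done"
    target: str = "" # e.g. an accessibility-node label
    text: str = ""   # text to type, for "type" actions

def run_agent(goal: str,
              read_screen: Callable[[], str],
              decide: Callable[[str, str], Action],
              perform: Callable[[Action], None],
              max_steps: int = 20) -> bool:
    """Loop: read the UI tree, ask the model for one action, perform it."""
    for _ in range(max_steps):
        action = decide(goal, read_screen())
        if action.kind == "done":
            return True
        perform(action)
    return False  # step budget exhausted without finishing the goal
```

The step budget matters in practice: it bounds token spend under BYOK pricing and stops the agent from looping forever in an app that confuses it.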