March 2024

Launch HN: Aqua Voice (YC W24) – Voice-driven text editor
353 by the_king | 125 comments on Hacker News.
Hey HN! We’re Jack and Finn from Aqua Voice (https://withaqua.com/). Aqua is a voice-native document editor that combines reliable dictation and natural language commands, letting you say things like: “make this a list” or “it’s Erin with an E” or “add an inline citation here for page 86 of this book”. Here is a demo: https://youtu.be/qwSAKg1YafM
Finn, who is big-time dyslexic, has been using dictation software since the sixth grade when his dad set him up on Dragon Dictation. He used it through school to write papers, and has been keeping his own transcription benchmarks since college. All that time, writing with your voice has remained a cumbersome and brittle experience that is riddled with pain points.
Dictation software is still terrible. All the solutions basically compete on accuracy (i.e. speech recognition), but none of them deal with the fundamentally brittle nature of the text that they generate. They don't try to format text correctly, and they require you to learn a bunch of specialized commands, which often are not worth it. They're not even close to a voice replacement for a keyboard.
Even post-LLM, you are limited to a set of specific commands, and the most accurate models don’t have any commands. Outside of these rules, the models have no sense of what is an instruction and what is content. You can’t say “and format this like an email” or “make the last bullet point shorter”. Aqua solves this.
This problem is important to Finn and millions of other people who would write with their voice if they could. Initially, we didn't think of it as a startup project. It was just something we wanted for ourselves. We thought maybe we'd write a novel with it, or something. After friends started asking to use the early versions of Aqua, it occurred to us that, if we didn't build it, maybe nobody would.
Aqua Voice is a text editor that you talk to like a person. Depending on the way that you say it and the context in which you're operating, Aqua decides whether to transcribe what you said verbatim, execute a command, or subtly modify what you said into what you meant to write. For example, if you were to dictate: "Gryphons have classic forms resembling shield volcanoes," Aqua would output your text verbatim. But if you stumble over your words or start a sentence over a few times, Aqua is smart enough to figure that out and to only take the last version of the sentence.
The vision is not only to provide a more natural dictation experience, but to enable for the first time an AI-writing experience that feels natural and collaborative. This requires moving away from using LLMs for one-off chat requests and towards something that is more like streaming, where you are in constant contact with the model. Voice is the natural medium for this. Aqua is actually 6 models working together to transcribe, interpret, and rewrite the document according to your intent.
Technically, executing a real-time voice application with a language model at its core requires complex coordination between multiple pieces. We use MoE transcription to outperform what was previously thought possible in terms of real-time accuracy. Then we sync up with a language model to determine what should be on the screen as quickly as possible.
The model isn't perfect, but it is ready for early adopters and we’ve already been getting feedback from grateful users. For example, a historian with carpal tunnel sent us an email he wrote using Aqua and said that he is now able to be five times as productive as he was previously. We've heard from other people with disabilities that prevent them from typing. We've also seen good adoption from people who are dyslexic or simply prefer talking to typing. It’s being used for everything from emails to brainstorming to papers to legal briefings.
While there is much left to do in terms of latency and robustness, the best experiences with Aqua are beginning to feel magical. We would love for you to try it out and give us feedback, which you can do with no account on https://withaqua.com . If you find it useful, it’s $10/month after a 1000-token free trial. (We want to bump the free trial in the future, but we're a small team, and running this thing isn’t cheap.) We’d love to hear your ideas and comments on voice-to-text!

Show HN: Memories – FOSS Google Photos alternative built for high performance
677 by radialapps | 200 comments on Hacker News.
Memories is a FOSS Google Photos alternative that you can self-host (it runs as a Nextcloud plugin).
Website: https://ift.tt/qW7Ocad
GitHub: https://ift.tt/Za4ztPu
Demo Server: https://ift.tt/Mzn25XR (demo runs in San Francisco on a free-tier cloud VM)
Memories has been built ground-up for high performance and is extremely fast when configured correctly. In our testing environment, it can load a timeline view with 100k photos in under 500ms, including query and rendering time!
Some features to highlight:
* A timeline similar to Google Photos where you can skip to any time in history instantly.
* AI-based tagging that runs locally on your server, identifying and tagging people and objects.
* Albums and external sharing.
* Metadata editing support.
* A world map of your photos, supported both on mobile and the web.
* Did I mention it's extremely fast?
Would love to hear feedback from the HN community! :)

Ask HN: Do you also marvel at the complexity of everyday objects?
419 by parpfish | 294 comments on Hacker News.
A few weeks ago I was doing some soldering and I started using a spool of insulated 22-gauge wire. Maybe it was the solder fumes, but I started thinking about what it actually took to create that spool of wire -- everything from the geologists and miners extracting ore, through all the metallurgy, industrial engineering, and plastics work. And I started to marvel at all the work and expertise it took to make something that I normally would've just considered a semi-disposable consumable item. It made me wonder whether that spool of wire was actually a piece of technology on par in sophistication with all the software that I build every day. It was such an odd moment, but it has caused a lasting perspective shift. Almost every day I'll look at some commonplace object I took for granted and think "this is actually so complex, no single human has all the knowledge or expertise to create it". I'm curious if anybody else has had a similar experience, and what are some simple everyday objects that give you pause when you stop to think about their complexity?
