I taught Obsidian to listen and write my notes for me

News Room
Last updated: 2025/09/21 at 7:29 AM
News Room Published 21 September 2025

I like my Obsidian vault a lot. But NotebookLM made me realize that I could probably squeeze a lot more value out of my notes. So, I hooked a local LLM to my Obsidian vault—and it was amazing. That experiment opened the floodgates, and I started thinking about what else I could do to really maximize what Obsidian offers. You see the rabbit hole I’m digging for myself here: just one more integration, I promise.

One long-standing pain point for me has been the friction of starting a daily note. Voice journaling worked as a quick fix, but the resulting text was always unformatted and difficult to scan. What if I could voice journal directly in Obsidian and end up with a clean, formatted note instead of a messy transcript? Thanks to local LLMs and the abundance of free AI libraries, this is now possible.

My voice notes turn into fully structured entries

Raw audio to formatted notes

Before I dive into the how and what, here’s what the plugin actually does. It adds a record button to the Obsidian sidebar. I click it, and a prompt opens to record a voice note. After hitting Start, I speak my mind. Once I’m finished, I hit Stop. At that point, the audio is processed: first for transcription, then for formatting and summarization. When all of that’s done, the plugin generates a new note in the Voice Notes folder of my vault.

Each voice note contains an embedded audio player with the original recording, a summary, action tasks, key points, and the full transcript. The action tasks sometimes feel a bit overconfident, but overall, I like the result. It’s a fantastic way to quickly capture journal entries when I’m feeling lazy.

I already had openai/gpt-oss-20b hooked into Obsidian through a plugin called Private AI. That plugin supports RAG, which is great, but it doesn’t go beyond being a chatbot—it can’t create or modify notes. What I wanted was more automation.

The overall flow is straightforward: I click the record button, speak, and finish. Whisper then transcribes the audio. The transcript is passed to my local LLM, which generates a summary and action items, and that output flows back into Obsidian as a formatted note. Simple in concept, but surprisingly effective. A year ago, this would’ve felt impossible, but now we’re spoiled with user-friendly tools for running your own models. Below, I’ve written the gist of what I did. This isn’t meant to be a step-by-step programming tutorial, so I’ll spare you the boilerplate and focus on the core idea instead.

The engine under the hood

LM Studio does the heavy lifting

Image by Amir Bohlooli.

I use LM Studio. I might switch to AnythingLLM at some point, but I started with LM Studio and I’m not keen on re-downloading models right now. LM Studio makes things simple: it runs a server with an API that mimics OpenAI, so it’s compatible with a wide range of services.

By default, the chat endpoint is exposed at http://127.0.0.1:1234/v1/chat/completions. I’ve experimented with different models, and gpt-oss-20b does a decent job. To balance speed with quality, I keep the reasoning effort set to low. This is necessary since I’m running all of this on an AMD RX 6700 XT with 12GB of VRAM. You have to cut corners when your GPU isn’t top of the line.
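To make the request shape concrete, here is a minimal TypeScript sketch of a call to that endpoint. It is not the plugin's actual code. The function name sendToLMStudio comes from later in this article, but the helper names buildChatRequest and extractReply, the temperature value, and the error handling are all assumptions. The model id must match whatever is loaded in LM Studio, and the reasoning effort is assumed to be configured in LM Studio itself rather than in the request.

```typescript
// Shape of one message in an OpenAI-style chat completion request.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  stream: boolean;
}

// Default LM Studio address; change it if the local server uses another port.
const LM_STUDIO_URL = "http://127.0.0.1:1234/v1/chat/completions";

// Build the JSON body (hypothetical helper, not from the original plugin).
function buildChatRequest(
  systemPrompt: string,
  transcript: string,
  model = "openai/gpt-oss-20b"
): ChatRequest {
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: transcript },
    ],
    temperature: 0.2, // assumption: a low value keeps the formatting consistent
    stream: false,
  };
}

// Pull the assistant text out of an OpenAI-style response,
// failing loudly if the shape is not what we expect.
function extractReply(data: unknown): string {
  const content = (
    data as { choices?: { message?: { content?: unknown } }[] } | null
  )?.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Unexpected chat completion response shape");
  }
  return content.trim();
}

// Send the transcript to the local server and return the formatted Markdown.
async function sendToLMStudio(
  systemPrompt: string,
  transcript: string
): Promise<string> {
  const response = await fetch(LM_STUDIO_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(systemPrompt, transcript)),
  });
  if (!response.ok) {
    throw new Error(`LM Studio returned HTTP ${response.status}`);
  }
  return extractReply(await response.json());
}
```

Keeping the payload builder and the response parser separate from the network call makes both easy to check without a running server. Inside an Obsidian plugin, Obsidian's own requestUrl helper may be a better fit than fetch if cross-origin restrictions get in the way.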

Setting up LM Studio is dead simple. Install it, pick and download a model, load it, and run the local server.

The voice-to-text layer

Whisper makes sense of the sound

Whisper web UI in a browser
Image by Amir Bohlooli.

Of course, before the LLM can summarize anything, I need a transcript. That’s where Whisper comes in. Whisper is open-source, powerful, and surprisingly fast depending on the model size. I couldn’t get it to run on my GPU because AMD’s support in WSL is weak, so I run it inside WSL on the CPU instead. Setting up WSL is straightforward: it’s just a few commands, and nothing gets messy since the only thing exposed to the rest of the system is the transcription service.

I initially used Whisper-WebUI since this was my first time trying it out and I wanted to see it in action. There were some hiccups with Python dependencies (as usual), but I eventually got it working. The WebUI is built on Gradio and lets you upload an audio file and get a transcript, which is a handy way to verify the setup. I used the small model, which works well enough, though larger models improve accuracy if you need it.

Whisper API running in WSL
Image by Amir Bohlooli.

Once the setup is verified, the UI is no longer needed. If you’re confident, you can install the original Whisper repo directly. That said, the WebUI version can also act as an API server if you wrap it in FastAPI, so either route works fine.
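As an illustration of the client side, here is a minimal TypeScript sketch of uploading a recording to such a wrapper. Everything about the server is an assumption: the URL http://127.0.0.1:8000/transcribe, the form field name file, and the { "text": "..." } response shape are hypothetical and need to match whatever your own FastAPI wrapper exposes. Only the function name sendToWhisper comes from this article.

```typescript
// ASSUMPTION: hypothetical endpoint and form field name.
// Match both to your own FastAPI wrapper around Whisper.
const WHISPER_URL = "http://127.0.0.1:8000/transcribe";
const AUDIO_FIELD = "file";

// ASSUMPTION: the wrapper replies with JSON shaped like { "text": "..." }.
function extractTranscript(data: unknown): string {
  const text = (data as { text?: unknown } | null)?.text;
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("Whisper response did not contain a transcript");
  }
  return text.trim();
}

// Upload the recorded audio as multipart form data and wait for the transcript.
async function sendToWhisper(
  audio: Blob,
  fileName = "recording.webm"
): Promise<string> {
  const form = new FormData();
  form.append(AUDIO_FIELD, audio, fileName);
  // No Content-Type header here: fetch sets the multipart boundary itself.
  const response = await fetch(WHISPER_URL, { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`Whisper service returned HTTP ${response.status}`);
  }
  return extractTranscript(await response.json());
}
```

Rejecting an empty transcript early means a silent or failed recording stops here instead of being sent on to the LLM.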

Obsidian ties it all together

Orchestrating the workflow

With Whisper transcribing and LM Studio summarizing, the final piece was making sure Obsidian could orchestrate the flow: sending audio to Whisper, feeding text to LM Studio, and creating the formatted note. Unsurprisingly, there’s no native way to do this. I didn’t even bother checking for community plugins—I knew I’d have to build my own.

Obsidian voice note plugin code
Image by Amir Bohlooli. NAN.

Thankfully, writing an Obsidian plugin is easy. Beyond the settings, there are two core functions: sendToWhisper takes the recorded audio and sends it to Whisper, waiting for the transcription. sendToLMStudio sends the transcription along with a prompt:

You are a helpful assistant that formats voice notes for Obsidian. Return valid GitHub-flavored Markdown only. Include:

– Title as first-level heading

– A concise summary

– A “Tasks” section with actionable items as Markdown checkboxes (- [ ] …)

– A “Notes” section with key points

– Then a “Transcript” section with the raw transcript below a heading

Do not add extra commentary.

The response is then caught, formatted into a new note, and saved. I speak, Obsidian processes, and I get a clean, structured note. It takes around 40 seconds to transcribe a 4-minute voice note, which is fine, especially considering that the transcription runs on my CPU only. It’s not perfect, but it’s mine. The only drawback is that it all runs locally. I don’t have a homelab, so if I’m outside and using my phone, it won’t work. But … that’s a problem for future me.
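The final step, turning the LLM output into a saved note, could look roughly like the TypeScript sketch below. It is an illustration, not the plugin's actual code. The helper names, the file naming scheme, and the choice to place the audio embed at the top of the note are assumptions. The Voice Notes folder name and the embedded audio player come from the description earlier in the article.

```typescript
// Format a Date as "YYYY-MM-DD HH-mm-ss" in local time.
// Colons are avoided because they are not valid in Windows file names.
function timestamp(date: Date): string {
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return (
    `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())} ` +
    `${pad(date.getHours())}-${pad(date.getMinutes())}-${pad(date.getSeconds())}`
  );
}

// Vault-relative path for the new note (hypothetical naming scheme).
function notePath(date: Date, folder = "Voice Notes"): string {
  return `${folder}/Voice note ${timestamp(date)}.md`;
}

// Combine an Obsidian embed of the recording with the Markdown from the LLM.
// ASSUMPTION: the audio player sits above the generated content.
function buildNote(audioFileName: string, llmMarkdown: string): string {
  return `![[${audioFileName}]]\n\n${llmMarkdown.trim()}\n`;
}
```

Inside a plugin, the resulting path and content could then be handed to Obsidian's vault API, for example this.app.vault.create(path, content), after making sure the target folder exists.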
