I've made an open version of Google's NotebookLM, and it shows the superiority of the open source tech stack! 💪
The app's workflow is simple. Given a source PDF or URL, it extracts the content, then tasks Meta's Llama 3.3-70B with writing the podcast script, using a good prompt crafted by @gabrielchua ("two hosts, with lively discussion, fun notes, insightful questions, etc.").
Then it hands off the text-to-speech conversion to Kokoro-82M, and there you go: you have two hosts discussing any article.
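To make that flow concrete, here is a minimal sketch of the three steps (source text in, script from the LLM, audio from the TTS). It is not the app's actual code: the names `DialogueLine`, `write_script`, `make_podcast`, the JSON script format and the prompt wording are all assumptions made for illustration, and `fake_llm` / `fake_tts` are stand-ins for the real Llama 3.3-70B and Kokoro-82M calls.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DialogueLine:
    speaker: str  # hypothetical labels, e.g. "Host 1" / "Host 2"
    text: str


# Paraphrase of the prompt idea quoted in the post, NOT the real prompt.
SYSTEM_PROMPT = (
    "Write a podcast script as a JSON list of objects with keys "
    "'speaker' and 'text'. Two hosts, lively discussion, fun notes, "
    "insightful questions."
)


def write_script(source_text: str, llm: Callable[[str, str], str]) -> List[DialogueLine]:
    """Ask the LLM for a script and parse it into dialogue lines."""
    raw = llm(SYSTEM_PROMPT, source_text)
    return [DialogueLine(d["speaker"], d["text"]) for d in json.loads(raw)]


def make_podcast(
    source_text: str,
    llm: Callable[[str, str], str],
    tts: Callable[[str, str], bytes],
) -> List[bytes]:
    """Extracted text -> script -> one audio clip per dialogue line."""
    script = write_script(source_text, llm)
    return [tts(line.text, line.speaker) for line in script]


# Stand-ins so the sketch runs offline; swap in real model calls.
def fake_llm(system_prompt: str, user_text: str) -> str:
    return json.dumps([
        {"speaker": "Host 1", "text": "Welcome!"},
        {"speaker": "Host 2", "text": "Today: " + user_text[:20]},
    ])


def fake_tts(text: str, speaker: str) -> bytes:
    return f"{speaker}: {text}".encode()
```

Passing the LLM and TTS in as plain callables keeps the pipeline independent of any particular provider, so either model can be swapped without touching the flow itself.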
The generation is nearly instant, because:
> Llama 3.3 70B is running at 1,000 tokens/second with Cerebras inference
> The audio is generated in streaming mode by the tiny (yet powerful) Kokoro, generating voices faster than real-time.
And the audio generation runs for free on ZeroGPU, hosted by HF on H200s.
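A rough sketch of why those two points add up to a near-instant result: script latency is just token count divided by throughput, and streaming means the first clip can play while later lines are still being synthesized. The 1,500-token script length in the usage note is an assumed figure, and `script_latency_s`, `stream_audio` and `real_time_factor` are hypothetical helper names, not part of the app.

```python
from typing import Callable, Iterable, Iterator, Tuple


def script_latency_s(n_tokens: int, tokens_per_s: float = 1000.0) -> float:
    """Seconds needed to generate a script at a given throughput."""
    return n_tokens / tokens_per_s


def stream_audio(
    lines: Iterable[Tuple[str, str]],
    tts: Callable[[str, str], bytes],
) -> Iterator[bytes]:
    """Yield each clip as soon as it is synthesized, instead of
    waiting for the whole episode to be rendered."""
    for speaker, text in lines:
        yield tts(text, speaker)


def real_time_factor(synthesis_s: float, audio_s: float) -> float:
    """Below 1.0 means audio is produced faster than it plays back."""
    return synthesis_s / audio_s
```

For example, `script_latency_s(1500)` gives 1.5 seconds for an assumed 1,500-token script, and because `stream_audio` is a generator, playback can start after the first line is synthesized.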
Overall, open source solutions rival the quality of closed-source solutions at close to no cost!
Try it here: m-ric/open-notebooklm