rawistt / README.md
walker11's picture
Upload README.md
0ee9702 verified

A newer version of the Gradio SDK is available: 5.35.0

Upgrade
metadata
title: RAWI Voice to Story Generator
emoji: πŸ“
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
python_version: 3.9

RAWI Voice to Story Generator

This Hugging Face Space converts Arabic voice recordings into polished stories using Whisper for speech recognition and DeepSeek API for creative text generation.

How It Works

  1. Upload or record an Arabic audio clip
  2. The system transcribes the speech using OpenAI's Whisper model
  3. The transcript is sent to DeepSeek API to generate an enhanced story
  4. Both the original transcript and the generated story are displayed

Setup

This Space requires a DeepSeek API key to work properly. When deploying:

  1. Go to the Settings tab of your Space
  2. Add your DeepSeek API key as a secret named DEEPSEEK_API_KEY
  3. (Optional) If needed, change the API endpoint by adding DEEPSEEK_API_URL

Deploying to Hugging Face Spaces

To deploy this application to Hugging Face Spaces:

  1. Create a new Space on Hugging Face
  2. Select "Gradio" as the SDK
  3. Upload the contents of this directory to your Space
  4. Set the required secrets in the Space settings
  5. Choose a suitable hardware tier (recommend at least CPU-M)

Local Development

To run this project locally:

  1. Clone this repository
  2. Install dependencies: pip install -r requirements.txt
  3. Set environment variables:
    export DEEPSEEK_API_KEY=your_deepseek_api_key
    
  4. Run the application: python app.py

Technologies Used

  • Whisper: AI-powered speech recognition model
  • Gradio: Web interface for ML applications
  • DeepSeek API: Arabic text generation and enhancement

Note

This application is designed for Arabic language content. Using other languages may result in suboptimal performance.