@onekq on Hugging Face: "Open source models are immutable, this is a big pain. When you open source a…"

24 days ago

You are right, not many people are speaking up, even though the community looks very busy judging by the download numbers. I should tell you that I was using your OneSQL model, but Microsoft Phi-4 turns out to work just fine for me. It gives me good results, so I don't switch to OneSQL very often to try it out.

A released model can't be fixed in place, but it can be fine-tuned, and you are right that free software can be taken by anybody and used however they wish.

I think SQL responses could be integrated better into the workflow. For example, they could be built directly into the command-line tool psql.
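A minimal sketch of that kind of integration, written as a shell wrapper around psql rather than a change to psql itself. `my-text-to-sql` is a hypothetical placeholder for whatever command turns a question into SQL (a local Phi-4 or OneSQL call, for instance); it is not a real tool, and `ask_sql`, `SQL_GENERATOR` and `SQL_RUNNER` are names made up for this example.

```shell
# SQL_GENERATOR: hypothetical command that reads a question on stdin and
# prints one SQL statement on stdout. Replace the placeholder with your own.
SQL_GENERATOR=${SQL_GENERATOR:-my-text-to-sql}
# SQL_RUNNER: any command that reads SQL on stdin; psql does this by default.
SQL_RUNNER=${SQL_RUNNER:-psql -X -v ON_ERROR_STOP=1}

ask_sql() {
    question="$1"
    # Word splitting on the unquoted variables is intentional, so that each
    # can hold a command plus its arguments.
    sql=$(printf '%s\n' "$question" | $SQL_GENERATOR) || return 1
    if [ -z "$sql" ]; then
        echo "ask_sql: no SQL was generated" >&2
        return 1
    fi
    # Show the generated statement on stderr before running it.
    printf '%s\n' "$sql" >&2
    printf '%s\n' "$sql" | $SQL_RUNNER
}
```

With a real generator configured, `ask_sql "total words spoken per day"` would print the generated statement and then the query result. Since the model can produce any statement, it seems safer to run this with a read-only database role.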

Here are some examples of what I did today.

📊 RCD Notes: Daily Average Spoken Pages

\pset border 2
\pset linestyle unicode

WITH word_count AS (
    SELECT
        speech_datecreated::date AS date,
        -- Approximate word count per row: number of spaces plus one
        SUM(LENGTH(speech_text) - LENGTH(REPLACE(speech_text, ' ', '')) + 1) AS total_words
    FROM
        public.speech
    GROUP BY
        speech_datecreated::date
)
SELECT
    date,
    total_words,
    -- Pages per day, at 500 words per page
    ROUND(total_words::numeric / 500, 2) AS average_spoken_pages
FROM
    word_count
ORDER BY
    date;

Output

┌────────────┬─────────────┬──────────────────────┐
│    date    │ total_words │ average_spoken_pages │
├────────────┼─────────────┼──────────────────────┤
│ 2025-03-05 │        2011 │                 4.02 │
│ 2025-03-06 │        2391 │                 4.78 │
│ 2025-03-07 │         834 │                 1.67 │
│ 2025-03-08 │         286 │                 0.57 │
│ 2025-03-09 │        1142 │                 2.28 │
│ 2025-03-10 │        1680 │                 3.36 │
│ 2025-03-11 │         526 │                 1.05 │
│ 2025-03-12 │         814 │                 1.63 │
│ 2025-03-13 │        1453 │                 2.91 │
│ 2025-03-16 │         291 │                 0.58 │
│ 2025-03-17 │         969 │                 1.94 │
│ 2025-03-18 │         713 │                 1.43 │
│ 2025-03-19 │        1891 │                 3.78 │
│ 2025-03-20 │        1564 │                 3.13 │
│ 2025-03-21 │        1363 │                 2.73 │
│ 2025-03-22 │        3214 │                 6.43 │
│ 2025-03-23 │        5671 │                11.34 │
│ 2025-03-24 │        2005 │                 4.01 │
│ 2025-03-25 │        3357 │                 6.71 │
│ 2025-03-26 │        2165 │                 4.33 │
│ 2025-03-27 │        9052 │                18.10 │
│ 2025-03-28 │       12408 │                24.82 │
└────────────┴─────────────┴──────────────────────┘
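To make the arithmetic behind this table concrete, here is a small shell sketch of the same two steps: the query estimates words as "spaces plus one" for each row, then divides the day's total by 500 words per page. The function names are made up for this illustration. One side effect of the space-counting approach is that consecutive spaces inflate the count.

```shell
# Same idea as LENGTH(t) - LENGTH(REPLACE(t, ' ', '')) + 1 in the query.
count_words_by_spaces() {
    text="$1"
    no_spaces=$(printf '%s' "$text" | tr -d ' ')
    echo $(( ${#text} - ${#no_spaces} + 1 ))
}

# Pages at 500 words per page, two decimals, like ROUND(total_words / 500, 2).
words_to_pages() {
    LC_ALL=C awk -v w="$1" 'BEGIN { printf "%.2f\n", w / 500 }'
}

count_words_by_spaces "please stop recording"    # prints 3
count_words_by_spaces "please  stop recording"   # prints 4: the double space counts as an extra word
words_to_pages 2011                              # prints 4.02, matching the 2025-03-05 row
```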

📊 RCD Notes: Speech size per day

\pset border 2
\pset linestyle unicode

SELECT 
    speech_datecreated::date AS date,
    SUM(length(speech_text)) AS total_text_size_in_chars,
    ROUND(SUM(length(speech_text))::numeric / 1024, 2) AS total_text_size_in_kb
FROM 
    public.speech
GROUP BY 
    speech_datecreated::date
ORDER BY 
    speech_datecreated::date;

Output

┌────────────┬──────────────────────────┬───────────────────────┐
│    date    │ total_text_size_in_chars │ total_text_size_in_kb │
├────────────┼──────────────────────────┼───────────────────────┤
│ 2025-03-05 │                    10103 │                  9.87 │
│ 2025-03-06 │                    12568 │                 12.27 │
│ 2025-03-07 │                     4355 │                  4.25 │
│ 2025-03-08 │                     1474 │                  1.44 │
│ 2025-03-09 │                     5847 │                  5.71 │
│ 2025-03-10 │                     8917 │                  8.71 │
│ 2025-03-11 │                     2811 │                  2.75 │
│ 2025-03-12 │                     4117 │                  4.02 │
│ 2025-03-13 │                     7455 │                  7.28 │
│ 2025-03-16 │                     1554 │                  1.52 │
│ 2025-03-17 │                     5036 │                  4.92 │
│ 2025-03-18 │                     3917 │                  3.83 │
│ 2025-03-19 │                     9420 │                  9.20 │
│ 2025-03-20 │                     8183 │                  7.99 │
│ 2025-03-21 │                     6819 │                  6.66 │
│ 2025-03-22 │                    16401 │                 16.02 │
│ 2025-03-23 │                    28514 │                 27.85 │
│ 2025-03-24 │                     9844 │                  9.61 │
│ 2025-03-25 │                    16094 │                 15.72 │
│ 2025-03-26 │                    10238 │                 10.00 │
│ 2025-03-27 │                    44336 │                 43.30 │
│ 2025-03-28 │                    63773 │                 62.28 │
└────────────┴──────────────────────────┴───────────────────────┘

Shell snippet to check spoken commands by embeddings

transcript=$(<"${temp_file}")

# Get the embedding and find the closest match
embedding=$(echo "$transcript" | rcd-llm-get-embeddings.sh)
if [ -z "$embedding" ]; then
    log "ERROR" "Failed to generate embedding"
    exit 1
fi

# Properly escape for SQL (PostgreSQL dollar-quoting)
escaped_embedding="\$VECTOR\$${embedding}\$VECTOR\$"

# Query for matches
speech_id=$(psql -t <<EOF
    SELECT speech_id
    FROM speech
    WHERE speech_embeddings IS NOT NULL
      AND speech_speechtypes = 13
      AND speech_embeddings <=> ${escaped_embedding}::vector < 0.3
    ORDER BY speech_embeddings <=> ${escaped_embedding}::vector
    LIMIT 1
EOF
)

# Clean and validate the result
speech_id=$(echo "$speech_id" | tr -d '[:space:]')
if [[ -n "$speech_id" && "$speech_id" =~ ^[0-9]+$ ]]; then
    log "INFO" "Match found (speech_id: $speech_id). Aborting."
    exit 1
fi
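A note on the `<=>` comparison in the query: assuming `speech_embeddings` is a pgvector `vector` column, `<=>` is the cosine distance operator, that is, 1 minus the cosine similarity. A distance of 0 means the vectors point the same way and 1 means they are orthogonal, so `< 0.3` accepts only fairly close matches. The sketch below reproduces that arithmetic outside the database on tiny hand-made vectors; `cosine_distance` and `is_match` are illustrative names, not part of the script above.

```shell
# Cosine distance between two comma-separated vectors: 1 - (a.b) / (|a| * |b|)
cosine_distance() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ","); m = split(b, y, ",")
        if (n != m || n == 0) exit 1          # dimension mismatch or empty input
        for (i = 1; i <= n; i++) {
            dot += x[i] * y[i]; na += x[i] * x[i]; nb += y[i] * y[i]
        }
        if (na == 0 || nb == 0) exit 1        # undefined for zero vectors
        printf "%.4f\n", 1 - dot / (sqrt(na) * sqrt(nb))
    }'
}

# Succeeds when the distance is below the threshold (default 0.3, as in the query).
is_match() {
    d=$(cosine_distance "$1" "$2") || return 2
    LC_ALL=C awk -v d="$d" -v t="${3:-0.3}" 'BEGIN { exit !(d + 0 < t + 0) }'
}

cosine_distance "1,0" "1,0"    # prints 0.0000 (same direction)
cosine_distance "1,0" "1,1"    # prints 0.2929 (45 degrees apart, just under 0.3)
cosine_distance "1,0" "0,1"    # prints 1.0000 (orthogonal)
```

Real embeddings have hundreds or thousands of dimensions, so in practice the comparison belongs in the database as in the snippet above; this is only meant to show what the threshold measures.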

Use case

When speaking, I sometimes say something incoherent and then want to stop the process. In my case I just say "Aria, please stop recording", and even if I said other things along with it, the command gets recognized and speech transcription stops.

To see how speech and transcription work, watch this video:
https://www.youtube.com/watch?v=51jEUtjrARo

I serve many people in an executive role, so you can take it that SQL snippets like these are useful to people worldwide. 🖥️ Thank you once again! 😊🙏

John6666

23 days ago

For example, if it was a generative model for images or audio, it would be possible to post the generated output and provide feedback, but it's difficult to provide feedback for LMs, LLMs, and VLMs...

The weights are open, but it's not like a human can directly decipher the meaning of the differences in the weights. Unless the model can't be loaded or the output is strange, it's not easy to think of reporting it.
It's hard to report on problems that aren't really serious...

Evaluation is built into Hugging Face itself, and there are also leaderboards. As a user, I can only think of pressing the like button or not, or adding to the collection...
Well, it doesn't seem like there's an easy solution. It's often a problem with proprietary content that it's difficult for feedback to reach the creator, and the creator feels lonely.🤗

ruliana22

23 days ago

good
