# Summarize YouTube videos into text

### Installing necessary libraries


In [None]:
%pip install git+https://github.com/openai/whisper.git

In [None]:
%pip install pytube
%pip install openai
%pip install numpy
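Both Whisper and the MP3 conversion step in this notebook call the `ffmpeg` command-line tool, which `pip` does not install. The following is a minimal sketch for checking that it is on the `PATH` before continuing; the helper name `ffmpeg_available` is hypothetical and not part of any library.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg not found; install it before running the cells below.")
```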

### Downloading audio clip from the YouTube video

In [3]:
from pytube import YouTube
import whisper
import os
import subprocess

In [4]:
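def download_youtube_audio(url, destination):
    """Download a video's audio track and convert it to MP3 (requires ffmpeg)."""
    # Create a YouTube object
    yt = YouTube(url)

    # Select the first audio-only stream
    audio_stream = yt.streams.filter(only_audio=True).first()
    if audio_stream is None:
        raise ValueError(f"No audio-only stream found for {url}")

    # Download the audio stream
    out_file = audio_stream.download(output_path=destination)

    # Set up new filename
    base, _ = os.path.splitext(out_file)
    new_file = base + '.mp3'

    # Convert to MP3; -y overwrites an existing file, check=True raises if ffmpeg fails
    subprocess.run(['ffmpeg', '-y', '-i', out_file, new_file], check=True)

    # Remove the original file only after a successful conversion
    os.remove(out_file)

    print(f"Downloaded and converted to MP3: {new_file}")
    return new_file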

### Input YouTube link

In [72]:
url = input("Enter the YouTube URL: ")

#url = 'https://www.youtube.com/watch?v=reUZRyXxUs4' # as test

Enter the YouTube URL: https://youtu.be/4o5hSxvN_-s?si=6ZcSvt69baVOjYBn
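Pasted links come in several shapes, for example the short `youtu.be` share link above (with its `si` tracking parameter) and the long `watch?v=` form. As a rough sketch, assuming only these two URL shapes matter, the video ID can be extracted with the standard library to sanity-check the input before downloading. The helper `extract_video_id` is hypothetical and is not part of pytube.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as long-form YouTube links (an assumption, not an exhaustive list)
LONG_FORM_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if the URL is not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>?si=...
        return parsed.path.lstrip("/") or None
    if host in LONG_FORM_HOSTS:
        # Long links carry the ID in the "v" query parameter
        return parse_qs(parsed.query).get("v", [None])[0]
    return None
```

A `None` result suggests the input is not a link this notebook can handle, so the download step can be skipped early.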


In [73]:
# Set the destination path for the download
destination = "audiofiles/"

file_path = download_youtube_audio(url, destination)

Downloaded and converted to MP3: /content/audiofiles/This is what happens when you reply to spam email l TED.mp3


### Converting the audio into text using **Whisper**

In [74]:
def write_text_to_file(text, filename="transcribed_text.txt"):
    # Write the text to the file as UTF-8 so non-ASCII characters are preserved
    with open(filename, "w", encoding="utf-8") as file:
        file.write(text)

In [75]:
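# Load the local "base" Whisper model (whisper was imported earlier)
model = whisper.load_model("base")

# Transcribe the downloaded MP3; the full text is stored under the "text" key
result = model.transcribe(file_path)

transcription = result["text"]
print(transcription)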

 I Three years ago I got one of those spam emails and It managed to get through my spam filter not quite sure how but it's telling me my inbox and it was from a guy called Solomon Odonka I know And one like this said hello James Veech. I have an interesting business proposal. I want to share with you Solomon Now my hand was kind of hovering on the delete button. I see why I was looking at my phone. I thought I could just delete this Or I could do what I think we've all Always wanted to do And I said Solomon your email intrigues me And the game was a foot He said dear James Veech we shall be shipping gold to you You will earn 10% of any gold you distribute So I knew I was dealing with a professional I said how much is it worth He said we will start with a smaller quantity. I was like oh, and then he said of 25 kilograms The worst should be about 2.5 million I just saw him and if you're gonna do it, let's go big I can handle it How much gold do you have He said it's not a matter of how m

In [76]:
# Save the transcription to transcribed_text.txt
write_text_to_file(transcription)
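Besides `"text"`, the dictionary returned by `model.transcribe` also includes a `"segments"` list whose entries carry `"start"`, `"end"`, and `"text"` fields. The sketch below turns such a list into timestamped lines, which can make a long transcript easier to check against the video. The helper name `format_segments` is hypothetical.

```python
def format_segments(segments):
    """Format Whisper-style segments as '[MM:SS] text' lines."""
    lines = []
    for seg in segments:
        # Segment start times are in seconds (floats); drop the fractional part
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return "\n".join(lines)
```

In this notebook it could be called as `format_segments(result["segments"])`.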

### Converting transcript into summarized text

In [77]:
preprompt = 'You are a model that receives a transcription of a YouTube video. Your task is to correct any words that may be incorrect based on the context, and transform it into a well-structured summary of the entire video. Your summary should highlight important details and provide additional context when necessary. Aim to be detailed, particularly when addressing non-trivial aspects of the content. The summary should encompass at least 20-30% of the original text length.'
# The instructions are sent as the system message, so the user message carries only the transcript
prompt = transcription
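For long videos the transcript may exceed the model's context window. One possible workaround, sketched here under the assumption that a simple word count is an acceptable stand-in for the model's real token count, is to split the transcript into chunks and summarize each chunk separately. The helper `split_into_chunks` and the default of 1500 words are illustrative choices, not values taken from the OpenAI documentation.

```python
def split_into_chunks(text, max_words=1500):
    """Split text into consecutive chunks of at most max_words words each."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

Each chunk could then be sent in its own request, with the partial summaries concatenated or summarized once more.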

In [78]:
from openai import OpenAI
from google.colab import userdata

client = OpenAI(api_key=userdata.get('OPENAI_API_KEY'))

response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": preprompt},
    {"role": "user", "content": prompt},
  ]
)

# Extract the summary text from the first completion choice
summary = response.choices[0].message.content
print(summary)
write_text_to_file(summary, filename='summary.txt')

Three years ago, the creator received a spam email from someone named Solomon Odonka, offering a business proposal involving shipping gold with a 10% distribution earnings. Initially skeptical, the creator decided to engage with Solomon out of curiosity. Their conversation escalated from discussing smaller quantities of gold to trial shipments of 50 kilograms, with Solomon expressing the intent to conduct larger transactions. The creator, posing as a hedge fund executive, continued the correspondence, even creating elaborate codes for security. The interaction with Solomon led the creator to reflect on the consequences of scamming vulnerable individuals and the value of wasting scammers' time. The creator recommended using a pseudonymous email address to avoid inundation with spam emails. They humorously shared experiences of engaging with various scam emails, including one claiming to be from Winnie Mandela seeking assistance in transferring funds. Despite the absurdity of the interac
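The prompt asks for a summary covering roughly 20-30% of the original length, but the model is not guaranteed to comply. A quick way to check is to compare word counts, as in this minimal sketch; the helper `length_ratio` is hypothetical, and word count is only an approximation of "text length".

```python
def length_ratio(summary, original):
    """Return the summary's word count as a fraction of the original's word count."""
    original_words = len(original.split())
    if original_words == 0:
        return 0.0
    return len(summary.split()) / original_words
```

Calling it with the model's summary text and `transcription` gives a fraction that can be compared against the 0.2 to 0.3 target.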