Transcripts of the Spring 2023 IOB seminars at UGA, auto-generated with the OpenAI Whisper large model.
Description
Transcripts of all IOB Spring seminars as of April 2023, in several file formats, are in the out folder of the GitHub repo.
The scripts used to generate the transcripts are in the scripts folder. I first used yt_dlp.sh to download the YouTube videos and convert them to mp3 format (thanks, yt-dlp):
yt-dlp -x --audio-format mp3 -o '%(title)s.%(ext)s' {youtube link}
Then I used prep_whisper_job.py to generate a whisper command for each audio file. It writes cmd.sh, which I used to submit many jobs that run whisper on UGA’s Sapelo2 cluster:
whisper --model large -o out -- './{filename}';
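The command-generation step can be sketched roughly as follows. This is a minimal illustration, not the actual prep_whisper_job.py: the function names `build_whisper_commands` and `write_cmd_file` are hypothetical, and the sketch assumes the mp3 files sit in a single directory. Only the whisper flags shown above (`--model`, `-o`, `--`) are used.

```python
import shlex
from pathlib import Path


def build_whisper_commands(audio_dir, out_dir="out", model="large"):
    """Return one whisper command line per .mp3 file in audio_dir.

    Hypothetical helper: the real prep_whisper_job.py may work differently.
    """
    commands = []
    for mp3 in sorted(Path(audio_dir).glob("*.mp3")):
        # shlex.quote protects video titles that contain spaces or quotes.
        commands.append(
            f"whisper --model {model} -o {shlex.quote(out_dir)} "
            f"-- {shlex.quote(str(mp3))};"
        )
    return commands


def write_cmd_file(commands, path="cmd.sh"):
    """Write the commands to a shell file, one command per line."""
    Path(path).write_text("\n".join(commands) + "\n", encoding="utf-8")
```

Calling `write_cmd_file(build_whisper_commands("."))` would write a cmd.sh covering every mp3 in the current directory. How those lines are then turned into cluster jobs depends on the scheduler setup and is not covered here.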
Finally, I used prep_jekyll_page.py to generate a markdown file for each transcript so the transcripts can be viewed on this GitHub page.
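One way the page-generation step could look is sketched below. It is an assumption-laden illustration rather than the actual prep_jekyll_page.py: `slugify` and `make_post` are hypothetical names, the `layout: post` value assumes a theme that defines such a layout, and the post date is supplied by the caller because the source does not say where dates come from. It relies only on the standard Jekyll convention of posts named `YYYY-MM-DD-title.md` with YAML front matter.

```python
import re
from pathlib import Path


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def make_post(transcript_path, date, posts_dir="_posts"):
    """Wrap a plain-text transcript in Jekyll front matter and save it as a post.

    Hypothetical helper: `date` must be a YYYY-MM-DD string chosen by the caller.
    """
    transcript_path = Path(transcript_path)
    title = transcript_path.stem
    # Escape backslashes and double quotes so the YAML title stays valid.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    front_matter = (
        "---\n"
        "layout: post\n"  # assumed layout name
        f'title: "{safe_title}"\n'
        f"date: {date}\n"
        "---\n\n"
    )
    body = transcript_path.read_text(encoding="utf-8")
    out_path = Path(posts_dir) / f"{date}-{slugify(title)}.md"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(front_matter + body, encoding="utf-8")
    return out_path
```

For example, `make_post("out/some talk.txt", "2023-04-01")` would create `_posts/2023-04-01-some-talk.md`; both the file name and the date here are placeholders.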
Status
I will update when I want to. Please feel free to use the transcripts for your own purposes, or contact me about more interesting projects.
links: https://y.at/💻🌲🎓🚀🌕
Posts
Sp23-Dr. Yang-Epidemiology, Ecology and Transmission of Middle East Respiratory Syndrome Coronavirus
Sp23 Dr. Robert Edgar - Computational biology, bioinformatics
Sp23 Dr. Japheth Gado - NREL - Machine Learning and Enzyme Engineering
Sp23 - Dr. Sam Shepard - A “Day” in the Life of Virus Surveillance
Sp23 - Dr. Josh Starmer - The Quest of StatQuest!
Sp23 - Dr. Casey Greene - Making Serendipity Routine