
# audio-summarize

An audio summarizer that glues together [faster-whisper](https://github.com/SYSTRAN/faster-whisper) and [BART](https://huggingface.co/facebook/bart-large-cnn).

## Supported Languages

Only English summarization is supported.

## Dependencies

- Python 3 (tested: 3.12)

## Setup

Create a Python virtual environment, activate it, and install the required packages:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
```

## Run

1. In your terminal, make sure your Python virtual environment is activated
2. Run `audio-summarize.py`

### Usage

```
./audio-summarize.py -i filepath -o filepath [-m name]
                   [--summin n] [--summax n] [--segmax n]

options:
  -h, --help   show this help message and exit
  --summin n   The minimum length of a segment summary [10] (min: 5)
  --summax n   The maximum length of a segment summary [90] (min: 5)
  --segmax n   The maximum number of tokens per segment [375] (5 - 500)
  -m name      The name of the whisper model to be used [small.en]
  -i filepath  The path to the media file
  -o filepath  Where to save the output text to
```

Example:

```bash
./audio-summarize.py -i ./tmp/test.webm -o ./tmp/output.txt
```
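
For readers who want to see how the option ranges in the usage text could be enforced, here is a minimal sketch using `argparse` from the standard library. It is not taken from `audio-summarize.py`: the helper `bounded_int`, the function `build_parser`, and the attribute names `model`, `input` and `output` are assumptions made for illustration.

```python
import argparse


def bounded_int(low, high=None):
    """Return an argparse type that accepts integers in [low, high].

    Hypothetical helper: the real script may validate its options differently.
    """
    def parse(value):
        try:
            number = int(value)
        except ValueError:
            raise argparse.ArgumentTypeError(f"{value!r} is not an integer")
        if number < low or (high is not None and number > high):
            upper = "" if high is None else f" and at most {high}"
            raise argparse.ArgumentTypeError(f"must be at least {low}{upper}")
        return number
    return parse


def build_parser():
    """Build a parser that mirrors the options listed in the usage text."""
    parser = argparse.ArgumentParser(prog="audio-summarize.py")
    parser.add_argument("--summin", metavar="n", type=bounded_int(5), default=10,
                        help="The minimum length of a segment summary [10] (min: 5)")
    parser.add_argument("--summax", metavar="n", type=bounded_int(5), default=90,
                        help="The maximum length of a segment summary [90] (min: 5)")
    parser.add_argument("--segmax", metavar="n", type=bounded_int(5, 500), default=375,
                        help="The maximum number of tokens per segment [375] (5 - 500)")
    parser.add_argument("-m", metavar="name", dest="model", default="small.en",
                        help="The name of the whisper model to be used [small.en]")
    parser.add_argument("-i", metavar="filepath", dest="input", required=True,
                        help="The path to the media file")
    parser.add_argument("-o", metavar="filepath", dest="output", required=True,
                        help="Where to save the output text to")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this sketch, an out-of-range value such as `--segmax 501` makes `argparse` print an error and exit instead of passing the value on to the later steps.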

## How does it work?

To summarize a media file, the program executes the following steps:

1. Convert and transcribe the media file with [faster-whisper](https://github.com/SYSTRAN/faster-whisper), which uses [ffmpeg](https://www.ffmpeg.org/) and [ctranslate2](https://github.com/OpenNMT/CTranslate2/) under the hood
2. Split the transcript into semantically coherent segments using [semantic-text-splitter](https://github.com/benbrandt/text-splitter) and the BART tokenizer
3. Summarize each segment using BART ([`facebook/bart-large-cnn`](https://huggingface.co/facebook/bart-large-cnn))
4. Write the results to a text file
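
The four steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `audio-summarize.py`: all function names are hypothetical, and step 2 uses a simplified greedy sentence packer with a whitespace token count as a stand-in for semantic-text-splitter and the BART tokenizer. Steps 1 and 3 assume the `faster-whisper` and `transformers` packages are installed, and that `--summin`/`--summax` correspond to the `min_length`/`max_length` arguments of the summarization pipeline.

```python
import re
from typing import Callable, List


def transcribe(path: str, model_name: str = "small.en") -> str:
    """Step 1: transcribe a media file with faster-whisper."""
    from faster_whisper import WhisperModel  # lazy import: heavy dependency

    model = WhisperModel(model_name)
    segments, _info = model.transcribe(path, language="en")
    return " ".join(segment.text.strip() for segment in segments)


def split_by_budget(
    text: str,
    max_tokens: int = 375,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Step 2 (stand-in): greedily pack whole sentences into segments.

    Assumption: tokens are counted by whitespace splitting, not by the BART
    tokenizer. A single sentence longer than the budget is kept as its own
    oversized segment rather than being cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


def summarize(segments: List[str], min_length: int = 10, max_length: int = 90) -> List[str]:
    """Step 3: summarize each segment with facebook/bart-large-cnn."""
    from transformers import pipeline  # lazy import: heavy dependency

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return [
        summarizer(segment, min_length=min_length, max_length=max_length)[0]["summary_text"]
        for segment in segments
    ]


def run(
    in_path: str,
    out_path: str,
    transcribe_fn: Callable[[str], str] = transcribe,
    summarize_fn: Callable[[List[str]], List[str]] = summarize,
    max_tokens: int = 375,
) -> List[str]:
    """Steps 1 to 4: transcribe, split, summarize, and write a text file."""
    text = transcribe_fn(in_path)
    segments = split_by_budget(text, max_tokens)
    summaries = summarize_fn(segments)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write("\n\n".join(summaries) + "\n")
    return summaries
```

Passing the transcription and summarization stages as parameters makes it possible to try the splitting and file-writing logic with small stub functions, without downloading any models.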