Add more information to the README
parent d610db1dd4
commit 4ab43594de
1 changed file with 15 additions and 3 deletions
16
README.md
@@ -7,7 +7,8 @@ An audio summarizer that glues together ffmpeg, whisper.cpp and BART.
 - Python 3 (tested: 3.12)
 - ffmpeg
 - git
-- make & c/c++ compiler
+- make
+- c/c++ compiler (on Ubuntu, installing `build-essential` does the trick)

 ## Setup

@@ -33,7 +34,7 @@ Run setup.sh
 ### Usage

 ```
-audio-summarize.py -m filepath -i filepath -o filepath
+./audio-summarize.py -m filepath -i filepath -o filepath
 [--summin n] [--summax n] [--segmax n]

 options:
@@ -51,3 +52,14 @@ Example:
 ```bash
 ./audio-summarize.py -m ./tmp/whisper_ggml-small.en-q5_1.bin -i ./tmp/test.webm -o ./tmp/output.txt
 ```
+
+## How does it work?
+
+To summarize a media file, the program executes the following steps:
+
+1. Convert the media file with [ffmpeg](https://www.ffmpeg.org/) to a mono 16kHz 16bit-PCM wav file
+2. Transcribe that wav file using [whisper.cpp](https://github.com/ggerganov/whisper.cpp)
+3. Clean up the transcript (newlines, whitespaces at the beginning and end)
+4. Semantically split up the transcript into segments using [semantic-text-splitter](https://github.com/benbrandt/text-splitter) and the tokenizer for BART
+5. Summarize each segment using BART ([`facebook/bart-large-cnn`](https://huggingface.co/facebook/bart-large-cnn))
+6. Write the results to a text file
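
Steps 1 and 3 of the added "How does it work?" section can be illustrated with a short Python sketch. This is not the code in `audio-summarize.py`: the function names `ffmpeg_command` and `clean_transcript` and the output path `./tmp/test.wav` are hypothetical, and the exact cleanup rules are an assumption based only on the wording "newlines, whitespaces at the beginning and end". The ffmpeg options shown are the standard ones for producing the mono 16 kHz 16-bit PCM wav input that whisper.cpp expects.

```python
import shlex


def ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build the ffmpeg call for step 1: mono, 16 kHz, 16-bit PCM wav.

    Hypothetical helper; the real script may assemble its command differently.
    """
    return [
        "ffmpeg",
        "-i", src,            # input media file
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to one channel (mono)
        "-c:a", "pcm_s16le",  # 16-bit signed little-endian PCM
        dst,                  # output wav file
    ]


def clean_transcript(raw: str) -> str:
    """Step 3 (assumed behavior): drop newlines and trim surrounding whitespace."""
    lines = (line.strip() for line in raw.splitlines())
    # Skip empty lines so that blank lines do not turn into double spaces.
    return " ".join(line for line in lines if line)


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(shlex.join(ffmpeg_command("./tmp/test.webm", "./tmp/test.wav")))
    print(clean_transcript("  Hello there.\n\n General Kenobi. \n"))
```

To actually run the conversion, the list returned by `ffmpeg_command` could be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.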