Compare commits


20 commits
v2.1 ... main

Author SHA1 Message Date
74e59a9fd4 Fixed a small typo in the README 2023-05-30 23:12:32 +02:00
2a2241ce08 Redesigned the chat history, renamed the vicuna-v0 and vicuna-v1.1 profiles, and updated the screenshot 2023-05-30 23:10:36 +02:00
f4abe93735 Added a profile for Manticore Chat 2023-05-30 20:54:27 +02:00
faed129586 Added a profile for Vicuna v1.1 2023-05-30 19:13:21 +02:00
abb8054892 Added an empty profile 2023-05-30 18:57:19 +02:00
de194bead6 Improved the vicuna-v0 profile 2023-05-30 18:53:24 +02:00
2a46750ee9 Updated README 2023-05-30 12:10:54 +02:00
ae0058bdee Improved profiles by adding 'separator' field to the profile format, improved vicuna-v0 profile, removed default profile from frontend-server cli, updated README 2023-05-30 10:55:31 +02:00
bd44e45801 Merge pull request #12 from ChaoticByte/dependabot/pip/llama-cpp-python-server--0.1.56: Bump llama-cpp-python[server] from 0.1.54 to 0.1.56 2023-05-30 10:20:58 +02:00
dependabot[bot] 5cfa6a7b0a Bump llama-cpp-python[server] from 0.1.54 to 0.1.56 2023-05-30 07:53:12 +00:00
Bumps [llama-cpp-python[server]](https://github.com/abetlen/llama-cpp-python) from 0.1.54 to 0.1.56.
- [Release notes](https://github.com/abetlen/llama-cpp-python/releases)
- [Changelog](https://github.com/abetlen/llama-cpp-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/abetlen/llama-cpp-python/compare/v0.1.54...v0.1.56)

---
updated-dependencies:
- dependency-name: llama-cpp-python[server]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
8c29a31598 Updated section about memory/disk requirements in the README 2023-05-25 21:26:00 +02:00
345d0cfc5c Moved import of create_app in api-server.py to the top 2023-05-25 21:18:29 +02:00
ea2f59f94e Preserve whitespaces in messages by using pre-wrap, fixes #10 2023-05-25 19:55:57 +02:00
060d522f6c Merge pull request #7 from ChaoticByte/dependabot/pip/llama-cpp-python-server--0.1.54: Bump llama-cpp-python[server] from 0.1.50 to 0.1.54 (requires re-quantized models using ggml v3) 2023-05-24 21:06:08 +02:00
dependabot[bot] 1718520de9 Bump llama-cpp-python[server] from 0.1.50 to 0.1.54 2023-05-24 18:01:24 +00:00
Bumps [llama-cpp-python[server]](https://github.com/abetlen/llama-cpp-python) from 0.1.50 to 0.1.54.
- [Release notes](https://github.com/abetlen/llama-cpp-python/releases)
- [Commits](https://github.com/abetlen/llama-cpp-python/compare/v0.1.50...v0.1.54)

---
updated-dependencies:
- dependency-name: llama-cpp-python[server]
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
22ba6239c7 Added a toggle button for the sidebar, implemented a responsive design (fixes #4), and made further minor improvements to the frontend 2023-05-19 00:19:27 +02:00
63706a3c64 Updated the README, including the screenshot 2023-05-18 16:39:20 +02:00
43fbe364fb Added a profile file for the Vicuna model #5 2023-05-18 16:18:24 +02:00
7590c31f89 Made the frontend more flexible to also support other models than just Koala 2023-05-18 15:54:41 +02:00
c3fda61b21 Made the frontend more flexible to also support other models than just Koala 2023-05-18 15:34:34 +02:00
13 changed files with 289 additions and 133 deletions

README.md

@@ -1,6 +1,6 @@
# Eucalyptus Chat
A frontend for [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/) running on CPU with [llama.cpp](https://github.com/ggerganov/llama.cpp), using the API server library provided by [llama-cpp-python](https://github.com/abetlen/llama-cpp-python).
A frontend for large language models like [🐨 Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/) or [🦙 Vicuna](https://lmsys.org/blog/2023-03-30-vicuna/) running on CPU with [llama.cpp](https://github.com/ggerganov/llama.cpp), using the API server library provided by [llama-cpp-python](https://github.com/abetlen/llama-cpp-python).
![](misc/screenshot.png)
@@ -8,22 +8,33 @@ A frontend for [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/) running
- Python 3.10
- The pip packages listed in `requirements.txt`
- A Koala model in the ggml format (should be quantized)
- An AI model in the ggml format (should be quantized)
The 7B-Model, `q4_0`-quantized, requires approx. 5 GB of RAM.
For memory and disk requirements for the different models, see [llama.cpp - Memory/Disk Requirements](https://github.com/ggerganov/llama.cpp#memorydisk-requirements)
## Supported Models
- [🐨 Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)
- [🦙 Vicuna v.0](https://lmsys.org/blog/2023-03-30-vicuna/)
- [🦙 Vicuna v.1.1](https://lmsys.org/blog/2023-03-30-vicuna/)
- [🦁 Manticore Chat](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg)
(see `./profiles/`)
## Usage
To use Eucalyptus locally, start both the API-Server (`api-server.py`) and the Frontend-Server (`frontend-server.py`).
The default URL of the Frontend-Server is http://localhost:8080.
You have to choose the correct profile for the model you use. See [Supported Models](#supported-models) and [Frontend Server CLI Arguments](#frontend-server-cli-arguments).
### API Server CLI Arguments
The following command-line arguments are available:
* `-m` or `--model`: Specifies the path to the model file. This argument is required.
* `--host`: Specifies the address to listen on. By default, it listens on localhost.
* `--port`: Specifies the port number to listen on. The default value is 7331.
* `--host`: Specifies the address to listen on. By default, it listens on `localhost`.
* `--port`: Specifies the port number to listen on. The default value is `7331`.
```bash
python3 api-server.py [-h] -m MODEL [--host HOST] [--port PORT]
@@ -33,12 +44,13 @@ python3 api-server.py [-h] -m MODEL [--host HOST] [--port PORT]
The following command-line options are available:
* `--host`: Specifies the IP address or hostname to listen on. Defaults to "localhost".
* `--port`: Specifies the port number to listen on. Defaults to 8080.
* `--api`: Specifies the URL of the API server. Defaults to http://localhost:7331.
* `--profile`: Path to the profile file for the model.
* `--host`: Specifies the IP address or hostname to listen on. Defaults to `localhost`.
* `--port`: Specifies the port number to listen on. Defaults to `8080`.
* `--api`: Specifies the URL of the API server. Defaults to `http://localhost:7331`.
```bash
python3 frontend-server.py [-h] [--host HOST] [--port PORT] [--api API]
python3 frontend-server.py [-h] [--profile PROFILE] [--host HOST] [--port PORT] [--api API]
```
## Third-Party Licenses

api-server.py

@@ -4,6 +4,8 @@
from argparse import ArgumentParser
from os import environ
from llama_cpp.server.app import create_app
import uvicorn
if __name__ == "__main__":
@@ -13,10 +15,7 @@ if __name__ == "__main__":
ap.add_argument("--host", help="Address to listen on (default: localhost)", type=str, default="localhost")
ap.add_argument("--port", help="Port to listen on (default: 7331)", type=int, default=7331)
args = ap.parse_args()
# Set environment variable before importing api server
environ["MODEL"] = args.model
# Import api server
from llama_cpp.server.app import create_app
# Run
app = create_app()
uvicorn.run(app, host=args.host, port=args.port)
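
With this change, `create_app` is imported at the top of the file and the `MODEL` environment variable is set from the CLI argument before the app is created. A minimal sketch of starting the server after this change; the model path is a placeholder:

```bash
# Start the API server on the default localhost:7331.
# The model path is hypothetical; any quantized ggml model works
# (0.1.54+ expects models re-quantized in the ggml v3 format, see PR #7).
python3 api-server.py -m ./models/koala-7b-q4_0.ggml.bin
```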

frontend-server.py

@@ -2,6 +2,8 @@
# Copyright (c) 2023 Julian Müller (ChaoticByte)
from argparse import ArgumentParser
from json import load
from pathlib import Path
import uvicorn
from frontend.app import app
@@ -9,11 +11,31 @@ from frontend.app import app
if __name__ == "__main__":
# CLI
ap = ArgumentParser()
ap.add_argument("--profile", help="Path to a profile file that includes settings for a specific model", type=Path, required=True)
ap.add_argument("--host", help="Address to listen on (default: localhost)", type=str, default="localhost")
ap.add_argument("--port", help="Port to listen on (default: 8080)", type=int, default=8080)
ap.add_argument("--api", help="URL of the API Server (default: 'http://localhost:7331')", type=str, default="http://localhost:7331")
args = ap.parse_args()
# Read profile
with args.profile.open("r") as pf:
profile = load(pf)
# Check profile
assert "name" in profile
assert "conversation_prefix" in profile
assert "user_keyword" in profile
assert "assistant_keyword" in profile
assert "stop_sequences" in profile
# Pass frontend config to the app
app.config.frontend_config = {"api_url": args.api.rstrip("/")}
app.config.frontend_config = {
"api_url": args.api.rstrip("/"),
"profile": {
"name": profile["name"],
"conversation_prefix": profile["conversation_prefix"],
"user_keyword": profile["user_keyword"],
"assistant_keyword": profile["assistant_keyword"],
"separator": profile["separator"],
"stop_sequences": profile["stop_sequences"]
}
}
# Run
uvicorn.run(app, host=args.host, port=args.port)
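
Because `--profile` is now a required argument, the frontend no longer starts without one. A minimal sketch of launching it against a local API server, using the bundled Koala profile added in this compare:

```bash
# Serve the frontend on the default localhost:8080 and point it
# at the API server above (the default API URL is shown explicitly).
python3 frontend-server.py --profile profiles/koala.json --api http://localhost:7331
```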

Frontend HTML

@@ -18,9 +18,20 @@
</button>
</div>
</div>
<div class="sidepanel flex flex-column">
<div class="flex sidebar-container sidebar-hidden" id="sidebar-container">
<button id="sidebar-toggle-open" class="icon-button">
<svg xmlns="http://www.w3.org/2000/svg" height="48" viewBox="0 96 960 960" width="48"><path d="M433 712V440L297 576l136 136ZM180 936q-24.75 0-42.375-17.625T120 876V276q0-24.75 17.625-42.375T180 216h600q24.75 0 42.375 17.625T840 276v600q0 24.75-17.625 42.375T780 936H180Zm393-60V276H180v600h393Z"/></svg>
</button>
<button id="sidebar-toggle-close" class="icon-button hidden">
<svg xmlns="http://www.w3.org/2000/svg" height="48" viewBox="0 96 960 960" width="48"><path d="M297 440v272l136-136-136-136ZM180 936q-24.75 0-42.375-17.625T120 876V276q0-24.75 17.625-42.375T180 216h600q24.75 0 42.375 17.625T840 276v600q0 24.75-17.625 42.375T780 936H180Zm393-60V276H180v600h393Z"/></svg>
</button>
<div class="sidebar flex flex-column">
<div class="max-width">Settings</div>
<div class="settings flex flex-column">
<div class="setting flex">
<div>Assistant</div>
<div id="settings-label-assistant"></div>
</div>
<div class="setting flex">
<div>max_tokens</div>
<div><input type="number" id="settings-max-tokens" min="16" value="100"></div>
@@ -59,6 +70,7 @@
</button>
</div>
</div>
</div>
<script src="/ui/main.js"></script>
</body>
</html>

main.js (frontend)

@@ -1,27 +1,22 @@
// Copyright (c) 2023 Julian Müller (ChaoticByte)
(() => {
const isMobile = /Android|BlackBerry|iPhone|iPod|Opera Mini/i.test(navigator.userAgent);
// Koala specific keywords
const conversationBeginning = "BEGINNING OF CONVERSATION:";
const userKeyword = " USER: ";
const assistantKeyword = " GPT:";
const koalaStopSequence = "</s>";
// Fetch configuration and initialize Eucalyptus Chat Frontend
// Get frontend config
let frontend_config = null;
fetch("/config")
.then(r => {
fetch("/config")
.then(r => {
return r.json();
})
.then(j => {
frontend_config = j;
});
}).then(frontend_config => {
// Message Context
let conversation = [conversationBeginning];
let conversation = [frontend_config.profile.conversation_prefix];
// Elements - Sidebar
const sidebarOpenButton = document.getElementById("sidebar-toggle-open");
const sidebarCloseButton = document.getElementById("sidebar-toggle-close");
const sidebarContainer = document.getElementById("sidebar-container");
const settingsLabelAssistantNameElement = document.getElementById("settings-label-assistant");
const settingsMaxTokensElement = document.getElementById("settings-max-tokens");
const settingsTemperatureElement = document.getElementById("settings-temperature");
const settingsTopPElement = document.getElementById("settings-top-p");
@@ -29,20 +24,22 @@
const settingsRepeatPenaltyElement = document.getElementById("settings-repeat-penalty");
const settingsPresencePenaltyElement = document.getElementById("settings-presence-penalty");
const settingsFrequencyPenaltyElement = document.getElementById("settings-frequency-penalty");
const resetSettingsButtonElement = document.getElementById("reset-settings-btn");
const resetHistoryButtonElement = document.getElementById("reset-history-btn");
const resetSettingsButton = document.getElementById("reset-settings-btn");
const resetHistoryButton = document.getElementById("reset-history-btn");
settingsLabelAssistantNameElement.innerText = frontend_config.profile.name;
// Elements - Main
const messageHistoryContainer = document.getElementById("messages");
const textInputElement = document.getElementById("text-input");
const sendButtonElement = document.getElementById("send-btn");
const sendButton = document.getElementById("send-btn");
// API requests
async function apiCompletion(prompt, settings) {
const bodyData = JSON.stringify({
"prompt": prompt,
"stop": [koalaStopSequence],
"stop": frontend_config.profile.stop_sequences,
"max_tokens": settings.max_tokens,
"temperature": settings.temperature,
"top_p": settings.top_p,
@@ -99,33 +96,36 @@
// Chat
const MessageType = {
// Message Roles
const Roles = {
USER: {
name: "User",
class: "message-bg-user"
},
ASSISTANT: {
name: "Koala",
name: frontend_config.profile.name,
class: "message-bg-assistant"
}
}
function addMessage(message, type) {
if (type == MessageType.USER) {
conversation.push(userKeyword + message + assistantKeyword);
function addMessage(message, role) {
if (role == Roles.USER) {
conversation.push(
frontend_config.profile.user_keyword + " "
+ message + frontend_config.profile.separator + frontend_config.profile.assistant_keyword);
}
else { conversation.push(message); }
else { conversation.push(message + frontend_config.profile.separator); }
// UI
let messageTypeElem = document.createElement("div");
messageTypeElem.classList.add("message-type");
messageTypeElem.innerText = type.name;
let messageRoleElem = document.createElement("div");
messageRoleElem.classList.add("message-type");
messageRoleElem.innerText = role.name;
let messageTextElem = document.createElement("div");
messageTextElem.classList.add("message-text");
messageTextElem.innerText = message;
messageTextElem.innerText = message.trim();
let messageElem = document.createElement("div");
messageElem.classList.add("message");
messageElem.classList.add(type.class);
messageElem.appendChild(messageTypeElem);
messageElem.classList.add(role.class);
messageElem.appendChild(messageRoleElem);
messageElem.appendChild(messageTextElem);
messageHistoryContainer.appendChild(messageElem);
messageHistoryContainer.scrollTo(0, messageHistoryContainer.scrollHeight);
@@ -146,9 +146,9 @@
settingsRepeatPenaltyElement.disabled = true;
settingsPresencePenaltyElement.disabled = true;
settingsFrequencyPenaltyElement.disabled = true;
resetSettingsButtonElement.disabled = true;
resetHistoryButtonElement.disabled = true;
sendButtonElement.disabled = true;
resetSettingsButton.disabled = true;
resetHistoryButton.disabled = true;
sendButton.disabled = true;
textInputElement.disabled = true;
}
@@ -160,19 +160,15 @@
settingsRepeatPenaltyElement.disabled = false;
settingsPresencePenaltyElement.disabled = false;
settingsFrequencyPenaltyElement.disabled = false;
resetSettingsButtonElement.disabled = false;
resetHistoryButtonElement.disabled = false;
sendButtonElement.disabled = false;
resetSettingsButton.disabled = false;
resetHistoryButton.disabled = false;
sendButton.disabled = false;
textInputElement.disabled = false;
// focus text input
textInputElement.focus();
if (!isMobile) textInputElement.focus();
}
async function chat() {
if (frontend_config == null) {
console.log("Couldn't fetch frontend configuration.");
}
else {
disableInput();
let input = textInputElement.value.trim();
if (input == "") {
@@ -181,27 +177,44 @@
else {
textInputElement.value = "";
resizeInputElement();
addMessage(input, MessageType.USER);
addMessage(input, Roles.USER);
let prompt = conversation.join("");
let settings = getSettings();
apiCompletion(prompt, settings).then(r => {
addMessage(r, MessageType.ASSISTANT);
addMessage(r.trim(), Roles.ASSISTANT);
enableInput();
});
}
}
}
function resetHistory() {
conversation = [conversationBeginning];
conversation = [frontend_config.profile.conversation_prefix];
messageHistoryContainer.innerText = "";
}
// Sidebar
function toggleSidebar() {
if (sidebarContainer.classList.contains("sidebar-hidden")) {
sidebarCloseButton.classList.remove("hidden");
sidebarOpenButton.classList.add("hidden");
sidebarContainer.classList.remove("sidebar-hidden");
}
else {
sidebarOpenButton.classList.remove("hidden");
sidebarCloseButton.classList.add("hidden");
sidebarContainer.classList.add("sidebar-hidden");
}
}
// Event Listeners
resetSettingsButtonElement.addEventListener("click", resetSettings);
resetHistoryButtonElement.addEventListener("click", resetHistory);
sendButtonElement.addEventListener("click", chat);
sidebarOpenButton.addEventListener("click", toggleSidebar);
sidebarCloseButton.addEventListener("click", toggleSidebar);
resetSettingsButton.addEventListener("click", resetSettings);
resetHistoryButton.addEventListener("click", resetHistory);
sendButton.addEventListener("click", chat);
textInputElement.addEventListener("keypress", e => {
// Send via Ctrl+Enter
@@ -213,4 +226,4 @@
textInputElement.addEventListener("input", resizeInputElement);
resizeInputElement();
})();
});
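
For orientation, the `/config` payload the rewritten script consumes mirrors `app.config.frontend_config` from frontend-server.py above; with the bundled Koala profile (shown under profiles/ below) the frontend receives roughly:

```json
{
  "api_url": "http://localhost:7331",
  "profile": {
    "name": "Koala",
    "conversation_prefix": "BEGINNING OF CONVERSATION: ",
    "user_keyword": "USER:",
    "assistant_keyword": "GPT:",
    "separator": " ",
    "stop_sequences": ["</s>"]
  }
}
```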

Frontend stylesheet

@@ -5,11 +5,13 @@
--background2: #303030;
--background3: #161616;
--background4: #131313;
--background5: #1a1a1a;
--button-bg: #3b3b3b;
--button-bg2: #4f4f4f;
--icon-button-fill: #ffffff;
--send-icon-button-fill: #29c76d;
--color: #fafafa;
--color2: #bbbbbb;
--border-radius: .5rem;
}
@@ -27,11 +29,23 @@ input[type="number"] {
width: 4rem;
}
.sidepanel {
.sidebar-container {
padding: .5rem;
height: 100%;
background-color: var(--background5);
box-sizing: border-box;
}
.sidebar-container.sidebar-hidden > .sidebar {
display: none;
}
.sidebar {
margin-left: .5rem;
margin-right: .5rem;
margin-top: .4rem;
gap: .5rem;
align-items: flex-end;
padding: 1rem;
padding-left: 0;
min-width: fit-content;
}
@@ -56,19 +70,38 @@ input[type="number"] {
}
.messages {
gap: 1.1rem;
gap: 1rem;
margin-bottom: 1rem;
overflow-y: scroll;
max-height: 89vh;
align-items: center;
flex-grow: 2;
}
.message {
display: flex;
flex-direction: row;
gap: .5rem;
padding: .5rem;
flex-direction: column;
flex-wrap: wrap;
gap: 1rem;
}
.message-type {
color: var(--color2);
text-align: center;
}
.message-text {
white-space: pre-wrap;
padding: .5rem .8rem;
border-radius: var(--border-radius);
max-width: fit-content;
}
.message-bg-assistant > .message-text {
background: var(--background2);
}
.message-bg-user > .message-text {
background: var(--background3);
}
button {
@@ -101,19 +134,6 @@ button:hover {
width: 100%;
}
.message-bg-assistant {
background: var(--background2);
}
.message-bg-user {
background: var(--background3);
}
.message-type {
min-width: 3.5rem;
padding-left: .1rem;
}
.input-container {
margin-top: auto;
flex-direction: row;
@@ -135,6 +155,8 @@ button:hover {
}
.icon-button {
width: fit-content;
height: fit-content;
padding: .2rem;
display: flex;
align-items: center;
@@ -160,3 +182,39 @@
height: 2.2rem;
fill: var(--send-icon-button-fill);
}
.hidden {
display: none;
}
@media only screen and (max-width: 600px) {
body {
height: 100dvh;
}
.sidebar-container {
position: absolute;
width: 100%;
height: 100%;
z-index: 1;
}
.sidebar-container.sidebar-hidden {
position: absolute;
width: unset;
height: unset;
left: auto;
right: 0;
background: transparent;
}
.sidebar-container > #sidebar-toggle-close {
margin-right: auto;
}
.sidebar {
margin-right: auto;
padding-right: 2.5rem;
}
}

misc/screenshot.png (binary file not shown; 94 KiB before, 129 KiB after)

profiles/empty.json (new file, 8 lines)

@@ -0,0 +1,8 @@
{
"name": "None",
"conversation_prefix": "",
"user_keyword": "",
"assistant_keyword": "",
"separator": "",
"stop_sequences": []
}

profiles/koala.json (new file, 8 lines)

@@ -0,0 +1,8 @@
{
"name": "Koala",
"conversation_prefix": "BEGINNING OF CONVERSATION: ",
"user_keyword": "USER:",
"assistant_keyword": "GPT:",
"separator": " ",
"stop_sequences": ["</s>"]
}

Manticore Chat profile (new file under profiles/)

@@ -0,0 +1,8 @@
{
"name": "Manticore",
"conversation_prefix": "",
"user_keyword": "USER:",
"assistant_keyword": "ASSISTANT:",
"separator": "\n",
"stop_sequences": ["</s>", "<unk>", "### USER:", "USER:"]
}
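
The `stop_sequences` array is forwarded verbatim as the `"stop"` field of the completion request (see `apiCompletion` in main.js above). A hedged curl sketch against the API server, assuming llama-cpp-python's OpenAI-compatible completions route and the default port from api-server.py:

```bash
# Hypothetical request using the Manticore profile's prompt format
# (empty conversation_prefix, "\n" separator) and its stop sequences.
curl -s http://localhost:7331/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "prompt": "USER: Hello!\nASSISTANT:",
    "stop": ["</s>", "<unk>", "### USER:", "USER:"],
    "max_tokens": 100
  }'
```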

profiles/vicuna-v0.json (new file, 8 lines)

@@ -0,0 +1,8 @@
{
"name": "Vicuna v0",
"conversation_prefix": "A chat between a curious human and a helpful AI assistant.\n\n",
"user_keyword": "### Human:",
"assistant_keyword": "### Assistant:",
"separator": "\n",
"stop_sequences": ["### Human:"]
}

Vicuna v1.1 profile (new file under profiles/)

@@ -0,0 +1,8 @@
{
"name": "Vicuna v1.1",
"conversation_prefix": "A chat between a curious user and a helpful AI assistant.\n\n",
"user_keyword": "USER:",
"assistant_keyword": "ASSISTANT:",
"separator": "\n",
"stop_sequences": ["</s>"]
}
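
To make the profile format concrete, here is a short Python sketch (not part of the repository, which does this in JavaScript) of how `addMessage` in main.js assembles a completion prompt from these fields; the profile file name is assumed:

```python
# Sketch: rebuild the prompt the way main.js does, from a profile file.
import json

with open("profiles/vicuna-v1.1.json") as f:  # assumed file name
    profile = json.load(f)

# The conversation starts with the profile's prefix ...
conversation = [profile["conversation_prefix"]]

# ... and a user turn appends keyword, message, separator, and the
# assistant keyword so the model completes as the assistant.
message = "What does llama.cpp do?"
conversation.append(
    profile["user_keyword"] + " " + message
    + profile["separator"] + profile["assistant_keyword"]
)

prompt = "".join(conversation)
print(prompt)
# A chat between a curious user and a helpful AI assistant.
#
# USER: What does llama.cpp do?
# ASSISTANT:
```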

requirements.txt

@@ -1,3 +1,3 @@
llama-cpp-python[server]==0.1.50
llama-cpp-python[server]==0.1.56
uvicorn==0.22.0
sanic==23.3.0
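
Upgrading an existing checkout therefore takes two steps: reinstall the pinned dependencies, and, per PR #7 above, replace older ggml models with re-quantized ggml v3 ones.

```bash
# Reinstall pinned dependencies after pulling this compare's changes.
pip install --upgrade -r requirements.txt
```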