Mirror of https://github.com/ianarawjo/ChainForge.git (synced 2025-03-14 08:16:37 +00:00)
Update 0.2.5: Add Chat Turns, LLM Scorers (#109)

* Add LLM scorer node (#107)
* Modularize the LLM list container, extracting it from the prompt node
* Working LLM scorer node
* Bug and minor fixes
* Change modals to use percentage left offsets
* Add inspect-response footer to LLMEvalNode
* Make Play buttons light green
* Fix React errors with keys in JSX arrays
* Add Chat Turn node and support for chat history (#108)
* Add chat_history across the backend's cache and querying mechanisms
* Add Chat Turn nodes, which allow for continuing a conversation
* Add automatic conversion of chat history (in OpenAI format) to Anthropic's and Google PaLM's chat formats; the history is converted to the appropriate format and passed as context in the API call
* Bug fix and error popup when a past conversation is missing in Chat Turn
* Bug squashing to progress in the Chat Turn node
* More bug squashing
* Color false scores bright red in the eval inspector
* Fix tooltip when a continued chat is present
* Rebuild React
* Bug fix in the LLM eval node
* Add HuggingFace chat model support
* Show multiple response objects in the table inspector view
* Fix LLM item deletion bug
* Rebuild React and update the package version
* Fix an obscure bug when LLM outputs have no 'llm' property (due to a prior ChainForge version)
* Fix isLooselyEqual bug
* Update examples so that their cached 'fields' include LLM nicknames
* Rebuild React
* Add Chelse to the README
Parent: a27d092ccc
Commit: f7fab1d76d
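Among the changes above, the chat-history work is the most substantial: per the commit message, an OpenAI-format chat history is cached with each response and automatically converted to Anthropic and Google PaLM chat formats before being passed as context. The sketch below illustrates roughly what such a conversion involves; it is a simplified illustration only, not the code added in this commit, and the helper names and exact output shapes are assumptions.

```javascript
// Rough sketch of converting an OpenAI-style chat history to other providers'
// formats. Illustrative only: the function names and output shapes below are
// assumptions, not ChainForge's actual implementation.

function toAnthropicPrompt(chatHistory) {
  // Anthropic's 2023-era completion API expects alternating
  // "\n\nHuman:" / "\n\nAssistant:" turns in a single prompt string.
  return chatHistory
    .filter((msg) => msg.role !== 'system')
    .map((msg) => (msg.role === 'user' ? '\n\nHuman: ' : '\n\nAssistant: ') + msg.content)
    .join('') + '\n\nAssistant:';
}

function toPaLMChat(chatHistory) {
  // PaLM's chat-bison API takes a context string (from the system message)
  // plus a list of {author, content} messages.
  const system = chatHistory.find((msg) => msg.role === 'system');
  const messages = chatHistory
    .filter((msg) => msg.role !== 'system')
    .map((msg) => ({ author: msg.role === 'user' ? '0' : '1', content: msg.content }));
  return { context: system ? system.content : '', messages: messages };
}

// Example OpenAI-format history, shaped like the chat_history cached by Prompt/Chat Turn nodes:
const history = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What was the first Kirby game?' },
  { role: 'assistant', content: "Kirby's Dream Land, released for the Game Boy in 1992." },
  { role: 'user', content: 'Who developed it?' },
];

console.log(toAnthropicPrompt(history));
console.log(JSON.stringify(toPaLMChat(history), null, 2));
```

Passing the prior turns in the target provider's native format is what lets a Chat Turn node continue a conversation regardless of which supported model produced the earlier replies.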
README.md
@@ -90,7 +90,7 @@ For more specific details, see the [User Guide](https://github.com/ianarawjo/Cha

# Development

- ChainForge was created by [Ian Arawjo](http://ianarawjo.com/index.html), a postdoctoral scholar in Harvard HCI's [Glassman Lab](http://glassmanlab.seas.harvard.edu/) with support from the Harvard HCI community, especially PhD student [Priyan Vaithilingam](https://priyan.info).
+ ChainForge was created by [Ian Arawjo](http://ianarawjo.com/index.html), a postdoctoral scholar in Harvard HCI's [Glassman Lab](http://glassmanlab.seas.harvard.edu/) with support from the Harvard HCI community. Collaborators include PhD students [Priyan Vaithilingam](https://priyan.info) and [Chelse Swoopes](https://seas.harvard.edu/person/chelse-swoopes) and faculty members [Elena Glassman](http://glassmanlab.seas.harvard.edu/glassman.html) and [Martin Wattenberg](https://www.bewitched.com/about.html).

This work was partially funded by the NSF grant IIS-2107391. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

@@ -118,7 +118,7 @@ The BibTeX you can use for now is:

```bibtex
@misc{Arawjo_2023,
-  author = {Arawjo, Ian and Vaithilingam, Priyan and Wattenberg, Martin and Glassman, Elena},
+  author = {Arawjo, Ian and Vaithilingam, Priyan and Swoopes, Chelse and Wattenberg, Martin and Glassman, Elena},
  title = {ChainForge},
  year = {2023},
  howpublished = {\url{https://www.chainforge.ai/}},
@ -3,221 +3,701 @@
|
||||
"nodes": [
|
||||
{
|
||||
"width": 312,
|
||||
"height": 393,
|
||||
"height": 382,
|
||||
"data": {
|
||||
"fields": [
|
||||
{
|
||||
"text": "The first Pokemon game was released in Japan in 1996 for the Game Boy and was called \"Pocket Monsters Red\" and \"Pocket Monsters Green.\" The English version, \"Pokemon Red\" and \"Pokemon Blue,\" was released in 1998 for the Game Boy in North America.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Pokemon game was called \"Pocket Monsters Red\" and \"Pocket Monsters Green\" and was released in Japan in 1996 for the Game Boy console. The first games released in North America were \"Pokemon Red\" and \"Pokemon Blue\" in 1998.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Pokemon game was Pokemon Red and Green, which were released in Japan for the Game Boy in 1996. The first Pokemon game released outside of Japan was Pokemon Red and Blue, which were released for the Game Boy in North America in 1998.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game was released in 1986 for the Nintendo Entertainment System (NES) console. It was simply titled \"The Legend of Zelda\" and was developed and published by Nintendo.",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game was released in 1986 for the Nintendo Entertainment System (NES). It was simply titled \"The Legend of Zelda.\"",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game was released in 1986 for the Nintendo Entertainment System (NES). It was called \"The Legend of Zelda\" and was created by Nintendo's legendary game designer, Shigeru Miyamoto.",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game is called Kirby's Dream Land, and it was released for the Game Boy in 1992.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game is called Kirby's Dream Land. It was released in 1992 for the Game Boy console.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game was called \"Kirby's Dream Land,\" and it was released by Nintendo for the Game Boy in 1992.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros game was titled \"Super Mario Bros.\" It was released for the Nintendo Entertainment System (NES) in 1985.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros game was released in 1985 for the Nintendo Entertainment System (NES). It is a platform video game developed and published by Nintendo.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros game was simply titled \"Super Mario Bros.\" It was released for the Nintendo Entertainment System (NES) in Japan in 1985 and became a worldwide phenomenon.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"formData": {
|
||||
"shortname": "GPT3.5"
|
||||
},
|
||||
"key": "708e22ff-a0ac-4860-9455-eaa84fc614bf",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT3.5"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Pokémon games were Pokémon Red and Pokémon Green, released in Japan in 1996 for the Game Boy. Pokémon Red and Blue were later released in North America in 1998.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Pokémon games were Pokémon Red and Pokémon Green, developed by Game Freak and published by Nintendo. They were released in Japan in 1996 for the Game Boy. Pokémon Red and Green later inspired the international Pokémon Red and Blue versions that were released in 1998.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Pokémon games were Pokémon Red and Pokémon Green, released in Japan on February 27, 1996, for the Game Boy. These games were later released as Pokémon Red and Pokémon Blue in North America and other regions.",
|
||||
"prompt": "What was the first Pokemon game?",
|
||||
"fill_history": {
|
||||
"game": "Pokemon"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game, titled \"The Legend of Zelda,\" was released in Japan on February 21, 1986, for the Nintendo Entertainment System (NES) and was later released in North America on August 22, 1987. The game was developed and published by Nintendo and designed by Shigeru Miyamoto and Takashi Tezuka.",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game, titled \"The Legend of Zelda,\" was released in 1986 for the Nintendo Entertainment System (NES). It was developed and published by Nintendo and created by Shigeru Miyamoto and Takashi Tezuka.",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Legend of Zelda game, titled \"The Legend of Zelda,\" was released on February 21, 1986, for the Nintendo Entertainment System (NES) in Japan. It was developed and published by Nintendo and created by Shigeru Miyamoto and Takashi Tezuka.",
|
||||
"prompt": "What was the first Legend of Zelda game?",
|
||||
"fill_history": {
|
||||
"game": "Legend of Zelda"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game was \"Kirby's Dream Land,\" released on the Nintendo Game Boy in 1992. It introduced the character Kirby and was developed by HAL Laboratory and published by Nintendo.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game was \"Kirby's Dream Land,\" which was released on April 27, 1992, for the Nintendo Game Boy.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Kirby game is titled \"Kirby's Dream Land,\" which was released for the Nintendo Game Boy in 1992. The character Kirby was created by Masahiro Sakurai, and the game was developed by HAL Laboratory.",
|
||||
"prompt": "What was the first Kirby game?",
|
||||
"fill_history": {
|
||||
"game": "Kirby"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros game, titled \"Super Mario Bros.,\" was released in 1985 for the Nintendo Entertainment System (NES). It was created by Shigeru Miyamoto and Takashi Tezuka and developed by Nintendo. The game followed the adventures of Mario (and his brother Luigi in multiplayer mode) as they tried to rescue Princess Toadstool from the villainous Bowser.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros. game was released on September 13, 1985, for the Nintendo Entertainment System (NES). It was developed by Nintendo and created by Shigeru Miyamoto and Takashi Tezuka. This iconic game introduced the world to Mario and his brother Luigi as they navigate through the Mushroom Kingdom to rescue Princess Peach from the antagonist Bowser.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
},
|
||||
{
|
||||
"text": "The first Super Mario Bros. game was released in 1985 for the Nintendo Entertainment System (NES). It was developed and published by Nintendo and created by Shigeru Miyamoto and Takashi Tezuka. The game marked the debut of the now-iconic characters Mario and Luigi, and it is considered one of the most influential and important video games in history.",
|
||||
"prompt": "What was the first Super Mario Bros game?",
|
||||
"fill_history": {
|
||||
"game": "Super Mario Bros"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "4bb76fa3-2e3d-4a98-b8ca-5f05f47cc8e5",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"LLM_0": "GPT4"
|
||||
}
|
||||
@ -279,8 +759,8 @@
|
||||
"type": "prompt"
|
||||
},
|
||||
{
|
||||
"width": 328,
|
||||
"height": 248,
|
||||
"width": 332,
|
||||
"height": 283,
|
||||
"data": {
|
||||
"fields": {
|
||||
"f0": "Pokemon",
|
||||
@ -304,17 +784,26 @@
|
||||
},
|
||||
{
|
||||
"width": 722,
|
||||
"height": 546,
|
||||
"height": 557,
|
||||
"data": {
|
||||
"input": "evalNode-1688054984836",
|
||||
"refresh": false,
|
||||
"selected_vars": [
|
||||
"game"
|
||||
],
|
||||
"selected_vars": [],
|
||||
"vars": [
|
||||
"LLM (default)",
|
||||
{
|
||||
"label": "game",
|
||||
"value": "game"
|
||||
"value": "game",
|
||||
"label": "game"
|
||||
}
|
||||
],
|
||||
"llm_groups": [
|
||||
{
|
||||
"value": "LLM",
|
||||
"label": "LLMs (last)"
|
||||
},
|
||||
{
|
||||
"value": "LLM_0",
|
||||
"label": "LLMs #1"
|
||||
}
|
||||
]
|
||||
},
|
||||
@ -332,8 +821,8 @@
|
||||
"type": "vis"
|
||||
},
|
||||
{
|
||||
"width": 320,
|
||||
"height": 386,
|
||||
"width": 302,
|
||||
"height": 260,
|
||||
"data": {
|
||||
"input": "prompt-1684348622333",
|
||||
"refresh": false
|
||||
@ -354,22 +843,23 @@
|
||||
{
|
||||
"width": 359,
|
||||
"height": 221,
|
||||
"id": "evalNode-1688054984836",
|
||||
"type": "evaluator",
|
||||
"data": {
|
||||
"code": "function evaluate(response) {\n return response.text.length;\n}",
|
||||
"language": "javascript",
|
||||
"code": "function evaluate(response) {\n return response.text.length;\n}"
|
||||
"refresh": false
|
||||
},
|
||||
"dragging": false,
|
||||
"id": "evalNode-1688054984836",
|
||||
"position": {
|
||||
"x": 784,
|
||||
"y": 208
|
||||
},
|
||||
"selected": true,
|
||||
"positionAbsolute": {
|
||||
"x": 784,
|
||||
"y": 208
|
||||
},
|
||||
"dragging": false
|
||||
"selected": true,
|
||||
"type": "evaluator"
|
||||
}
|
||||
],
|
||||
"edges": [
|
||||
@ -429,30 +919,30 @@
|
||||
"targetHandle": "game"
|
||||
},
|
||||
{
|
||||
"id": "reactflow__edge-evalNode-1688054984836output-vis-1684348622333input",
|
||||
"interactionWidth": 100,
|
||||
"markerEnd": {
|
||||
"height": "22px",
|
||||
"type": "arrow",
|
||||
"width": "22px"
|
||||
},
|
||||
"source": "evalNode-1688054984836",
|
||||
"sourceHandle": "output",
|
||||
"target": "vis-1684348622333",
|
||||
"targetHandle": "input",
|
||||
"interactionWidth": 100,
|
||||
"markerEnd": {
|
||||
"type": "arrow",
|
||||
"width": "22px",
|
||||
"height": "22px"
|
||||
},
|
||||
"id": "reactflow__edge-evalNode-1688054984836output-vis-1684348622333input"
|
||||
"targetHandle": "input"
|
||||
},
|
||||
{
|
||||
"id": "reactflow__edge-prompt-1684348622333prompt-evalNode-1688054984836responseBatch",
|
||||
"interactionWidth": 100,
|
||||
"markerEnd": {
|
||||
"height": "22px",
|
||||
"type": "arrow",
|
||||
"width": "22px"
|
||||
},
|
||||
"source": "prompt-1684348622333",
|
||||
"sourceHandle": "prompt",
|
||||
"target": "evalNode-1688054984836",
|
||||
"targetHandle": "responseBatch",
|
||||
"interactionWidth": 100,
|
||||
"markerEnd": {
|
||||
"type": "arrow",
|
||||
"width": "22px",
|
||||
"height": "22px"
|
||||
},
|
||||
"id": "reactflow__edge-prompt-1684348622333prompt-evalNode-1688054984836responseBatch"
|
||||
"targetHandle": "responseBatch"
|
||||
}
|
||||
],
|
||||
"viewport": {
|
||||
@ -1021,7 +1511,11 @@
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
"temp": 1,
|
||||
"progress": {
|
||||
"success": 0,
|
||||
"error": 0
|
||||
}
|
||||
},
|
||||
"prompt-1684348622333_1.json": {
|
||||
"base_model": "gpt-4",
|
||||
@ -1040,7 +1534,11 @@
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
"temp": 1,
|
||||
"progress": {
|
||||
"success": 0,
|
||||
"error": 0
|
||||
}
|
||||
}
|
||||
},
|
||||
"responses_last_run": [
|
||||
@ -1230,13 +1728,6 @@
|
||||
"total_tokens": 190
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 240,
|
||||
"median": 242,
|
||||
"stdev": 7.211102550927978,
|
||||
"range": [
|
||||
232,
|
||||
246
|
||||
],
|
||||
"items": [
|
||||
246,
|
||||
232,
|
||||
@ -1265,13 +1756,6 @@
|
||||
"total_tokens": 143
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 179,
|
||||
"median": 192,
|
||||
"stdev": 34.39476704383968,
|
||||
"range": [
|
||||
140,
|
||||
205
|
||||
],
|
||||
"items": [
|
||||
192,
|
||||
140,
|
||||
@ -1300,13 +1784,6 @@
|
||||
"total_tokens": 102
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 102.33333333333333,
|
||||
"median": 100,
|
||||
"stdev": 7.767453465154029,
|
||||
"range": [
|
||||
96,
|
||||
111
|
||||
],
|
||||
"items": [
|
||||
96,
|
||||
100,
|
||||
@ -1335,13 +1812,6 @@
|
||||
"total_tokens": 126
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 158,
|
||||
"median": 162,
|
||||
"stdev": 25.238858928247925,
|
||||
"range": [
|
||||
131,
|
||||
181
|
||||
],
|
||||
"items": [
|
||||
131,
|
||||
162,
|
||||
@ -1370,13 +1840,6 @@
|
||||
"total_tokens": 163
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 221.33333333333334,
|
||||
"median": 217,
|
||||
"stdev": 53.631458430042095,
|
||||
"range": [
|
||||
170,
|
||||
277
|
||||
],
|
||||
"items": [
|
||||
170,
|
||||
277,
|
||||
@ -1405,13 +1868,6 @@
|
||||
"total_tokens": 208
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 259.3333333333333,
|
||||
"median": 245,
|
||||
"stdev": 46.198845584422706,
|
||||
"range": [
|
||||
222,
|
||||
311
|
||||
],
|
||||
"items": [
|
||||
311,
|
||||
222,
|
||||
@ -1440,13 +1896,6 @@
|
||||
"total_tokens": 140
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 167.33333333333334,
|
||||
"median": 186,
|
||||
"stdev": 49.702447961175245,
|
||||
"range": [
|
||||
111,
|
||||
205
|
||||
],
|
||||
"items": [
|
||||
186,
|
||||
111,
|
||||
@ -1475,13 +1924,6 @@
|
||||
"total_tokens": 256
|
||||
},
|
||||
"eval_res": {
|
||||
"mean": 359.3333333333333,
|
||||
"median": 362,
|
||||
"stdev": 6.429100507328637,
|
||||
"range": [
|
||||
352,
|
||||
364
|
||||
],
|
||||
"items": [
|
||||
364,
|
||||
362,
|
||||
|
File diff suppressed because it is too large
@ -3,14 +3,31 @@
|
||||
"nodes": [
|
||||
{
|
||||
"width": 312,
|
||||
"height": 435,
|
||||
"height": 423,
|
||||
"data": {
|
||||
"fields": [
|
||||
{
|
||||
"text": "4.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 + 2?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 + 2?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"key": "9fbdcc67-7c5f-4cd5-a2e4-6f2d4a190cd0",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "4",
|
||||
"LLM_0": "GPT3.5"
|
||||
@ -18,9 +35,26 @@
|
||||
},
|
||||
{
|
||||
"text": "3",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is the square root of 9?",
|
||||
"fill_history": {
|
||||
"question": "What is the square root of 9?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"key": "9fbdcc67-7c5f-4cd5-a2e4-6f2d4a190cd0",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "3",
|
||||
"LLM_0": "GPT3.5"
|
||||
@ -28,9 +62,26 @@
|
||||
},
|
||||
{
|
||||
"text": "1997.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What year was the 50th anniversary of the invention of the transistor?",
|
||||
"fill_history": {
|
||||
"question": "What year was the 50th anniversary of the invention of the transistor?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"key": "9fbdcc67-7c5f-4cd5-a2e4-6f2d4a190cd0",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "1997",
|
||||
"LLM_0": "GPT3.5"
|
||||
@ -38,9 +89,26 @@
|
||||
},
|
||||
{
|
||||
"text": "125",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 5 to the power of 3?",
|
||||
"fill_history": {
|
||||
"question": "What is 5 to the power of 3?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"key": "9fbdcc67-7c5f-4cd5-a2e4-6f2d4a190cd0",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "125",
|
||||
"LLM_0": "GPT3.5"
|
||||
@ -48,9 +116,26 @@
|
||||
},
|
||||
{
|
||||
"text": "54",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 multiplied by the cubic root of 729?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 multiplied by the cubic root of 729?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-3.5-turbo",
|
||||
"emoji": "🙂",
|
||||
"key": "9fbdcc67-7c5f-4cd5-a2e4-6f2d4a190cd0",
|
||||
"model": "gpt-3.5-turbo",
|
||||
"name": "GPT3.5",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "18",
|
||||
"LLM_0": "GPT3.5"
|
||||
@ -58,9 +143,29 @@
|
||||
},
|
||||
{
|
||||
"text": "4",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 + 2?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 + 2?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "667e0d14-527b-4fe7-afdc-219504729e6d",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "4",
|
||||
"LLM_0": "GPT4"
|
||||
@ -68,9 +173,29 @@
|
||||
},
|
||||
{
|
||||
"text": "3",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is the square root of 9?",
|
||||
"fill_history": {
|
||||
"question": "What is the square root of 9?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "667e0d14-527b-4fe7-afdc-219504729e6d",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "3",
|
||||
"LLM_0": "GPT4"
|
||||
@ -78,9 +203,29 @@
|
||||
},
|
||||
{
|
||||
"text": "1997",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What year was the 50th anniversary of the invention of the transistor?",
|
||||
"fill_history": {
|
||||
"question": "What year was the 50th anniversary of the invention of the transistor?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "667e0d14-527b-4fe7-afdc-219504729e6d",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "1997",
|
||||
"LLM_0": "GPT4"
|
||||
@ -88,9 +233,29 @@
|
||||
},
|
||||
{
|
||||
"text": "125",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 5 to the power of 3?",
|
||||
"fill_history": {
|
||||
"question": "What is 5 to the power of 3?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "667e0d14-527b-4fe7-afdc-219504729e6d",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "125",
|
||||
"LLM_0": "GPT4"
|
||||
@ -98,9 +263,29 @@
|
||||
},
|
||||
{
|
||||
"text": "12",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 multiplied by the cubic root of 729?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 multiplied by the cubic root of 729?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "gpt-4",
|
||||
"emoji": "🥵",
|
||||
"formData": {
|
||||
"shortname": "GPT4"
|
||||
},
|
||||
"key": "667e0d14-527b-4fe7-afdc-219504729e6d",
|
||||
"model": "gpt-4",
|
||||
"name": "GPT4",
|
||||
"settings": {
|
||||
"frequency_penalty": 0,
|
||||
"presence_penalty": 0,
|
||||
"stop": [],
|
||||
"system_msg": "You are a helpful assistant.",
|
||||
"temperature": 1,
|
||||
"top_p": 1
|
||||
},
|
||||
"temp": 1
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "18",
|
||||
"LLM_0": "GPT4"
|
||||
@ -108,9 +293,28 @@
|
||||
},
|
||||
{
|
||||
"text": "2+2=4",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 + 2?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 + 2?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "palm2-bison",
|
||||
"emoji": "🦬",
|
||||
"formData": {
|
||||
"shortname": "PaLM2"
|
||||
},
|
||||
"key": "9b3f329c-9a8d-4bb9-bf72-12a7bd8ea27a",
|
||||
"model": "chat-bison-001",
|
||||
"name": "PaLM2",
|
||||
"settings": {
|
||||
"max_output_tokens": 512,
|
||||
"stop_sequences": [],
|
||||
"temperature": 0.5,
|
||||
"top_k": -1,
|
||||
"top_p": -1
|
||||
},
|
||||
"temp": 0.7
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "4",
|
||||
"LLM_0": "PaLM2"
|
||||
@ -118,9 +322,28 @@
|
||||
},
|
||||
{
|
||||
"text": "The square root of 9 is 3.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is the square root of 9?",
|
||||
"fill_history": {
|
||||
"question": "What is the square root of 9?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "palm2-bison",
|
||||
"emoji": "🦬",
|
||||
"formData": {
|
||||
"shortname": "PaLM2"
|
||||
},
|
||||
"key": "9b3f329c-9a8d-4bb9-bf72-12a7bd8ea27a",
|
||||
"model": "chat-bison-001",
|
||||
"name": "PaLM2",
|
||||
"settings": {
|
||||
"max_output_tokens": 512,
|
||||
"stop_sequences": [],
|
||||
"temperature": 0.5,
|
||||
"top_k": -1,
|
||||
"top_p": -1
|
||||
},
|
||||
"temp": 0.7
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "3",
|
||||
"LLM_0": "PaLM2"
|
||||
@ -128,9 +351,28 @@
|
||||
},
|
||||
{
|
||||
"text": "The transistor was invented in 1947, so the 50th anniversary of its invention was in 1997.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What year was the 50th anniversary of the invention of the transistor?",
|
||||
"fill_history": {
|
||||
"question": "What year was the 50th anniversary of the invention of the transistor?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "palm2-bison",
|
||||
"emoji": "🦬",
|
||||
"formData": {
|
||||
"shortname": "PaLM2"
|
||||
},
|
||||
"key": "9b3f329c-9a8d-4bb9-bf72-12a7bd8ea27a",
|
||||
"model": "chat-bison-001",
|
||||
"name": "PaLM2",
|
||||
"settings": {
|
||||
"max_output_tokens": 512,
|
||||
"stop_sequences": [],
|
||||
"temperature": 0.5,
|
||||
"top_k": -1,
|
||||
"top_p": -1
|
||||
},
|
||||
"temp": 0.7
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "1997",
|
||||
"LLM_0": "PaLM2"
|
||||
@ -138,9 +380,28 @@
|
||||
},
|
||||
{
|
||||
"text": "5 to the power of 3 is 125.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 5 to the power of 3?",
|
||||
"fill_history": {
|
||||
"question": "What is 5 to the power of 3?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "palm2-bison",
|
||||
"emoji": "🦬",
|
||||
"formData": {
|
||||
"shortname": "PaLM2"
|
||||
},
|
||||
"key": "9b3f329c-9a8d-4bb9-bf72-12a7bd8ea27a",
|
||||
"model": "chat-bison-001",
|
||||
"name": "PaLM2",
|
||||
"settings": {
|
||||
"max_output_tokens": 512,
|
||||
"stop_sequences": [],
|
||||
"temperature": 0.5,
|
||||
"top_k": -1,
|
||||
"top_p": -1
|
||||
},
|
||||
"temp": 0.7
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "125",
|
||||
"LLM_0": "PaLM2"
|
||||
@ -148,9 +409,28 @@
|
||||
},
|
||||
{
|
||||
"text": "2 multiplied by the cubic root of 729 is 27.",
|
||||
"prompt": "Answer the following math question as brief as possible, and only include the answer in your response: What is 2 multiplied by the cubic root of 729?",
|
||||
"fill_history": {
|
||||
"question": "What is 2 multiplied by the cubic root of 729?"
|
||||
},
|
||||
"llm": {
|
||||
"base_model": "palm2-bison",
|
||||
"emoji": "🦬",
|
||||
"formData": {
|
||||
"shortname": "PaLM2"
|
||||
},
|
||||
"key": "9b3f329c-9a8d-4bb9-bf72-12a7bd8ea27a",
|
||||
"model": "chat-bison-001",
|
||||
"name": "PaLM2",
|
||||
"settings": {
|
||||
"max_output_tokens": 512,
|
||||
"stop_sequences": [],
|
||||
"temperature": 0.5,
|
||||
"top_k": -1,
|
||||
"top_p": -1
|
||||
},
|
||||
"temp": 0.7
|
||||
},
|
||||
"metavars": {
|
||||
"Expected": "18",
|
||||
"LLM_0": "PaLM2"
|
||||
@ -350,7 +630,8 @@
|
||||
"data": {
|
||||
"code": "function evaluate(response) {\n return response.text.startsWith(\n response.meta['Expected']);\n}",
|
||||
"language": "javascript",
|
||||
"title": "Whether expected answer"
|
||||
"title": "Whether expected answer",
|
||||
"refresh": false
|
||||
},
|
||||
"dragging": false,
|
||||
"id": "evalNode-1688056068815",
|
||||
|
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -1,15 +1,15 @@
|
||||
{
|
||||
"files": {
|
||||
"main.css": "/static/css/main.fe68699e.css",
|
||||
"main.js": "/static/js/main.ec9948b8.js",
|
||||
"main.css": "/static/css/main.a4e8271c.css",
|
||||
"main.js": "/static/js/main.b59fa9d7.js",
|
||||
"static/js/787.4c72bb55.chunk.js": "/static/js/787.4c72bb55.chunk.js",
|
||||
"index.html": "/index.html",
|
||||
"main.fe68699e.css.map": "/static/css/main.fe68699e.css.map",
|
||||
"main.ec9948b8.js.map": "/static/js/main.ec9948b8.js.map",
|
||||
"main.a4e8271c.css.map": "/static/css/main.a4e8271c.css.map",
|
||||
"main.b59fa9d7.js.map": "/static/js/main.b59fa9d7.js.map",
|
||||
"787.4c72bb55.chunk.js.map": "/static/js/787.4c72bb55.chunk.js.map"
|
||||
},
|
||||
"entrypoints": [
|
||||
"static/css/main.fe68699e.css",
|
||||
"static/js/main.ec9948b8.js"
|
||||
"static/css/main.a4e8271c.css",
|
||||
"static/js/main.b59fa9d7.js"
|
||||
]
|
||||
}
|
@ -1 +1 @@
|
||||
<!doctype html><html lang="en"><head><meta charset="utf-8"/><script async src="https://www.googletagmanager.com/gtag/js?id=G-RN3FDBLMCR"></script><script>function gtag(){dataLayer.push(arguments)}window.dataLayer=window.dataLayer||[],gtag("js",new Date),gtag("config","G-RN3FDBLMCR")</script><link rel="icon" href="/favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="A visual programming environment for prompt engineering"/><link rel="apple-touch-icon" href="/logo192.png"/><link rel="manifest" href="/manifest.json"/><title>ChainForge</title><script defer="defer" src="/static/js/main.ec9948b8.js"></script><link href="/static/css/main.fe68699e.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
|
||||
<!doctype html><html lang="en"><head><meta charset="utf-8"/><script async src="https://www.googletagmanager.com/gtag/js?id=G-RN3FDBLMCR"></script><script>function gtag(){dataLayer.push(arguments)}window.dataLayer=window.dataLayer||[],gtag("js",new Date),gtag("config","G-RN3FDBLMCR")</script><link rel="icon" href="/favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="A visual programming environment for prompt engineering"/><link rel="apple-touch-icon" href="/logo192.png"/><link rel="manifest" href="/manifest.json"/><title>ChainForge</title><script defer="defer" src="/static/js/main.b59fa9d7.js"></script><link href="/static/css/main.a4e8271c.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
|
File diff suppressed because one or more lines are too long (5 files)
@ -73,14 +73,6 @@ License: MIT
|
||||
* @license MIT
|
||||
*/
|
||||
|
||||
/*!
|
||||
* decimal.js v10.4.3
|
||||
* An arbitrary-precision Decimal type for JavaScript.
|
||||
* https://github.com/MikeMcl/decimal.js
|
||||
* Copyright (c) 2022 Michael Mclaughlin <M8ch88l@gmail.com>
|
||||
* MIT Licence
|
||||
*/
|
||||
|
||||
/*!
|
||||
* The buffer module from node.js, for the browser.
|
||||
*
|
||||
@ -123,21 +115,6 @@ License: MIT
|
||||
|
||||
/*! sheetjs (C) 2013-present SheetJS -- http://sheetjs.com */
|
||||
|
||||
/**
|
||||
* @license Complex.js v2.1.1 12/05/2020
|
||||
*
|
||||
* Copyright (c) 2020, Robert Eisele (robert@xarg.org)
|
||||
* Dual licensed under the MIT or GPL Version 2 licenses.
|
||||
**/
|
||||
|
||||
/**
|
||||
* @license Fraction.js v4.2.0 05/03/2022
|
||||
* https://www.xarg.org/2014/03/rational-numbers-in-javascript/
|
||||
*
|
||||
* Copyright (c) 2021, Robert Eisele (robert@xarg.org)
|
||||
* Dual licensed under the MIT or GPL Version 2 licenses.
|
||||
**/
|
||||
|
||||
/**
|
||||
* @license React
|
||||
* react-dom.production.min.js
|
File diff suppressed because one or more lines are too long
chainforge/react-server/src/AlertModal.js (vendored, 2 changed lines)
@@ -20,7 +20,7 @@ const AlertModal = forwardRef((props, ref) => {
}));

return (
- <Modal opened={opened} onClose={close} title="Error" styles={{header: {backgroundColor: '#E52A2A', color: 'white'}, root: {position: 'relative', left: '-80px'}}}>
+ <Modal opened={opened} onClose={close} title="Error" styles={{header: {backgroundColor: '#E52A2A', color: 'white'}, root: {position: 'relative', left: '-5%'}}}>
<p style={{whiteSpace: 'pre-line'}}>{alertMsg}</p>
</Modal>
);
chainforge/react-server/src/App.js (vendored, 66 changed lines)
@@ -6,9 +6,9 @@ import ReactFlow, {
Controls,
Background,
} from 'react-flow-renderer';
- import { Button, Menu, LoadingOverlay, Text, Box, List, Loader } from '@mantine/core';
+ import { Button, Menu, LoadingOverlay, Text, Box, List, Loader, Header, Chip, Badge, Card, Accordion, Tooltip } from '@mantine/core';
import { useClipboard } from '@mantine/hooks';
- import { IconSettings, IconTextPlus, IconTerminal, IconCsv, IconSettingsAutomation, IconFileSymlink } from '@tabler/icons-react';
+ import { IconSettings, IconTextPlus, IconTerminal, IconCsv, IconSettingsAutomation, IconFileSymlink, IconRobot } from '@tabler/icons-react';
import TextFieldsNode from './TextFieldsNode'; // Import a custom node
import PromptNode from './PromptNode';
import EvaluatorNode from './EvaluatorNode';
@@ -22,6 +22,7 @@ import CommentNode from './CommentNode';
import GlobalSettingsModal from './GlobalSettingsModal';
import ExampleFlowsModal from './ExampleFlowsModal';
import AreYouSureModal from './AreYouSureModal';
+ import LLMEvaluatorNode from './LLMEvalNode';
import { getDefaultModelFormData, getDefaultModelSettings } from './ModelSettingSchemas';
import { v4 as uuid } from 'uuid';
import LZString from 'lz-string';
@@ -73,7 +74,9 @@ const INITIAL_LLM = () => {
const nodeTypes = {
textfields: TextFieldsNode, // Register the custom node
prompt: PromptNode,
+ chat: PromptNode,
evaluator: EvaluatorNode,
+ llmeval: LLMEvaluatorNode,
vis: VisNode,
inspect: InspectNode,
script: ScriptNode,
@@ -157,14 +160,18 @@ const App = () => {
return ({x: -x+centerX, y:-y+centerY});
}

- const addTextFieldsNode = (event) => {
+ const addTextFieldsNode = () => {
const { x, y } = getViewportCenter();
addNode({ id: 'textFieldsNode-'+Date.now(), type: 'textfields', data: {}, position: {x: x-200, y:y-100} });
};
- const addPromptNode = (event) => {
+ const addPromptNode = () => {
const { x, y } = getViewportCenter();
addNode({ id: 'promptNode-'+Date.now(), type: 'prompt', data: { prompt: '' }, position: {x: x-200, y:y-100} });
};
+ const addChatTurnNode = () => {
+ const { x, y } = getViewportCenter();
+ addNode({ id: 'chatTurn-'+Date.now(), type: 'chat', data: { prompt: '' }, position: {x: x-200, y:y-100} });
+ };
const addEvalNode = (progLang) => {
const { x, y } = getViewportCenter();
let code = "";
@@ -198,6 +205,10 @@ const App = () => {
const { x, y } = getViewportCenter();
addNode({ id: 'comment-'+Date.now(), type: 'comment', data: {}, position: {x: x-200, y:y-100} });
};
+ const addLLMEvalNode = () => {
+ const { x, y } = getViewportCenter();
+ addNode({ id: 'llmeval-'+Date.now(), type: 'llmeval', data: {}, position: {x: x-200, y:y-100} });
+ };

const onClickExamples = () => {
if (examplesModal && examplesModal.current)
@@ -662,27 +673,30 @@ const App = () => {
</Box>
</Modal> */}

- <div style={{ height: '100vh', width: '100%', backgroundColor: '#eee' }}>
- <ReactFlow
- onNodesChange={onNodesChange}
- onEdgesChange={onEdgesChange}
- onConnect={onConnect}
- nodes={nodes}
- edges={edges}
- nodeTypes={nodeTypes}
- zoomOnPinch={false}
- zoomOnScroll={false}
- panOnScroll={true}
- // connectionLineComponent={AnimatedConnectionLine}
- // connectionLineStyle={connectionLineStyle}
- snapToGrid={true}
- snapGrid={snapGrid}
- onInit={onInit}
- >
- <Background color="#999" gap={16} />
- <Controls showZoom={true} />
- </ReactFlow>
+ <div id='cf-root-container' style={{display: 'flex', height: '100vh'}}>
+ <div style={{ height: '100%', backgroundColor: '#eee', flexGrow: '1' }}>
+ <ReactFlow
+ onNodesChange={onNodesChange}
+ onEdgesChange={onEdgesChange}
+ onConnect={onConnect}
+ nodes={nodes}
+ edges={edges}
+ nodeTypes={nodeTypes}
+ zoomOnPinch={false}
+ zoomOnScroll={false}
+ panOnScroll={true}
+ // connectionLineComponent={AnimatedConnectionLine}
+ // connectionLineStyle={connectionLineStyle}
+ snapToGrid={true}
+ snapGrid={snapGrid}
+ onInit={onInit}
+ >
+ <Background color="#999" gap={16} />
+ <Controls showZoom={true} />
+ </ReactFlow>
+ </div>
+ </div>

<div id="custom-controls" style={{position: 'fixed', left: '10px', top: '10px', zIndex:8}}>
<Menu transitionProps={{ transition: 'pop-top-left' }}
position="top-start"
@@ -694,15 +708,17 @@ const App = () => {
</Menu.Target>
<Menu.Dropdown>
<Menu.Item onClick={addTextFieldsNode} icon={<IconTextPlus size="16px" />}> TextFields </Menu.Item>
+ <Menu.Item onClick={addTabularDataNode} icon={'🗂️'}> Tabular Data Node </Menu.Item>
<Menu.Item onClick={addPromptNode} icon={'💬'}> Prompt Node </Menu.Item>
+ <Menu.Item onClick={addChatTurnNode} icon={'🗣'}> Chat Turn Node </Menu.Item>
<Menu.Item onClick={() => addEvalNode('javascript')} icon={<IconTerminal size="16px" />}> JavaScript Evaluator Node </Menu.Item>
{IS_RUNNING_LOCALLY ? (
<Menu.Item onClick={() => addEvalNode('python')} icon={<IconTerminal size="16px" />}> Python Evaluator Node </Menu.Item>
): <></>}
+ <Menu.Item onClick={addLLMEvalNode} icon={<IconRobot size="16px" />}> LLM Scorer Node</Menu.Item>
<Menu.Item onClick={addVisNode} icon={'📊'}> Vis Node </Menu.Item>
<Menu.Item onClick={addInspectNode} icon={'🔍'}> Inspect Node </Menu.Item>
<Menu.Item onClick={addCsvNode} icon={<IconCsv size="16px" />}> CSV Node </Menu.Item>
- <Menu.Item onClick={addTabularDataNode} icon={'🗂️'}> Tabular Data Node </Menu.Item>
<Menu.Item onClick={addCommentNode} icon={'✏️'}> Comment Node </Menu.Item>
{IS_RUNNING_LOCALLY ? (
<Menu.Item onClick={addScriptNode} icon={<IconSettingsAutomation size="16px" />}> Global Python Scripts </Menu.Item>
chainforge/react-server/src/AreYouSureModal.js
@@ -21,7 +21,7 @@ const AreYouSureModal = forwardRef(({title, message, onConfirm}, ref) => {
}

return (
- <Modal opened={opened} onClose={close} title={title} styles={{header: {backgroundColor: 'red', color: 'white'}, root: {position: 'relative', left: '-80px'}}}>
+ <Modal opened={opened} onClose={close} title={title} styles={{header: {backgroundColor: 'red', color: 'white'}, root: {position: 'relative', left: '-5%'}}}>
<Box maw={400} mx="auto" mt='md' mb='md'>
<Text>{description}</Text>
</Box>
chainforge/react-server/src/ChatHistoryView.js (vendored, new file, 16 lines)
@@ -0,0 +1,16 @@
+ import { forwardRef } from "react";
+
+ const ChatHistoryView = forwardRef(({ messages, bgColors }, ref) => {
+ const _bg_color = bgColors !== undefined ? bgColors : ['#333', '#333'];
+ return (
+ <div ref={ref} className="chat-history">
+ {messages.map((msg, idx) => (
+ <div className={"chat-bubble chat-msg-" + (idx % 2 === 0 ? "left" : "right")} style={{ backgroundColor: _bg_color[idx % 2] }}>
+ {msg}
+ </div>
+ ))}
+ </div>
+ );
+ });
+
+ export default ChatHistoryView;
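ChatHistoryView simply renders a flat list of message strings as alternating left/right chat bubbles, colored with the two background colors it is given. A hypothetical usage (not part of this commit, shown only to illustrate the props) might be:

```jsx
// Hypothetical usage of ChatHistoryView (illustrative only): past turns are
// passed as a flat array of strings, alternating user / assistant.
<ChatHistoryView
  messages={['What was the first Kirby game?', "Kirby's Dream Land (1992).", 'Who developed it?']}
  bgColors={['#ccc', '#333']}
/>
```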
chainforge/react-server/src/EvaluatorNode.js (vendored, 2 changed lines)
@@ -265,7 +265,7 @@ const EvaluatorNode = ({ data, id }) => {
</Tooltip>]}
/>
<LLMResponseInspectorModal ref={inspectModal} jsonResponses={lastResponses} />
- <Modal title={default_header} size='60%' opened={infoModalOpened} onClose={closeInfoModal} styles={{header: {backgroundColor: '#FFD700'}, root: {position: 'relative', left: '-80px'}}}>
+ <Modal title={default_header} size='60%' opened={infoModalOpened} onClose={closeInfoModal} styles={{header: {backgroundColor: '#FFD700'}, root: {position: 'relative', left: '-5%'}}}>
<Box m='lg' mt='xl'>
<Text mb='sm'>To use a {default_header}, write a function <Code>evaluate</Code> that takes a single argument of class <Code>ResponseInfo</Code>.
The function should return a 'score' for that response, which usually is a number or a boolean value (strings as categoricals are supported, but experimental).</Text>
@@ -241,7 +241,7 @@ const ExampleFlowsModal = forwardRef((props, ref) => {
}));

return (
<Modal size='xl' opened={opened} onClose={close} title={<div><IconChartDots3 size={24} style={{position:'relative', marginRight: '8px', top: '4px'}} /><span style={{fontSize: '14pt'}}>Example Flows</span></div>} closeOnClickOutside={true} style={{position: 'relative', 'left': '-100px'}}>
<Modal size='xl' opened={opened} onClose={close} title={<div><IconChartDots3 size={24} style={{position:'relative', marginRight: '8px', top: '4px'}} /><span style={{fontSize: '14pt'}}>Example Flows</span></div>} closeOnClickOutside={true} style={{position: 'relative', 'left': '-5%'}}>
<Tabs defaultValue="examples">
<Tabs.List>
<Tabs.Tab value="examples" >Basic Examples</Tabs.Tab>
@@ -260,7 +260,7 @@ const ExampleFlowsModal = forwardRef((props, ref) => {
filename="prompt-injection-test"
onSelect={onSelect}
/>
<ExampleFlowCard title="Use an LLM as an evaluator"
<ExampleFlowCard title="Chain prompts together"
description="Chain one prompt into another to extract entities from a text response. Plots number of entities."
filename="chaining-prompts"
onSelect={onSelect}
2  chainforge/react-server/src/InspectorNode.js  vendored
@@ -1,7 +1,7 @@
import React, { useState, useEffect } from 'react';
import { Handle } from 'react-flow-renderer';
import useStore from './store';
import NodeLabel from './NodeLabelComponent'
import NodeLabel from './NodeLabelComponent';
import LLMResponseInspector, { exportToExcel } from './LLMResponseInspector';
import fetch_from_backend from './fetch_from_backend';
200  chainforge/react-server/src/LLMEvalNode.js  vendored  Normal file
@@ -0,0 +1,200 @@
import React, { useState, useCallback, useRef, useEffect } from 'react';
import { Handle } from 'react-flow-renderer';
import { Button, Alert, Progress, Textarea } from '@mantine/core';
import { IconAlertTriangle, IconRobot, IconSearch } from "@tabler/icons-react";
import { v4 as uuid } from 'uuid';
import useStore from './store';
import NodeLabel from './NodeLabelComponent';
import fetch_from_backend from './fetch_from_backend';
import { AvailableLLMs, getDefaultModelSettings } from './ModelSettingSchemas';
import { LLMListContainer } from './LLMListComponent';
import LLMResponseInspectorModal from './LLMResponseInspectorModal';

// The default prompt shown in gray highlights to give people a good example of an evaluation prompt.
const PLACEHOLDER_PROMPT = "Respond with 'true' if the text below has a positive sentiment, and 'false' if not. Do not reply with anything else.";

// The default LLM annotator is GPT-4 at temperature 0.
const DEFAULT_LLM_ITEM = (() => {
  let item = [AvailableLLMs.find(i => i.base_model === 'gpt-4')]
              .map((i) => ({key: uuid(), settings: getDefaultModelSettings(i.base_model), ...i}))[0];
  item.settings.temperature = 0.0;
  return item;
})();

const LLMEvaluatorNode = ({ data, id }) => {

  const [promptText, setPromptText] = useState(data.prompt || "");
  const [status, setStatus] = useState('none');
  const alertModal = useRef(null);
  const inspectModal = useRef(null);

  const setDataPropsForNode = useStore((state) => state.setDataPropsForNode);
  const inputEdgesForNode = useStore((state) => state.inputEdgesForNode);
  const pingOutputNodes = useStore((state) => state.pingOutputNodes);
  const apiKeys = useStore((state) => state.apiKeys);

  const [lastResponses, setLastResponses] = useState([]);

  const [llmScorers, setLLMScorers] = useState([data.grader || DEFAULT_LLM_ITEM]);

  // Progress when querying responses
  const [progress, setProgress] = useState(undefined);

  const handleRunClick = useCallback(() => {
    // Get the ids from the connected input nodes:
    const input_node_ids = inputEdgesForNode(id).map(e => e.source);
    if (input_node_ids.length === 0) {
      console.warn("No inputs for evaluator node.");
      return;
    }

    setStatus('loading');
    setProgress({success: 2, error: 0});

    const llm_key = llmScorers[0].key;
    const handleError = (err) => {
      setStatus('error');
      setProgress(undefined);
      alertModal.current.trigger(err);
    };

    // Fetch info about the number of queries we'll need to make
    fetch_from_backend('grabResponses', {
      responses: input_node_ids,
    }).then(function(json) {
      if (!json?.responses || json.responses.length === 0) {
        handleError('Error pulling input data for node: No input data found.');
        return;
      }

      // Create progress listener
      const num_resps_required = json.responses.reduce((acc, resp_obj) => acc + resp_obj.responses.length, 0);
      const progress_listener = (progress_by_llm => {
        setProgress({
          success: 100 * progress_by_llm[llm_key].success / num_resps_required,
          error: 100 * progress_by_llm[llm_key].error / num_resps_required,
        })
      });

      // Run LLM as evaluator
      fetch_from_backend('evalWithLLM', {
        id: id,
        llm: llmScorers[0],
        root_prompt: promptText + '\n```\n{input}\n```',
        responses: input_node_ids,
        api_keys: (apiKeys ? apiKeys : {}),
        progress_listener: progress_listener,
      }).then(function(json) {
        // Check if there's an error; if so, bubble it up to user and exit:
        if (!json || json.error) {
          handleError(json?.error || 'Unknown error encountered when requesting evaluations: empty response returned.');
          return;
        } else if (json.errors && json.errors.length > 0) {
          handleError(Object.values(json.errors[0])[0]);
          return;
        }

        // Ping any vis + inspect nodes attached to this node to refresh their contents:
        pingOutputNodes(id);

        console.log(json.responses);
        setLastResponses(json.responses);
        setStatus('ready');
        setProgress(undefined);
      }).catch(handleError);
    });
  }, [inputEdgesForNode, promptText, llmScorers, apiKeys, pingOutputNodes, setStatus, alertModal]);

  const handlePromptChange = useCallback((event) => {
    // Store prompt text
    setPromptText(event.target.value);
    setDataPropsForNode(id, { prompt: event.target.value });
    setStatus('warning');
  }, [setPromptText, setDataPropsForNode, setStatus, id]);

  const onLLMListItemsChange = useCallback((new_items) => {
    setLLMScorers(new_items);

    if (new_items.length > 0)
      setDataPropsForNode(id, { grader: new_items[0] });
  }, []);

  const showResponseInspector = useCallback(() => {
    if (inspectModal && inspectModal.current && lastResponses)
      inspectModal.current.trigger();
  }, [inspectModal, lastResponses]);

  useEffect(() => {
    if (data.refresh && data.refresh === true) {
      setDataPropsForNode(id, { refresh: false });
      setStatus('warning');
    }
  }, [data]);

  return (
    <div className="evaluator-node cfnode">
      <NodeLabel title={data.title || 'LLM Scorer'}
                 nodeId={id}
                 icon={<IconRobot size="16px" />}
                 status={status}
                 alertModal={alertModal}
                 handleRunClick={handleRunClick}
                 runButtonTooltip="Run scorer over inputs" />
      <LLMResponseInspectorModal ref={inspectModal} jsonResponses={lastResponses} />

      <Textarea autosize
                label="Describe how to 'score' a single response."
                placeholder={PLACEHOLDER_PROMPT}
                description="The text of the response will be pasted directly below your rubric."
                className="prompt-field-fixed nodrag nowheel"
                minRows="4"
                maxRows="12"
                maw='290px'
                mb='lg'
                value={promptText}
                onChange={handlePromptChange} />

      <LLMListContainer
        initLLMItems={llmScorers}
        description="Model to use as scorer:"
        modelSelectButtonText="Change"
        selectModelAction="replace"
        onAddModel={() => {}}
        onItemsChange={onLLMListItemsChange} />

      {progress !== undefined ?
        (<Progress animate={true} sections={[
          { value: progress.success, color: 'blue', tooltip: 'API call succeeded' },
          { value: progress.error, color: 'red', tooltip: 'Error collecting response' }
        ]} />)
        : <></>}

      <Alert icon={<IconAlertTriangle size="1rem" />} p='10px' radius='xs' title="Caution" color="yellow" maw='270px' mt='xs' styles={{title: {margin: '0px'}, icon: {marginRight: '4px'}, message: {fontSize: '10pt'}}}>
        AI scores are not 100% accurate.
      </Alert>

      <Handle
        type="target"
        position="left"
        id="responseBatch"
        className="grouped-handle"
        style={{ top: '50%' }}
      />
      <Handle
        type="source"
        position="right"
        id="output"
        className="grouped-handle"
        style={{ top: '50%' }}
      />

      { lastResponses && lastResponses.length > 0 ?
        (<div className="eval-inspect-response-footer nodrag" onClick={showResponseInspector} style={{display: 'flex', justifyContent:'center'}}>
          <Button color='blue' variant='subtle' w='100%' >Inspect scores <IconSearch size='12pt'/></Button>
        </div>) : <></>}
    </div>
  );
};

export default LLMEvaluatorNode;
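For orientation, a sketch of the backend request this node ends up making (the node and response ids are hypothetical; the keys mirror the `evalWithLLM` call above):

```js
// Hypothetical call, assuming one upstream prompt node supplies the responses to score.
fetch_from_backend('evalWithLLM', {
  id: 'llmeval-1234',                               // this scorer node's id (made up here)
  llm: DEFAULT_LLM_ITEM,                            // GPT-4 at temperature 0, unless the user swaps the scorer
  root_prompt: promptText + '\n```\n{input}\n```',  // each response text is pasted in place of {input}
  responses: ['prompt-5678'],                       // ids of the connected input nodes
  api_keys: {},
  progress_listener: (progress_by_llm) => {},
});
```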
136  chainforge/react-server/src/LLMListComponent.js  vendored
@@ -1,9 +1,17 @@
import { useState, useEffect, useCallback, useRef } from "react";
import { useState, useEffect, useCallback, useRef, forwardRef, useImperativeHandle } from "react";
import { DragDropContext, Draggable } from "react-beautiful-dnd";
import { Menu } from "@mantine/core";
import { v4 as uuid } from 'uuid';
import LLMListItem, { LLMListItemClone } from "./LLMListItem";
import { StrictModeDroppable } from './StrictModeDroppable'
import ModelSettingsModal from "./ModelSettingsModal"
import { StrictModeDroppable } from './StrictModeDroppable';
import ModelSettingsModal from "./ModelSettingsModal";
import { getDefaultModelSettings, AvailableLLMs } from './ModelSettingSchemas';

// The LLM(s) to include by default on a PromptNode whenever one is created.
// Defaults to ChatGPT (GPT3.5) when running locally, and HF-hosted falcon-7b for online version since it's free.
const DEFAULT_INIT_LLMS = [AvailableLLMs[0]];

// Helper funcs
// Ensure that a name is 'unique'; if not, return an amended version with a count tacked on (e.g. "GPT-4 (2)")
const ensureUniqueName = (_name, _prev_names) => {
// Strip whitespace around names
@@ -24,7 +32,7 @@ const ensureUniqueName = (_name, _prev_names) => {
return new_name;
};

export default function LLMList({llms, onItemsChange}) {
export function LLMList({llms, onItemsChange}) {
const [items, setItems] = useState(llms);
const settingsModal = useRef(null);
const [selectedModel, setSelectedModel] = useState(undefined);
@@ -110,14 +118,14 @@ export default function LLMList({llms, onItemsChange}) {
// When LLMs list changes, we need to add new items
// while preserving the current order of 'items'.
// Check for new items and for each, add to end:
let new_items = Array.from(items);
let new_items = Array.from(items.filter(i => llms.some(v => v.key === i.key)));
llms.forEach(item => {
if (!items.find(i => i.key === item.key))
new_items.push(item);
});

updateItems(new_items);
}, [llms, updateItems]);
}, [llms]);

return (
<div className="list nowheel nodrag">
@@ -146,3 +154,119 @@ export default function LLMList({llms, onItemsChange}) {
</div>
);
}

export const LLMListContainer = forwardRef(({description, modelSelectButtonText, initLLMItems, onSelectModel, selectModelAction, onItemsChange}, ref) => {

// Selecting LLM models to prompt
const [llmItems, setLLMItems] = useState(initLLMItems || DEFAULT_INIT_LLMS.map((i) => ({key: uuid(), settings: getDefaultModelSettings(i.base_model), ...i})));
const [llmItemsCurrState, setLLMItemsCurrState] = useState([]);
const resetLLMItemsProgress = useCallback(() => {
setLLMItems(llmItemsCurrState.map(item => {
item.progress = undefined;
return item;
}));
}, [llmItemsCurrState]);
const setZeroPercProgress = useCallback(() => {
setLLMItems(llmItemsCurrState.map(item => {
item.progress = { success: 0, error: 0 };
return item;
}));
}, [llmItemsCurrState]);
const updateProgress = useCallback((itemProcessorFunc) => {
setLLMItems(llmItemsCurrState.map(itemProcessorFunc));
}, [llmItemsCurrState]);
const ensureLLMItemsErrorProgress = useCallback((llm_keys_w_errors) => {
setLLMItems(llmItemsCurrState.map(item => {
if (llm_keys_w_errors.includes(item.key)) {
if (!item.progress)
item.progress = { success: 0, error: 100 };
else {
const succ_perc = item.progress.success;
item.progress = { success: succ_perc, error: 100 - succ_perc };
}
} else {
if (item.progress && item.progress.success === 0)
item.progress = undefined;
}

return item;
}));
}, [llmItemsCurrState]);

const getLLMListItemForKey = useCallback((key) => {
return llmItemsCurrState.find((item) => item.key === key);
}, [llmItemsCurrState]);

const handleSelectModel = useCallback((model) => {
// Get the item for that model
let item = AvailableLLMs.find(llm => llm.base_model === model);
if (!item) { // This should never trigger, but in case it does:
console.error(`Could not find model named '${model}' in list of available LLMs.`);
return;
}

// Give it a uid as a unique key (this is needed for the draggable list to support multiple same-model items; keys must be unique)
item = {key: uuid(), ...item};

// Generate the default settings for this model
item.settings = getDefaultModelSettings(model);

// Repair names to ensure they are unique
const unique_name = ensureUniqueName(item.name, llmItemsCurrState.map(i => i.name));
item.name = unique_name;
item.formData = { 'shortname': unique_name };

let new_items;
if (selectModelAction === "add" || selectModelAction === undefined) {
// Add model to the LLM list (regardless of it's present already or not).
new_items = llmItemsCurrState.concat([item]);
} else if (selectModelAction === "replace") {
// Remove existing model from LLM list and replace with new one:
new_items = [item];
}

setLLMItems(new_items);
if (onSelectModel) onSelectModel(item, new_items);
}, [llmItemsCurrState, onSelectModel, selectModelAction]);

const onLLMListItemsChange = useCallback((new_items) => {
setLLMItemsCurrState(new_items);
if (onItemsChange) onItemsChange(new_items, llmItemsCurrState);
}, [setLLMItemsCurrState, onItemsChange]);

// This gives the parent access to triggering methods on this object
useImperativeHandle(ref, () => ({
resetLLMItemsProgress,
setZeroPercProgress,
updateProgress,
ensureLLMItemsErrorProgress,
getLLMListItemForKey,
}));

return (<div className="llm-list-container nowheel">
<div className="llm-list-backdrop">
{description || "Models to query:"}
<div className="add-llm-model-btn nodrag">
<Menu transitionProps={{ transition: 'pop-top-left' }}
position="bottom-start"
width={220}
withinPortal={true}
>
<Menu.Target>
<button>{modelSelectButtonText || "Add +"}</button>
</Menu.Target>
<Menu.Dropdown>
{AvailableLLMs.map(item => (
<Menu.Item key={item.model} onClick={() => handleSelectModel(item.base_model)} icon={item.emoji}>{item.name}</Menu.Item>))
}
</Menu.Dropdown>
</Menu>
</div>
</div>

<div className="nodrag">
<LLMList llms={llmItems} onItemsChange={onLLMListItemsChange} />
</div>
</div>);
});
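A minimal sketch of how a parent node can drive LLMListContainer through its ref (this is the pattern PromptNode switches to further below; the wrapper component here is illustrative):

```jsx
import { useRef } from 'react';
import { LLMListContainer } from './LLMListComponent';

// Illustrative parent: onQueryStart/onQueryEnd would be called around a backend query
// to zero the per-model progress bars, flag errored models, then clear progress.
const ParentNodeSketch = () => {
  const llmListContainer = useRef(null);

  const onQueryStart = () =>
    llmListContainer?.current?.setZeroPercProgress();

  const onQueryEnd = (llm_keys_w_errors) => {
    llmListContainer?.current?.ensureLLMItemsErrorProgress(llm_keys_w_errors);
    llmListContainer?.current?.resetLLMItemsProgress();
  };

  return <LLMListContainer ref={llmListContainer}
                           onItemsChange={(new_items, old_items) => {}} />;
};
```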
@@ -7,7 +7,7 @@
import React, { useState, useEffect, useRef } from 'react';
import { Collapse, Radio, MultiSelect, Group, Table, NativeSelect } from '@mantine/core';
import { useDisclosure } from '@mantine/hooks';
import { IconTable, IconSitemap } from '@tabler/icons-react';
import { IconTable, IconLayoutList } from '@tabler/icons-react';
import * as XLSX from 'xlsx';
import useStore from './store';
import { filterDict } from './backend/utils';
@@ -36,30 +36,39 @@ const countResponsesBy = (responses, keyFunc) => {
let responses_by_key = {};
let unspecified_group = [];
responses.forEach(item => {
const key = keyFunc(item);
const d = key !== null ? responses_by_key : unspecified_group;
if (key in d)
d[key] += 1;
else
d[key] = 1;
const key = keyFunc(item);
const d = key !== null ? responses_by_key : unspecified_group;
if (key in d)
d[key] += 1;
else
d[key] = 1;
});
return [responses_by_key, unspecified_group];
};

const SUCCESS_EVAL_SCORES = new Set(['true', 'yes']);
const FAILURE_EVAL_SCORES = new Set(['false', 'no']);
const getEvalResultStr = (eval_item) => {
if (Array.isArray(eval_item)) {
return 'scores: ' + eval_item.join(', ');
return 'scores: ' + eval_item.join(', ');
}
else if (typeof eval_item === 'object') {
const strs = Object.keys(eval_item).map(key => {
let val = eval_item[key];
if (typeof val === 'number' && val.toString().indexOf('.') > -1)
val = val.toFixed(4); // truncate floats to 4 decimal places
return `${key}: ${val}`;
});
return strs.join(', ');
const strs = Object.keys(eval_item).map(key => {
let val = eval_item[key];
if (typeof val === 'number' && val.toString().indexOf('.') > -1)
val = val.toFixed(4); // truncate floats to 4 decimal places
return `${key}: ${val}`;
});
return strs.join(', ');
}
else {
const eval_str = eval_item.toString().trim().toLowerCase();
const color = SUCCESS_EVAL_SCORES.has(eval_str) ? 'black' : (FAILURE_EVAL_SCORES.has(eval_str) ? 'red' : 'black');
return (<>
<span style={{color: 'gray'}}>{"score: "}</span>
<span style={{color: color}}>{eval_str}</span>
</>);
}
else
return `score: ${eval_item}`;
};

// Export the JSON responses to an excel file (downloads the file):
@@ -128,7 +137,7 @@ const LLMResponseInspector = ({ jsonResponses, wideFormat }) => {
const [receivedResponsesOnce, setReceivedResponsesOnce] = useState(false);

// The type of view to use to display responses. Can be either hierarchy or table.
const [viewFormat, setViewFormat] = useState(wideFormat ? "table" : "hierarchy");
const [viewFormat, setViewFormat] = useState("hierarchy");

// The MultiSelect so people can dynamically set what vars they care about
const [multiSelectVars, setMultiSelectVars] = useState([]);
@@ -296,10 +305,8 @@ const LLMResponseInspector = ({ jsonResponses, wideFormat }) => {
const sel_var_cols = found_sel_var_vals.map(val => {
if (val in resp_objs_by_col_var) {
const rs = resp_objs_by_col_var[val];
if (rs.length > 1)
console.warn(`Found more than one response object for LLM ${val} for the same prompt. Only displaying first...`);
// Return response divs as response box here:
return generateResponseBoxes(rs, var_cols, 100)[0];
return (<div>{generateResponseBoxes(rs, var_cols, 100)}</div>);
} else {
console.warn(`Could not find response object for column variable ${tableColVar} with value ${val}`);
return (<i>(no data)</i>);
@@ -307,16 +314,16 @@ const LLMResponseInspector = ({ jsonResponses, wideFormat }) => {
});

return (
<tr key={idx} style={{borderBottom: '8px solid #eee'}}>
{var_cols_vals.map(c => (<td className='inspect-table-var'>{c}</td>))}
{sel_var_cols.map((c, i) => (<td className='inspect-table-llm-resp'>{c}</td>))}
<tr key={`r${idx}`} style={{borderBottom: '8px solid #eee'}}>
{var_cols_vals.map((c, i) => (<td key={`v${i}`} className='inspect-table-var'>{c}</td>))}
{sel_var_cols.map((c, i) => (<td key={`c${i}`} className='inspect-table-llm-resp'>{c}</td>))}
</tr>
);
});

setResponses([(<Table>
setResponses([(<Table key='table'>
<thead>
<tr>{colnames.map(c => (<th>{c}</th>))}</tr>
<tr>{colnames.map(c => (<th key={c}>{c}</th>))}</tr>
</thead>
<tbody style={{verticalAlign: 'top'}}>{rows}</tbody>
</Table>)]);
@@ -420,8 +427,8 @@ const LLMResponseInspector = ({ jsonResponses, wideFormat }) => {
onChange={setViewFormat}
>
<Group mt="0px" mb='xs'>
<Radio value="table" label={<span><IconTable size='10pt' style={{marginBottom: '-1px'}}/> Table</span>} />
<Radio value="hierarchy" label={<span><IconSitemap size='10pt' style={{marginBottom: '-1px'}}/> Hierarchy</span>} />
<Radio value="hierarchy" label={<span><IconLayoutList size='10pt' style={{marginBottom: '-1px'}}/> Grouped List</span>} />
<Radio value="table" label={<span><IconTable size='10pt' style={{marginBottom: '-1px'}}/> Table</span>} />
</Group>
</Radio.Group>
: <></>}
@@ -444,14 +451,14 @@ const LLMResponseInspector = ({ jsonResponses, wideFormat }) => {
<MultiSelect ref={multiSelectRef}
onChange={handleMultiSelectValueChange}
className='nodrag nowheel inspect-multiselect'
label={<span style={{marginTop: '0px', fontWeight: 'normal'}}>Group responses by (order matters):</span>}
label={<span style={{marginTop: '0px'}}>Group responses by (order matters):</span>}
data={multiSelectVars}
placeholder="Pick vars to group responses, in order of importance"
size={wideFormat ? 'sm' : 'xs'}
value={multiSelectValue}
clearSearchOnChange={true}
clearSearchOnBlur={true}
w='100%' />
w='100%' />
</div>
: <></>}
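In practice, the scoring strings above render roughly like this (sample inputs are invented):

```js
// Illustrative inputs and outputs for getEvalResultStr (not part of the diff):
getEvalResultStr([1, 0, 1]);              // "scores: 1, 0, 1"
getEvalResultStr({ accuracy: 0.123456 }); // "accuracy: 0.1235" (floats shortened to 4 decimal places)
getEvalResultStr(false);                  // gray "score: " label with "false" rendered in red
getEvalResultStr(true);                   // gray "score: " label with "true" rendered in black
```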
@@ -19,7 +19,7 @@ const LLMResponseInspectorModal = forwardRef((props, ref) => {
}));

return (
<Modal size='90%' keepMounted opened={opened} onClose={close} closeOnClickOutside={true} style={{position: 'relative', 'left': '-100px'}} title={
<Modal size='90%' keepMounted opened={opened} onClose={close} closeOnClickOutside={true} style={{position: 'relative', 'left': '-5%'}} title={
<div><span>Response Inspector</span><button className="custom-button" style={{marginTop: 'auto', marginRight: '14px', float: 'right'}} onClick={() => exportToExcel(props.jsonResponses)}>Export data to Excel</button></div>
} styles={{ title: {justifyContent: 'space-between', width: '100%'} }} >
{ props.prompt !== undefined ?
@@ -646,7 +646,7 @@ const HuggingFaceTextInferenceSettings = {
"type": "string",
"title": "Model",
"description": "Select a suggested HuggingFace-hosted model to query using the Inference API. For more details, check out https://huggingface.co/inference-api",
"enum": ["gpt2", "bigscience/bloom-560m", "tiiuae/falcon-7b-instruct", "bigcode/santacoder", "bigcode/starcoder", "Other (HuggingFace)"],
"enum": ["tiiuae/falcon-7b-instruct", "microsoft/DialoGPT-large", "bigscience/bloom-560m", "gpt2", "bigcode/santacoder", "bigcode/starcoder", "Other (HuggingFace)"],
"default": "tiiuae/falcon-7b-instruct",
},
"custom_model": {
@@ -655,6 +655,13 @@ const HuggingFaceTextInferenceSettings = {
"description": "(Only used if you select 'Other' above.) Enter the HuggingFace id of the text generation model you wish to query via the inference API. Alternatively, if you have hosted a model on HF Inference Endpoints, you can enter the full URL of the endpoint here.",
"default": "",
},
"model_type": {
"type": "string",
"title": "Model Type (Text or Chat)",
"description": "Select the type of model you are querying. You must selected 'chat' if you want to pass conversation history in Chat Turn nodes.",
"enum": ["text", "chat"],
"default": "text"
},
"temperature": {
"type": "number",
"title": "temperature",
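A sketch of the kind of settings object this schema yields when pointing ChainForge at a self-hosted HF chat endpoint (the endpoint URL is a placeholder; only the keys come from the schema above):

```js
// Hypothetical HuggingFace settings for use with Chat Turn nodes.
const hfChatSettings = {
  model: "Other (HuggingFace)",
  custom_model: "https://<your-endpoint>.endpoints.huggingface.cloud", // or an HF model id
  model_type: "chat",  // needed so Chat Turn nodes can pass conversation history
};
```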
@@ -121,7 +121,7 @@ return (
</Popover.Dropdown>
</Popover><span>{`Model Settings: ${modelName}`}</span>
</div>
} closeOnClickOutside={false} style={{position: 'relative', 'left': '-100px'}}>
} closeOnClickOutside={false} style={{position: 'relative', 'left': '-5%'}}>

<Form schema={schema} uiSchema={uiSchema} formData={formData} validator={validator} onChange={onFormDataChange} onSubmit={onSubmit} style={{width: '100%'}}>
<Button title='Submit' onClick={onClickSubmit} style={{float: 'right', marginRight: '30px'}}>Submit</Button>
413  chainforge/react-server/src/PromptNode.js  vendored
@ -1,42 +1,17 @@
|
||||
import React, { useEffect, useState, useRef, useCallback } from 'react';
|
||||
import React, { useEffect, useState, useRef, useCallback, useMemo } from 'react';
|
||||
import { Handle } from 'react-flow-renderer';
|
||||
import { Menu, Button, Progress, Textarea, Text, Popover, Center, Modal, Box, Tooltip } from '@mantine/core';
|
||||
import { Menu, Switch, Button, Progress, Textarea, Text, Popover, Center, Modal, Box, Tooltip } from '@mantine/core';
|
||||
import { useDisclosure } from '@mantine/hooks';
|
||||
import { v4 as uuid } from 'uuid';
|
||||
import { IconSearch, IconList } from '@tabler/icons-react';
|
||||
import useStore from './store';
|
||||
import NodeLabel from './NodeLabelComponent'
|
||||
import TemplateHooks, { extractBracketedSubstrings } from './TemplateHooksComponent'
|
||||
import LLMList from './LLMListComponent'
|
||||
import { LLMListContainer } from './LLMListComponent'
|
||||
import LLMResponseInspectorModal from './LLMResponseInspectorModal';
|
||||
import { getDefaultModelSettings, AvailableLLMs } from './ModelSettingSchemas'
|
||||
import fetch_from_backend from './fetch_from_backend';
|
||||
import { escapeBraces } from './backend/template';
|
||||
import { PromptTemplate, escapeBraces } from './backend/template';
|
||||
import ChatHistoryView from './ChatHistoryView';
|
||||
|
||||
// The LLM(s) to include by default on a PromptNode whenever one is created.
|
||||
// Defaults to ChatGPT (GPT3.5).
|
||||
const initLLMs = [AvailableLLMs[0]];
|
||||
|
||||
// Helper funcs
|
||||
// Ensure that a name is 'unique'; if not, return an amended version with a count tacked on (e.g. "GPT-4 (2)")
|
||||
const ensureUniqueName = (_name, _prev_names) => {
|
||||
// Strip whitespace around names
|
||||
const prev_names = _prev_names.map(n => n.trim());
|
||||
const name = _name.trim();
|
||||
|
||||
// Check if name is unique
|
||||
if (!prev_names.includes(name))
|
||||
return name;
|
||||
|
||||
// Name isn't unique; find a unique one:
|
||||
let i = 2;
|
||||
let new_name = `${name} (${i})`;
|
||||
while (prev_names.includes(new_name)) {
|
||||
i += 1;
|
||||
new_name = `${name} (${i})`;
|
||||
}
|
||||
return new_name;
|
||||
};
|
||||
const getUniqueLLMMetavarKey = (responses) => {
|
||||
const metakeys = new Set(responses.map(resp_obj => Object.keys(resp_obj.metavars)).flat());
|
||||
let i = 0;
|
||||
@ -44,7 +19,16 @@ const getUniqueLLMMetavarKey = (responses) => {
|
||||
i += 1;
|
||||
return `LLM_${i}`;
|
||||
};
|
||||
|
||||
const bucketChatHistoryInfosByLLM = (chat_hist_infos) => {
|
||||
let chats_by_llm = {};
|
||||
chat_hist_infos.forEach(chat_hist_info => {
|
||||
if (chat_hist_info.llm in chats_by_llm)
|
||||
chats_by_llm[chat_hist_info.llm].push(chat_hist_info);
|
||||
else
|
||||
chats_by_llm[chat_hist_info.llm] = [ chat_hist_info ];
|
||||
});
|
||||
return chats_by_llm;
|
||||
}
|
||||
|
||||
class PromptInfo {
|
||||
prompt; // string
|
||||
@ -87,7 +71,9 @@ const PromptListPopover = ({ promptInfos, onHover, onClick }) => {
|
||||
};
|
||||
|
||||
|
||||
const PromptNode = ({ data, id }) => {
|
||||
const PromptNode = ({ data, id, type: node_type }) => {
|
||||
const node_icon = useMemo(() => (node_type === 'chat' ? '🗣' : '💬'), [node_type]);
|
||||
const node_default_title = useMemo(() => (node_type === 'chat' ? 'Chat Turn' : 'Prompt Node'), [node_type]);
|
||||
|
||||
// Get state from the Zustand store:
|
||||
const edges = useStore((state) => state.edges);
|
||||
@ -107,47 +93,24 @@ const PromptNode = ({ data, id }) => {
|
||||
const [numGenerations, setNumGenerations] = useState(data.n || 1);
|
||||
const [numGenerationsLastRun, setNumGenerationsLastRun] = useState(data.n || 1);
|
||||
|
||||
// The LLM items container
|
||||
const llmListContainer = useRef(null);
|
||||
const [llmItemsCurrState, setLLMItemsCurrState] = useState([]);
|
||||
|
||||
// For displaying error messages to user
|
||||
const alertModal = useRef(null);
|
||||
|
||||
// For a way to inspect responses without having to attach a dedicated node
|
||||
const inspectModal = useRef(null);
|
||||
|
||||
// Chat node specific
|
||||
const [contChatWithPriorLLMs, setContChatWithPriorLLMs] = useState(data.contChat !== undefined ? data.contChat : true);
|
||||
const [contChatToggleDisabled, setContChatToggleDisabled] = useState(false);
|
||||
|
||||
// For an info pop-up that shows all the prompts that will be sent off
|
||||
// NOTE: This is the 'full' version of the PromptListPopover that activates on hover.
|
||||
const [infoModalOpened, { open: openInfoModal, close: closeInfoModal }] = useDisclosure(false);
|
||||
|
||||
// Selecting LLM models to prompt
|
||||
const [llmItems, setLLMItems] = useState(data.llms || initLLMs.map((i) => ({key: uuid(), settings: getDefaultModelSettings(i.base_model), ...i})));
|
||||
const [llmItemsCurrState, setLLMItemsCurrState] = useState([]);
|
||||
const resetLLMItemsProgress = useCallback(() => {
|
||||
setLLMItems(llmItemsCurrState.map(item => {
|
||||
item.progress = undefined;
|
||||
return item;
|
||||
}));
|
||||
}, [llmItemsCurrState]);
|
||||
const ensureLLMItemsErrorProgress = useCallback((llm_keys_w_errors) => {
|
||||
setLLMItems(llmItemsCurrState.map(item => {
|
||||
if (llm_keys_w_errors.includes(item.key)) {
|
||||
if (!item.progress)
|
||||
item.progress = { success: 0, error: 100 };
|
||||
else {
|
||||
const succ_perc = item.progress.success;
|
||||
item.progress = { success: succ_perc, error: 100 - succ_perc };
|
||||
}
|
||||
} else {
|
||||
if (item.progress && item.progress.success === 0)
|
||||
item.progress = undefined;
|
||||
}
|
||||
|
||||
return item;
|
||||
}));
|
||||
}, [llmItemsCurrState]);
|
||||
|
||||
const getLLMListItemForKey = useCallback((key) => {
|
||||
return llmItemsCurrState.find((item) => item.key === key);
|
||||
}, [llmItemsCurrState]);
|
||||
|
||||
// Progress when querying responses
|
||||
const [progress, setProgress] = useState(undefined);
|
||||
const [progressAnimated, setProgressAnimated] = useState(true);
|
||||
@ -155,9 +118,9 @@ const PromptNode = ({ data, id }) => {
|
||||
|
||||
const triggerAlert = useCallback((msg) => {
|
||||
setProgress(undefined);
|
||||
resetLLMItemsProgress();
|
||||
llmListContainer?.current?.resetLLMItemsProgress();
|
||||
alertModal.current.trigger(msg);
|
||||
}, [resetLLMItemsProgress, alertModal]);
|
||||
}, [llmListContainer, alertModal]);
|
||||
|
||||
const showResponseInspector = useCallback(() => {
|
||||
if (inspectModal && inspectModal.current && jsonResponses)
|
||||
@ -170,45 +133,26 @@ const PromptNode = ({ data, id }) => {
|
||||
setStatus('warning');
|
||||
}, [promptTextOnLastRun, status])
|
||||
|
||||
const addModel = useCallback((model) => {
|
||||
// Get the item for that model
|
||||
let item = AvailableLLMs.find(llm => llm.base_model === model);
|
||||
const addModel = useCallback((new_model, all_items) => {
|
||||
setLLMItemsCurrState(all_items);
|
||||
setDataPropsForNode(id, { llms: all_items });
|
||||
signalDirty();
|
||||
}, [signalDirty]);
|
||||
|
||||
if (!item) { // This should never trigger, but in case it does:
|
||||
triggerAlert(`Could not find model named '${model}' in list of available LLMs.`);
|
||||
return;
|
||||
}
|
||||
|
||||
// Give it a uid as a unique key (this is needed for the draggable list to support multiple same-model items; keys must be unique)
|
||||
item = {key: uuid(), ...item};
|
||||
|
||||
// Generate the default settings for this model
|
||||
item.settings = getDefaultModelSettings(model);
|
||||
|
||||
// Repair names to ensure they are unique
|
||||
const unique_name = ensureUniqueName(item.name, llmItemsCurrState.map(i => i.name));
|
||||
item.name = unique_name;
|
||||
item.formData = { 'shortname': unique_name };
|
||||
|
||||
// Add model to LLM list (regardless of it's present already or not).
|
||||
setLLMItems(llmItemsCurrState.concat([item]))
|
||||
signalDirty();
|
||||
}, [llmItemsCurrState, signalDirty]);
|
||||
|
||||
const onLLMListItemsChange = useCallback((new_items) => {
|
||||
const onLLMListItemsChange = useCallback((new_items, old_items) => {
|
||||
setLLMItemsCurrState(new_items);
|
||||
setDataPropsForNode(id, { llms: new_items });
|
||||
|
||||
// If there's been any change to the item list, signal dirty:
|
||||
if (new_items.length !== llmItemsCurrState.length || !new_items.every(i => llmItemsCurrState.some(s => s.key === i.key))) {
|
||||
if (new_items.length !== old_items.length || !new_items.every(i => old_items.some(s => s.key === i.key))) {
|
||||
signalDirty();
|
||||
} else if (!new_items.every(itemA => {
|
||||
const itemB = llmItemsCurrState.find(b => b.key === itemA.key);
|
||||
const itemB = old_items.find(b => b.key === itemA.key);
|
||||
return JSON.stringify(itemA.settings) === JSON.stringify(itemB.settings);
|
||||
})) {
|
||||
signalDirty();
|
||||
}
|
||||
}, [setLLMItemsCurrState, signalDirty]);
|
||||
}, [setDataPropsForNode, signalDirty]);
|
||||
|
||||
const refreshTemplateHooks = (text) => {
|
||||
// Update template var fields + handles
|
||||
@ -258,7 +202,7 @@ const PromptNode = ({ data, id }) => {
|
||||
|
||||
// Pull all inputs needed to request responses.
|
||||
// Returns [prompt, vars dict]
|
||||
const pullInputData = () => {
|
||||
const pullInputData = (_targetHandles) => {
|
||||
// Pull data from each source recursively:
|
||||
const pulled_data = {};
|
||||
const store_data = (_texts, _varname, _data) => {
|
||||
@ -293,18 +237,61 @@ const PromptNode = ({ data, id }) => {
|
||||
});
|
||||
});
|
||||
};
|
||||
get_outputs(templateVars, id);
|
||||
get_outputs(_targetHandles, id);
|
||||
|
||||
return [promptText, pulled_data];
|
||||
return pulled_data;
|
||||
};
|
||||
|
||||
// Chat nodes only. Pulls input data attached to the 'past conversations' handle.
|
||||
// Returns a tuple (past_chat_llms, __past_chats), where both are undefined if nothing is connected.
|
||||
const pullInputChats = () => {
|
||||
const pulled_data = pullInputData(['__past_chats']);
|
||||
if (!('__past_chats' in pulled_data)) return [undefined, undefined];
|
||||
|
||||
// For storing the unique LLMs in past_chats:
|
||||
let llm_names = new Set();
|
||||
let past_chat_llms = [];
|
||||
|
||||
// We need to calculate the conversation history from the pulled responses.
|
||||
// Note that TemplateVarInfo might have a 'chat_history' component, but this does not
|
||||
// include the most recent prompt and response --for that, we need to use the 'prompt' and 'text' items.
|
||||
// We need to create a revised chat history that concatenates the past history with the last AI + human turns:
|
||||
const past_chats = pulled_data['__past_chats'].map(info => {
|
||||
// Add to unique LLMs list, if necessary
|
||||
const llm_name = info?.llm?.name;
|
||||
if (llm_name !== undefined && !llm_names.has(llm_name)) {
|
||||
llm_names.add(llm_name);
|
||||
past_chat_llms.push(info.llm);
|
||||
}
|
||||
|
||||
// Create revised chat_history on the TemplateVarInfo object,
|
||||
// with the prompt and text of the pulled data as the 2nd-to-last, and last, messages:
|
||||
let last_messages = [
|
||||
{ role: 'user', content: info.prompt },
|
||||
{ role: 'assistant', content: info.text }
|
||||
];
|
||||
let updated_chat_hist = info.chat_history !== undefined ? info.chat_history.concat(last_messages) : last_messages;
|
||||
|
||||
// Append any present system message retroactively as the first message in the chat history:
|
||||
if (info?.llm?.settings?.system_msg !== undefined && updated_chat_hist[0].role !== 'system')
|
||||
updated_chat_hist = [{ role: 'system', content: info.llm.settings.system_msg }].concat(updated_chat_hist);
|
||||
|
||||
// ChatHistoryInfo format (see typing.ts)
|
||||
return {messages: updated_chat_hist, fill_history: info.fill_history, metavars: info.metavars, llm: llm_name};
|
||||
});
|
||||
|
||||
// Returns [list of LLM specs, list of ChatHistoryInfo]
|
||||
return [past_chat_llms, past_chats];
|
||||
};
|
||||
|
||||
// Ask the backend how many responses it needs to collect, given the input data:
|
||||
const fetchResponseCounts = (prompt, vars, llms, rejected) => {
|
||||
const fetchResponseCounts = (prompt, vars, llms, chat_histories, rejected) => {
|
||||
return fetch_from_backend('countQueriesRequired', {
|
||||
prompt: prompt,
|
||||
vars: vars,
|
||||
llms: llms,
|
||||
id: id,
|
||||
chat_histories: chat_histories,
|
||||
n: numGenerations,
|
||||
}, rejected).then(function(json) {
|
||||
if (!json || !json.counts) {
|
||||
@ -318,36 +305,52 @@ const PromptNode = ({ data, id }) => {
|
||||
const [promptPreviews, setPromptPreviews] = useState([]);
|
||||
const handlePreviewHover = () => {
|
||||
// Pull input data and prompt
|
||||
const [root_prompt, pulled_vars] = pullInputData();
|
||||
const pulled_vars = pullInputData(templateVars);
|
||||
fetch_from_backend('generatePrompts', {
|
||||
prompt: root_prompt,
|
||||
prompt: promptText,
|
||||
vars: pulled_vars,
|
||||
}).then(prompts => {
|
||||
setPromptPreviews(prompts.map(p => (new PromptInfo(p))));
|
||||
});
|
||||
|
||||
pullInputChats();
|
||||
};
|
||||
|
||||
// On hover over the 'Run' button, request how many responses are required and update the tooltip. Soft fails.
|
||||
const handleRunHover = () => {
|
||||
// Check if there's at least one model in the list; if not, nothing to run on.
|
||||
if (!llmItemsCurrState || llmItemsCurrState.length == 0) {
|
||||
setRunTooltip('No LLMs to query.');
|
||||
return;
|
||||
}
|
||||
|
||||
// Check if the PromptNode is not already waiting for a response...
|
||||
if (status === 'loading') {
|
||||
setRunTooltip('Fetching responses...');
|
||||
return;
|
||||
}
|
||||
|
||||
// Get input data and prompt
|
||||
const [root_prompt, pulled_vars] = pullInputData();
|
||||
const llms = llmItemsCurrState.map(item => item.model);
|
||||
let _llmItemsCurrState = llmItemsCurrState;
|
||||
|
||||
// If this is a chat node, we also need to pull chat histories:
|
||||
let [past_chat_llms, pulled_chats] = node_type === 'chat' ? pullInputChats() : [undefined, undefined];
|
||||
if (node_type === 'chat' && contChatWithPriorLLMs) {
|
||||
if (past_chat_llms === undefined) {
|
||||
setRunTooltip('Attach an input to past conversations first.');
|
||||
return;
|
||||
}
|
||||
_llmItemsCurrState = past_chat_llms;
|
||||
pulled_chats = bucketChatHistoryInfosByLLM(pulled_chats);
|
||||
}
|
||||
|
||||
// Check if there's at least one model in the list; if not, nothing to run on.
|
||||
if (!_llmItemsCurrState || _llmItemsCurrState.length == 0) {
|
||||
setRunTooltip('No LLMs to query.');
|
||||
return;
|
||||
}
|
||||
|
||||
// Pull the input data
|
||||
const pulled_vars = pullInputData(templateVars);
|
||||
|
||||
const llms = _llmItemsCurrState.map(item => item.model);
|
||||
const num_llms = llms.length;
|
||||
|
||||
// Fetch response counts from backend
|
||||
fetchResponseCounts(root_prompt, pulled_vars, llmItemsCurrState, (err) => {
|
||||
fetchResponseCounts(promptText, pulled_vars, _llmItemsCurrState, pulled_chats, (err) => {
|
||||
console.warn(err.message); // soft fail
|
||||
}).then(([counts, total_num_responses]) => {
|
||||
// Check for empty counts (means no requests will be sent!)
|
||||
@ -383,10 +386,12 @@ const PromptNode = ({ data, id }) => {
|
||||
}
|
||||
} else {
|
||||
const llm_key = Object.keys(queries_per_llm)[0];
|
||||
const llm_name = getLLMListItemForKey(llm_key)?.name;
|
||||
const llm_name = llmListContainer?.current?.getLLMListItemForKey(llm_key)?.name;
|
||||
const llm_count = queries_per_llm[llm_key];
|
||||
const req = llm_count > 1 ? 'queries' : 'query';
|
||||
if (num_llms > num_llms_missing)
|
||||
if (llm_name === undefined)
|
||||
setRunTooltip(`Will send ${llm_count} ${req} per LLM`);
|
||||
else if (num_llms > num_llms_missing)
|
||||
setRunTooltip(`Will send ${llm_count} ${req} to ${llm_name} and load others`);
|
||||
else
|
||||
setRunTooltip(`Will send ${llm_count} ${req} to ${llm_name}`);
|
||||
@ -404,48 +409,64 @@ const PromptNode = ({ data, id }) => {
|
||||
});
|
||||
|
||||
if (!is_fully_connected) {
|
||||
console.log('Not connected! :(', templateVars, edges);
|
||||
triggerAlert('Missing inputs to one or more template variables.');
|
||||
return;
|
||||
}
|
||||
|
||||
console.log('Connected!');
|
||||
// If this is a chat node, we need to pull chat histories:
|
||||
let [past_chat_llms, pulled_chats] = node_type === 'chat' ? pullInputChats() : [undefined, undefined];
|
||||
|
||||
// If this is a chat node and 'continuing chat with prior LLMs' is checked,
|
||||
// there's no customizable model list (llmItemsCurrState). Instead, we need to get the unique
|
||||
// LLMs present by finding the set of 'llm' key with unique 'name' properties
|
||||
// in the input variables (if any). If there's keys present w/o LLMs (for instance a text node),
|
||||
// we need to pop-up an error message.
|
||||
let _llmItemsCurrState = llmItemsCurrState;
|
||||
if (node_type === 'chat' && contChatWithPriorLLMs) {
|
||||
// If there's nothing attached to past conversations, we can't continue the chat:
|
||||
if (past_chat_llms === undefined) {
|
||||
triggerAlert('You need to attach an input to the Past Conversation message first. For instance, you might query \
|
||||
multiple chat LLMs with a prompt node, and then attach the Prompt Node output to the \
|
||||
Past Conversation input of this Chat Turn node in order to continue the chat.');
|
||||
return;
|
||||
}
|
||||
|
||||
// Override LLM list with the past llm info (unique LLMs in prior responses)
|
||||
_llmItemsCurrState = past_chat_llms;
|
||||
|
||||
// Now we need transform the 'pulled_chats' to be a dict indexed by LLM nicknames:
|
||||
pulled_chats = bucketChatHistoryInfosByLLM(pulled_chats);
|
||||
}
|
||||
|
||||
// Check that there is at least one LLM selected:
|
||||
if (llmItemsCurrState.length === 0) {
|
||||
if (_llmItemsCurrState.length === 0) {
|
||||
alert('Please select at least one LLM to prompt.')
|
||||
return;
|
||||
}
|
||||
|
||||
// Set status indicator
|
||||
setStatus('loading');
|
||||
setContChatToggleDisabled(true);
|
||||
setJSONResponses([]);
|
||||
setProgressAnimated(true);
|
||||
|
||||
const [prompt_template, pulled_data] = pullInputData();
|
||||
// Pull the data to fill in template input variables, if any
|
||||
const pulled_data = pullInputData(templateVars);
|
||||
const prompt_template = promptText;
|
||||
|
||||
let FINISHED_QUERY = false;
|
||||
const rejected = (err) => {
|
||||
setStatus('error');
|
||||
setContChatToggleDisabled(false);
|
||||
triggerAlert(err.message);
|
||||
FINISHED_QUERY = true;
|
||||
};
|
||||
|
||||
// Ask the backend to reset the scratchpad for counting queries:
|
||||
const create_progress_scratchpad = () => {
|
||||
return fetch_from_backend('createProgressFile', {id: id}, rejected);
|
||||
};
|
||||
|
||||
// Fetch info about the number of queries we'll need to make
|
||||
const fetch_resp_count = () => fetchResponseCounts(
|
||||
prompt_template, pulled_data, llmItemsCurrState, rejected);
|
||||
prompt_template, pulled_data, _llmItemsCurrState, pulled_chats, rejected);
|
||||
|
||||
// Initialize progress bars to small amounts
|
||||
setProgress({ success: 2, error: 0 });
|
||||
setLLMItems(llmItemsCurrState.map(item => {
|
||||
item.progress = { success: 0, error: 0 };
|
||||
return item;
|
||||
}));
|
||||
llmListContainer?.current?.setZeroPercProgress();
|
||||
|
||||
// Create a callback to listen for progress
|
||||
let onProgressChange = () => {};
|
||||
@ -456,9 +477,9 @@ const PromptNode = ({ data, id }) => {
|
||||
if (!progress_by_llm_key) return;
|
||||
|
||||
// Update individual progress bars
|
||||
const num_llms = llmItemsCurrState.length;
|
||||
const num_llms = _llmItemsCurrState.length;
|
||||
const num_resp_per_llm = (max_responses / num_llms);
|
||||
setLLMItems(llmItemsCurrState.map(item => {
|
||||
llmListContainer?.current?.updateProgress(item => {
|
||||
if (item.key in progress_by_llm_key) {
|
||||
item.progress = {
|
||||
success: progress_by_llm_key[item.key]['success'] / num_resp_per_llm * 100,
|
||||
@ -466,7 +487,7 @@ const PromptNode = ({ data, id }) => {
|
||||
}
|
||||
}
|
||||
return item;
|
||||
}));
|
||||
});
|
||||
|
||||
// Update total progress bar
|
||||
const total_num_success = Object.keys(progress_by_llm_key).reduce((acc, llm_key) => {
|
||||
@ -475,6 +496,7 @@ const PromptNode = ({ data, id }) => {
|
||||
const total_num_error = Object.keys(progress_by_llm_key).reduce((acc, llm_key) => {
|
||||
return acc + progress_by_llm_key[llm_key]['error'];
|
||||
}, 0);
|
||||
|
||||
setProgress({
|
||||
success: Math.max(5, total_num_success / max_responses * 100),
|
||||
error: total_num_error / max_responses * 100 }
|
||||
@ -486,20 +508,19 @@ const PromptNode = ({ data, id }) => {
|
||||
const query_llms = () => {
|
||||
return fetch_from_backend('queryllm', {
|
||||
id: id,
|
||||
llm: llmItemsCurrState, // deep clone it first
|
||||
llm: _llmItemsCurrState, // deep clone it first
|
||||
prompt: prompt_template,
|
||||
vars: pulled_data,
|
||||
chat_histories: pulled_chats,
|
||||
n: numGenerations,
|
||||
api_keys: (apiKeys ? apiKeys : {}),
|
||||
no_cache: false,
|
||||
progress_listener: onProgressChange,
|
||||
}, rejected).then(function(json) {
|
||||
if (!json) {
|
||||
setStatus('error');
|
||||
triggerAlert('Request was sent and received by backend server, but there was no response.');
|
||||
rejected('Request was sent and received by backend server, but there was no response.');
|
||||
}
|
||||
else if (json.responses && json.errors) {
|
||||
FINISHED_QUERY = true;
|
||||
|
||||
// Store and log responses (if any)
|
||||
if (json.responses) {
|
||||
@ -517,16 +538,17 @@ const PromptNode = ({ data, id }) => {
|
||||
|
||||
// Ensure there's a sliver of error displayed in the progress bar
|
||||
// of every LLM item that has an error:
|
||||
ensureLLMItemsErrorProgress(llms_w_errors);
|
||||
llmListContainer?.current?.ensureLLMItemsErrorProgress(llms_w_errors);
|
||||
|
||||
// Set error status
|
||||
setStatus('error');
|
||||
setContChatToggleDisabled(false);
|
||||
|
||||
// Trigger alert and display one error message per LLM of all collected errors:
|
||||
let combined_err_msg = "";
|
||||
llms_w_errors.forEach(llm_key => {
|
||||
const item = getLLMListItemForKey(llm_key);
|
||||
combined_err_msg += item.name + ': ' + JSON.stringify(json.errors[llm_key][0]) + '\n';
|
||||
const item = _llmItemsCurrState.find((item) => item.key === llm_key);
|
||||
combined_err_msg += item?.name + ': ' + JSON.stringify(json.errors[llm_key][0]) + '\n';
|
||||
});
|
||||
// We trigger the alert directly (don't use triggerAlert) here because we want to keep the progress bar:
|
||||
alertModal.current.trigger('Errors collecting responses. Re-run prompt node to retry.\n\n'+combined_err_msg);
|
||||
@ -536,29 +558,37 @@ const PromptNode = ({ data, id }) => {
|
||||
|
||||
// All responses collected! Change status to 'ready':
|
||||
setStatus('ready');
|
||||
setContChatToggleDisabled(false);
|
||||
|
||||
// Remove progress bars
|
||||
setProgress(undefined);
|
||||
setProgressAnimated(true);
|
||||
resetLLMItemsProgress();
|
||||
llmListContainer?.current?.resetLLMItemsProgress();
|
||||
|
||||
// Save prompt text so we remember what prompt we have responses cache'd for:
|
||||
setPromptTextOnLastRun(promptText);
|
||||
setNumGenerationsLastRun(numGenerations);
|
||||
|
||||
// Save response texts as 'fields' of data, for any prompt nodes pulling the outputs
|
||||
// First we need to get a unique key for a unique metavar for the LLM set that produced these responses,
|
||||
// We also need to store a unique metavar for the LLM *set* (set of LLM nicknames) that produced these responses,
|
||||
// so we can keep track of 'upstream' LLMs (and plot against them) later on:
|
||||
const llm_metavar_key = getUniqueLLMMetavarKey(json.responses);
|
||||
setDataPropsForNode(id, {fields: json.responses.map(
|
||||
resp_obj => resp_obj['responses'].map(
|
||||
r => {
|
||||
// Carry over the response text and prompt fill history (vars):
|
||||
let o = {text: escapeBraces(r), fill_history: resp_obj['vars']};
|
||||
// Carry over the response text, prompt, prompt fill history (vars), and llm nickname:
|
||||
let o = { text: escapeBraces(r),
|
||||
prompt: resp_obj['prompt'],
|
||||
fill_history: resp_obj['vars'],
|
||||
llm: _llmItemsCurrState.find((item) => item.name === resp_obj.llm) };
|
||||
|
||||
// Carry over any metavars
|
||||
o.metavars = resp_obj['metavars'] || {};
|
||||
|
||||
// Carry over any chat history
|
||||
if (resp_obj['chat_history'])
|
||||
o.chat_history = resp_obj['chat_history'];
|
||||
|
||||
// Add a meta var to keep track of which LLM produced this response
|
||||
o.metavars[llm_metavar_key] = resp_obj['llm'];
|
||||
return o;
|
||||
@ -569,15 +599,13 @@ const PromptNode = ({ data, id }) => {
|
||||
// Ping any inspect nodes attached to this node to refresh their contents:
|
||||
pingOutputNodes(id);
|
||||
} else {
|
||||
setStatus('error');
|
||||
triggerAlert(json.error || 'Unknown error when querying LLM');
|
||||
rejected(json.error || 'Unknown error when querying LLM');
|
||||
}
|
||||
}, rejected);
|
||||
};
|
||||
|
||||
// Now put it all together!
|
||||
create_progress_scratchpad()
|
||||
.then(fetch_resp_count)
|
||||
fetch_resp_count()
|
||||
.then(open_progress_listener)
|
||||
.then(query_llms)
|
||||
.catch(rejected);
|
||||
@ -608,9 +636,10 @@ const PromptNode = ({ data, id }) => {
|
||||
// NOTE: This won't work on older browsers, but there's no alternative solution.
|
||||
if (!textAreaRef.current && elem && window.ResizeObserver) {
|
||||
let past_hooks_y = 138;
|
||||
const incr = 68 + (node_type === 'chat' ? -6 : 0);
|
||||
const observer = new ResizeObserver(() => {
|
||||
if (!textAreaRef || !textAreaRef.current) return;
|
||||
const new_hooks_y = textAreaRef.current.clientHeight + 68;
|
||||
const new_hooks_y = textAreaRef.current.clientHeight + incr;
|
||||
if (past_hooks_y !== new_hooks_y) {
|
||||
setHooksY(new_hooks_y);
|
||||
past_hooks_y = new_hooks_y;
|
||||
@ -624,31 +653,52 @@ const PromptNode = ({ data, id }) => {
|
||||
|
||||
return (
|
||||
<div className="prompt-node cfnode">
|
||||
<NodeLabel title={data.title || 'Prompt Node'}
|
||||
<NodeLabel title={data.title || node_default_title}
|
||||
nodeId={id}
|
||||
onEdit={hideStatusIndicator}
|
||||
icon={'💬'}
|
||||
icon={node_icon}
|
||||
status={status}
|
||||
alertModal={alertModal}
|
||||
handleRunClick={handleRunClick}
|
||||
handleRunHover={handleRunHover}
|
||||
runButtonTooltip={runTooltip}
|
||||
customButtons={[
|
||||
<PromptListPopover promptInfos={promptPreviews} onHover={handlePreviewHover} onClick={openInfoModal} />
|
||||
<PromptListPopover key='prompt-previews' promptInfos={promptPreviews} onHover={handlePreviewHover} onClick={openInfoModal} />
|
||||
]} />
|
||||
<LLMResponseInspectorModal ref={inspectModal} jsonResponses={jsonResponses} prompt={promptText} />
|
||||
<Modal title={'List of prompts that will be sent to LLMs (' + promptPreviews.length + ' total)'} size='xl' opened={infoModalOpened} onClose={closeInfoModal} styles={{header: {backgroundColor: '#FFD700'}, root: {position: 'relative', left: '-80px'}}}>
|
||||
<Modal title={'List of prompts that will be sent to LLMs (' + promptPreviews.length + ' total)'} size='xl' opened={infoModalOpened} onClose={closeInfoModal} styles={{header: {backgroundColor: '#FFD700'}, root: {position: 'relative', left: '-5%'}}}>
|
||||
<Box size={600} m='lg' mt='xl'>
|
||||
{displayPromptInfos(promptPreviews)}
|
||||
</Box>
|
||||
</Modal>
|
||||
<Textarea ref={setRef}
|
||||
|
||||
{ node_type === 'chat' ? (<div ref={setRef}>
|
||||
<ChatHistoryView bgColors={['#ccc', '#ceeaf5b1']} messages={[
|
||||
"(Past conversation)",
|
||||
<Textarea
|
||||
className="prompt-field-fixed nodrag nowheel"
|
||||
minRows="4"
|
||||
defaultValue={data.prompt}
|
||||
onChange={handleInputChange}
|
||||
miw={230}
|
||||
styles={{input: {background: 'transparent', borderWidth: '0px'}}} />
|
||||
]} />
|
||||
<Handle
|
||||
type="target"
|
||||
position="left"
|
||||
id="__past_chats"
|
||||
style={{ top: '82px', background: '#555' }}
|
||||
/>
|
||||
</div>) : (
|
||||
<Textarea ref={setRef}
|
||||
autosize
|
||||
className="prompt-field-fixed nodrag nowheel"
|
||||
minRows="4"
|
||||
maxRows="12"
|
||||
defaultValue={data.prompt}
|
||||
onChange={handleInputChange} />
|
||||
onChange={handleInputChange} />)
|
||||
}
|
||||
|
||||
<Handle
|
||||
type="source"
|
||||
position="right"
|
||||
@ -656,36 +706,41 @@ const PromptNode = ({ data, id }) => {
|
||||
className="grouped-handle"
|
||||
style={{ top: '50%' }}
|
||||
/>
|
||||
<TemplateHooks vars={templateVars} nodeId={id} startY={hooksY} />
|
||||
<hr />
|
||||
<div>
|
||||
<TemplateHooks vars={templateVars} nodeId={id} startY={hooksY} ignoreHandles={['__past_chats']} />
|
||||
<hr />
|
||||
<div>
|
||||
<div style={{marginBottom: '10px', padding: '4px'}}>
|
||||
<label htmlFor="num-generations" style={{fontSize: '10pt'}}>Num responses per prompt: </label>
|
||||
<input id="num-generations" name="num-generations" type="number" min={1} max={50} defaultValue={data.n || 1} onChange={handleNumGenChange} className="nodrag"></input>
|
||||
</div>
|
||||
<div id="llms-list" className="nowheel" style={{backgroundColor: '#eee', borderRadius: '4px', padding: '8px', overflowY: 'auto', maxHeight: '175px'}}>
|
||||
<div style={{marginTop: '6px', marginBottom: '6px', marginLeft: '6px', paddingBottom: '4px', textAlign: 'left', fontSize: '10pt', color: '#777'}}>
|
||||
Models to query:
|
||||
<div className="add-llm-model-btn nodrag">
|
||||
<Menu transitionProps={{ transition: 'pop-top-left' }}
|
||||
position="bottom-start"
|
||||
width={220}
|
||||
withinPortal={true}
|
||||
>
|
||||
<Menu.Target>
|
||||
<button>Add +</button>
|
||||
</Menu.Target>
|
||||
<Menu.Dropdown>
|
||||
{AvailableLLMs.map(item => (<Menu.Item key={item.model} onClick={() => addModel(item.base_model)} icon={item.emoji}>{item.name}</Menu.Item>))}
|
||||
</Menu.Dropdown>
|
||||
</Menu>
|
||||
</div>
|
||||
|
||||
{node_type === 'chat' ? (
|
||||
<div>
|
||||
<Switch
|
||||
label={contChatWithPriorLLMs ? "Continue chat with prior LLM(s)" : "Continue chat with new LLMs:"}
|
||||
defaultChecked={true}
|
||||
checked={contChatWithPriorLLMs}
|
||||
disabled={contChatToggleDisabled}
|
||||
onChange={(event) => {
|
||||
setStatus('warning');
|
||||
setContChatWithPriorLLMs(event.currentTarget.checked);
|
||||
setDataPropsForNode(id, { contChat: event.currentTarget.checked });
|
||||
}}
|
||||
color='cyan'
|
||||
size='xs'
|
||||
mb={contChatWithPriorLLMs ? '4px' : '10px'}
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div className="nodrag">
|
||||
<LLMList llms={llmItems} onItemsChange={onLLMListItemsChange} />
|
||||
</div>
|
||||
</div>
|
||||
) : <></>}
|
||||
|
||||
{node_type !== 'chat' || !contChatWithPriorLLMs ? (
|
||||
<LLMListContainer
|
||||
ref={llmListContainer}
|
||||
initLLMItems={data.llms}
|
||||
onAddModel={addModel}
|
||||
onItemsChange={onLLMListItemsChange} />
|
||||
) : <></>}
|
||||
|
||||
{progress !== undefined ?
|
||||
(<Progress animate={progressAnimated} sections={[
|
||||
{ value: progress.success, color: 'blue', tooltip: 'API call succeeded' },
|
||||
@ -698,9 +753,9 @@ const PromptNode = ({ data, id }) => {
|
||||
<Button color='blue' variant='subtle' w='100%' >Inspect responses <IconSearch size='12pt'/></Button>
|
||||
</div>) : <></>
|
||||
}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
);
|
||||
};
|
||||
|
||||
export default PromptNode;
|
@ -31,35 +31,8 @@ export const extractBracketedSubstrings = (text) => {
|
||||
return capture_groups;
|
||||
};
|
||||
|
||||
export const toPyTemplateFormat = (text) => {
|
||||
/** Given some text in template format:
|
||||
* This is a {test}
|
||||
* adds a $ before each valid {.
|
||||
*
|
||||
* NOTE: We don't use Regex here for compatibility of browsers
|
||||
* that don't support negative lookbehinds/aheads (e.g., Safari).
|
||||
*/
|
||||
let str = text.slice(0);
|
||||
let prev_c = '';
|
||||
let group_start_idx = -1;
|
||||
for (let i = 0; i < str.length; i += 1) {
|
||||
let c = str[i];
|
||||
if (prev_c !== '\\') { // Skipped escaped chars
|
||||
if (group_start_idx === -1 && c === '{') {
|
||||
// Insert a $ before the {:
|
||||
str = str.slice(0, i) + '$' + str.slice(i);
|
||||
group_start_idx = i + 1;
|
||||
i += 1;
|
||||
} else if (group_start_idx > -1 && c === '}') {
|
||||
group_start_idx = -1;
|
||||
}
|
||||
}
|
||||
prev_c = c;
|
||||
}
|
||||
return str;
|
||||
};
|
||||
|
||||
export default function TemplateHooks({ vars, nodeId, startY, position }) {
|
||||
export default function TemplateHooks({ vars, nodeId, startY, position, ignoreHandles }) {
|
||||
|
||||
const edges = useStore((state) => state.edges);
|
||||
const onEdgesChange = useStore((state) => state.onEdgesChange);
|
||||
@ -102,7 +75,8 @@ export default function TemplateHooks({ vars, nodeId, startY, position }) {
|
||||
if (templateHooks.length > 0) {
|
||||
let deleted_edges = [];
|
||||
edges.forEach(e => {
|
||||
if (e.target !== nodeId || vars.includes(e.targetHandle))
|
||||
if (e.target !== nodeId || vars.includes(e.targetHandle) ||
|
||||
(Array.isArray(ignoreHandles) && ignoreHandles.includes(e.targetHandle)))
|
||||
return;
|
||||
else {
|
||||
deleted_edges.push(e);
|
||||
@ -116,7 +90,7 @@ export default function TemplateHooks({ vars, nodeId, startY, position }) {
|
||||
setTemplateHooks(genTemplateHooks(vars, []));
|
||||
|
||||
// setDataPropsForNode(nodeId, {vars: vars});
|
||||
}, [vars, startY, genTemplateHooks, nodeId]);
|
||||
}, [vars, startY, genTemplateHooks, nodeId, ignoreHandles]);
|
||||
|
||||
// Because of the way React flow internally stores Handles,
|
||||
// we need to notify it to update its backend representation of the 'node'
|
||||
|
@ -46,7 +46,7 @@ test('call three LLMs with a single prompt', async () => {
|
||||
};
|
||||
|
||||
// Call all three LLMs with the same prompt, n=1, and listen to progress
|
||||
const {responses, errors} = await queryLLM('testid', llms, n, prompt, {}, undefined, undefined, progress_listener);
|
||||
const {responses, errors} = await queryLLM('testid', llms, n, prompt, {}, undefined, undefined, undefined, progress_listener);
|
||||
|
||||
// Check responses
|
||||
expect(responses).toHaveLength(3);
|
||||
|
@ -17,7 +17,8 @@ async function prompt_model(model: LLM): Promise<void> {
|
||||
// Double-check the cache'd results
|
||||
let cache = pipeline._load_cached_responses();
|
||||
Object.entries(cache).forEach(([prompt, response]) => {
|
||||
console.log(`Prompt: ${prompt}\nResponse: ${response.responses[0]}`);
|
||||
const r = Array.isArray(response) ? response[0] : response;
|
||||
console.log(`Prompt: ${prompt}\nResponse: ${r.responses[0]}`);
|
||||
});
|
||||
expect(Object.keys(cache)).toHaveLength(3); // expect 3 prompts
|
||||
|
||||
@ -34,7 +35,8 @@ async function prompt_model(model: LLM): Promise<void> {
|
||||
|
||||
// Double-check the cache'd results
|
||||
cache = pipeline._load_cached_responses();
|
||||
Object.entries(cache).forEach(([prompt, resp_obj]) => {
|
||||
Object.entries(cache).forEach(([prompt, response]) => {
|
||||
const resp_obj = Array.isArray(response) ? response[0] : response;
|
||||
console.log(`Prompt: ${prompt}\nResponses: ${JSON.stringify(resp_obj.responses)}`);
|
||||
expect(resp_obj.responses).toHaveLength(2);
|
||||
expect(resp_obj.raw_response).toHaveLength(2); // these should've been merged
|
||||
@ -53,7 +55,8 @@ async function prompt_model(model: LLM): Promise<void> {
|
||||
});
|
||||
|
||||
cache = pipeline._load_cached_responses();
|
||||
Object.entries(cache).forEach(([prompt, resp_obj]) => {
|
||||
Object.entries(cache).forEach(([prompt, response]) => {
|
||||
const resp_obj = Array.isArray(response) ? response[0] : response;
|
||||
expect(resp_obj.responses).toHaveLength(2);
|
||||
expect(resp_obj.raw_response).toHaveLength(2); // these should've been merged
|
||||
});
|
||||
|
@ -1,7 +1,6 @@
|
||||
import { mean as __mean, std as __std, median as __median } from "mathjs";
|
||||
import markdownIt from "markdown-it";
|
||||
|
||||
import { Dict, LLMResponseError, LLMResponseObject, StandardizedLLMResponse } from "./typing";
|
||||
import { Dict, StringDict, LLMResponseError, LLMResponseObject, StandardizedLLMResponse, ChatHistory, ChatHistoryInfo, isEqualChatHistory } from "./typing";
|
||||
import { LLM, getEnumName } from "./models";
|
||||
import { APP_IS_RUNNING_LOCALLY, set_api_keys, FLASK_BASE_URL, call_flask_backend } from "./utils";
|
||||
import StorageCache from "./cache";
|
||||
@ -117,7 +116,7 @@ export class ResponseInfo {
|
||||
}
|
||||
|
||||
function to_standard_format(r: LLMResponseObject | Dict): StandardizedLLMResponse {
|
||||
let resp_obj = {
|
||||
let resp_obj: StandardizedLLMResponse = {
|
||||
vars: r['info'],
|
||||
metavars: r['metavars'] || {},
|
||||
llm: r['llm'],
|
||||
@ -126,7 +125,9 @@ function to_standard_format(r: LLMResponseObject | Dict): StandardizedLLMRespons
|
||||
tokens: r.raw_response?.usage || {},
|
||||
};
|
||||
if ('eval_res' in r)
|
||||
resp_obj['eval_res'] = r['eval_res'];
|
||||
resp_obj.eval_res = r.eval_res;
|
||||
if ('chat_history' in r)
|
||||
resp_obj.chat_history = r.chat_history;
|
||||
return resp_obj;
|
||||
}
|
||||
|
||||
@ -140,6 +141,22 @@ function get_cache_keys_related_to_id(cache_id: string, include_basefile: boolea
|
||||
return include_basefile ? [base_file] : [];
|
||||
}
|
||||
|
||||
async function setAPIKeys(api_keys: StringDict): Promise<void> {
|
||||
if (APP_IS_RUNNING_LOCALLY()) {
|
||||
// Try to fetch API keys from os.environ variables in the locally running Flask backend:
|
||||
try {
|
||||
const api_keys = await fetchEnvironAPIKeys();
|
||||
set_api_keys(api_keys);
|
||||
} catch (err) {
|
||||
console.warn('Warning: Could not fetch API key environment variables from Flask server. Error:', err.message);
|
||||
// Soft fail
|
||||
}
|
||||
}
|
||||
|
||||
if (api_keys !== undefined)
|
||||
set_api_keys(api_keys);
|
||||
}
|
||||
|
||||
// def remove_cached_responses(cache_id: str):
|
||||
// cache_files = get_cache_keys_related_to_id(cache_id)
|
||||
// for filename in cache_files:
|
||||
@ -208,7 +225,10 @@ function extract_llm_params(llm_spec: Dict | string): Dict {
|
||||
function isLooselyEqual(value1: any, value2: any): boolean {
|
||||
// If both values are non-array types, compare them directly
|
||||
if (!Array.isArray(value1) && !Array.isArray(value2)) {
|
||||
return value1 === value2;
|
||||
if (typeof value1 === 'object' && typeof value2 === 'object')
|
||||
return JSON.stringify(value1) === JSON.stringify(value2);
|
||||
else
|
||||
return value1 === value2;
|
||||
}
|
||||
|
||||
// If either value is not an array or their lengths differ, they are not equal
|
||||
@ -237,10 +257,11 @@ function matching_settings(cache_llm_spec: Dict | string, llm_spec: Dict | strin
|
||||
if (typeof llm_spec === 'object' && typeof cache_llm_spec === 'object') {
|
||||
const llm_params = extract_llm_params(llm_spec);
|
||||
const cache_llm_params = extract_llm_params(cache_llm_spec);
|
||||
for (const [param, val] of Object.entries(llm_params))
|
||||
if (param in cache_llm_params && !isLooselyEqual(cache_llm_params[param], val)) {
|
||||
return false;
|
||||
}
|
||||
for (const [param, val] of Object.entries(llm_params)) {
|
||||
if (param in cache_llm_params && !isLooselyEqual(cache_llm_params[param], val)) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
}
|
||||
return true;
|
||||
}
|
||||
@ -307,7 +328,7 @@ function check_typeof_vals(arr: Array<any>): MetricType {
|
||||
}
|
||||
|
||||
function run_over_responses(eval_func: (resp: ResponseInfo) => any, responses: Array<StandardizedLLMResponse>): Array<StandardizedLLMResponse> {
|
||||
const evald_responses = responses.map((_resp_obj: StandardizedLLMResponse) => {
|
||||
return responses.map((_resp_obj: StandardizedLLMResponse) => {
|
||||
// Deep clone the response object
|
||||
const resp_obj = JSON.parse(JSON.stringify(_resp_obj));
|
||||
|
||||
@ -328,16 +349,12 @@ function run_over_responses(eval_func: (resp: ResponseInfo) => any, responses: A
|
||||
if (eval_res_type === MetricType.Numeric) {
|
||||
// Store items with summary of mean, median, etc
|
||||
resp_obj.eval_res = {
|
||||
mean: __mean(evals),
|
||||
median: __median(evals),
|
||||
stdev: (evals.length > 1 ? __std(evals) : 0),
|
||||
range: [Math.min(...evals), Math.max(...evals)],
|
||||
items: evals,
|
||||
dtype: getEnumName(MetricType, eval_res_type),
|
||||
};
|
||||
} else if ([MetricType.Unknown, MetricType.Empty].includes(eval_res_type))
|
||||
} else if ([MetricType.Unknown, MetricType.Empty].includes(eval_res_type)) {
|
||||
throw new Error('Unsupported types found in evaluation results. Only supported types for metrics are: int, float, bool, str.');
|
||||
else {
|
||||
} else {
|
||||
// Categorical, KeyValue, etc, we just store the items:
|
||||
resp_obj.eval_res = {
|
||||
items: evals,
|
||||
@ -347,8 +364,6 @@ function run_over_responses(eval_func: (resp: ResponseInfo) => any, responses: A
|
||||
|
||||
return resp_obj;
|
||||
});
|
||||
|
||||
return evald_responses;
|
||||
}
|
||||
|
||||
// """ ===================
|
||||
@ -376,11 +391,19 @@ export async function generatePrompts(root_prompt: string, vars: Dict): Promise<
|
||||
* For each var value, can be single values or a list; in the latter, all permutations are passed. (Pass empty dict if no vars.)
|
||||
* @param llms the list of LLMs you will query
|
||||
* @param n how many responses expected per prompt
|
||||
* @param chat_histories (optional) Either an array of `ChatHistory` (to use across all LLMs), or a dict indexed by LLM nicknames of `ChatHistory` arrays to use per LLM.
|
||||
* @param id (optional) a unique ID of the node with cache'd responses. If missing, assumes no cache will be used.
|
||||
* @returns If success, a dict with { counts: <dict of missing queries per LLM>, total_num_responses: <dict of total num responses per LLM> }
|
||||
* If there was an error, returns a dict with a single key, 'error'.
|
||||
*/
|
||||
export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict | string>, n: number, id?: string): Promise<Dict> {
|
||||
export async function countQueries(prompt: string,
|
||||
vars: Dict,
|
||||
llms: Array<Dict | string>,
|
||||
n: number,
|
||||
chat_histories?: ChatHistoryInfo[] | {[key: string]: ChatHistoryInfo[]},
|
||||
id?: string): Promise<Dict> {
|
||||
if (chat_histories === undefined) chat_histories = [ undefined ];
|
||||
|
||||
let gen_prompts: PromptPermutationGenerator;
|
||||
let all_prompt_permutations: Array<PromptTemplate>;
|
||||
try {
|
||||
@ -401,7 +424,10 @@ export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict
|
||||
const add_to_missing_queries = (llm_key: string, prompt: string, num: number) => {
|
||||
if (!(llm_key in missing_queries))
|
||||
missing_queries[llm_key] = {};
|
||||
missing_queries[llm_key][prompt] = num;
|
||||
if (prompt in missing_queries[llm_key])
|
||||
missing_queries[llm_key][prompt] += num;
|
||||
else
|
||||
missing_queries[llm_key][prompt] = num;
|
||||
};
|
||||
const add_to_num_responses_req = (llm_key: string, num: number) => {
|
||||
if (!(llm_key in num_responses_req))
|
||||
@ -412,9 +438,15 @@ export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict
|
||||
llms.forEach(llm_spec => {
|
||||
const llm_key = extract_llm_key(llm_spec);
|
||||
|
||||
// Get the relevant chat histories for this LLM:
|
||||
const chat_hists = (!Array.isArray(chat_histories)
|
||||
? chat_histories[extract_llm_nickname(llm_spec)]
|
||||
: chat_histories) as ChatHistoryInfo[];
|
||||
|
||||
// Find the response cache file for the specific LLM, if any
|
||||
let found_cache = false;
|
||||
for (const [cache_filename, cache_llm_spec] of Object.entries(cache_file_lookup)) {
|
||||
|
||||
if (matching_settings(cache_llm_spec, llm_spec)) {
|
||||
found_cache = true;
|
||||
|
||||
@ -425,15 +457,34 @@ export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict
|
||||
all_prompt_permutations.forEach(perm => {
|
||||
const prompt_str = perm.toString();
|
||||
|
||||
add_to_num_responses_req(llm_key, n);
|
||||
add_to_num_responses_req(llm_key, n * chat_hists.length);
|
||||
|
||||
if (prompt_str in cache_llm_responses) {
|
||||
// Check how many were stored; if not enough, add how many missing queries:
|
||||
const num_resps = cache_llm_responses[prompt_str]['responses'].length;
|
||||
if (n > num_resps)
|
||||
add_to_missing_queries(llm_key, prompt_str, n - num_resps);
|
||||
|
||||
// Get the cache of responses with respect to this prompt, + normalize format so it's always an array (of size >= 0)
|
||||
const cache_bucket = cache_llm_responses[prompt_str];
|
||||
let cached_resps: LLMResponseObject[] = Array.isArray(cache_bucket) ? cache_bucket : (cache_bucket === undefined ? [] : [ cache_bucket ]);
|
||||
|
||||
// For each chat history, find an indivdual response obj that matches it
|
||||
// (chat_hist be undefined, in which case the cache'd response obj must similarly have an undefined chat history in order to match):
|
||||
for (const chat_hist of chat_hists) {
|
||||
let found_resp = false;
|
||||
for (const cached_resp of cached_resps) {
|
||||
if (isEqualChatHistory(cached_resp.chat_history, chat_hist?.messages)) {
|
||||
// Match found. Note it and count response length:
|
||||
found_resp = true;
|
||||
const num_resps = cached_resp.responses.length;
|
||||
if (n > num_resps)
|
||||
add_to_missing_queries(llm_key, prompt_str, n - num_resps);
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!found_resp)
|
||||
add_to_missing_queries(llm_key, prompt_str, n);
|
||||
}
|
||||
} else {
|
||||
add_to_missing_queries(llm_key, prompt_str, n);
|
||||
// There was no cache'd item for this query; add it as missing:
|
||||
add_to_missing_queries(llm_key, prompt_str, n * chat_hists.length);
|
||||
}
|
||||
});
|
||||
|
||||
@ -443,8 +494,8 @@ export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict
|
||||
|
||||
if (!found_cache) {
|
||||
all_prompt_permutations.forEach(perm => {
|
||||
add_to_num_responses_req(llm_key, n);
|
||||
add_to_missing_queries(llm_key, perm.toString(), n);
|
||||
add_to_num_responses_req(llm_key, n * chat_hists.length);
|
||||
add_to_missing_queries(llm_key, perm.toString(), n * chat_hists.length);
|
||||
});
|
||||
}
|
||||
});
|
||||
@ -452,10 +503,6 @@ export async function countQueries(prompt: string, vars: Dict, llms: Array<Dict
|
||||
return {'counts': missing_queries, 'total_num_responses': num_responses_req};
|
||||
}
|
||||
|
||||
export function createProgressFile(id: string): void {
|
||||
// do nothing --this isn't needed for the JS backend, but was for the Python one
|
||||
}
|
||||
|
||||
interface LLMPrompterResults {
|
||||
llm_key: string,
|
||||
responses: Array<LLMResponseObject>,
|
||||
@ -480,6 +527,7 @@ export async function fetchEnvironAPIKeys(): Promise<Dict> {
|
||||
* @param prompt the prompt template, with any {{}} vars
|
||||
* @param vars a dict of the template variables to fill the prompt template with, by name.
|
||||
For each var, can be single values or a list; in the latter, all permutations are passed. (Pass empty dict if no vars.)
|
||||
* @param chat_histories Either an array of `ChatHistory` (to use across all LLMs), or a dict indexed by LLM nicknames of `ChatHistory` arrays to use per LLM.
|
||||
* @param api_keys (optional) a dict of {api_name: api_key} pairs. Supported key names: OpenAI, Anthropic, Google
|
||||
* @param no_cache (optional) if true, deletes any cache'd responses for 'id' (always calls the LLMs fresh)
|
||||
* @returns a dictionary in format `{responses: StandardizedLLMResponse[], errors: string[]}`
|
||||
@ -489,6 +537,7 @@ export async function queryLLM(id: string,
|
||||
n: number,
|
||||
prompt: string,
|
||||
vars: Dict,
|
||||
chat_histories?: ChatHistoryInfo[] | {[key: string]: ChatHistoryInfo[]},
|
||||
api_keys?: Dict,
|
||||
no_cache?: boolean,
|
||||
progress_listener?: (progress: {[key: symbol]: any}) => void): Promise<Dict> {
|
||||
@ -510,19 +559,7 @@ export async function queryLLM(id: string,
|
||||
return {'error': `LLM named '${llm_spec}' is not supported.`};
|
||||
}
|
||||
|
||||
if (APP_IS_RUNNING_LOCALLY()) {
|
||||
// Try to fetch API keys from os.environ variables in the locally running Flask backend:
|
||||
try {
|
||||
const api_keys = await fetchEnvironAPIKeys();
|
||||
set_api_keys(api_keys);
|
||||
} catch (err) {
|
||||
console.warn('Warning: Could not fetch API key environment variables from Flask server. Error:', err.message);
|
||||
// Soft fail
|
||||
}
|
||||
}
|
||||
|
||||
if (api_keys !== undefined)
|
||||
set_api_keys(api_keys);
|
||||
await setAPIKeys(api_keys);
|
||||
|
||||
// if 'no_cache' in data and data['no_cache'] is True:
|
||||
// remove_cached_responses(data['id'])
|
||||
@ -595,6 +632,10 @@ export async function queryLLM(id: string,
|
||||
let llm_key = extract_llm_key(llm_spec);
|
||||
let temperature: number = llm_params?.temperature !== undefined ? llm_params.temperature : 1.0;
|
||||
|
||||
let chat_hists = ((chat_histories !== undefined && !Array.isArray(chat_histories))
|
||||
? chat_histories[llm_nickname]
|
||||
: chat_histories) as ChatHistoryInfo[];
|
||||
|
||||
// Create an object to query the LLM, passing a storage key for cache'ing responses
|
||||
const cache_filepath = llm_to_cache_filename[llm_key];
|
||||
const prompter = new PromptPipeline(prompt, cache_filepath);
|
||||
@ -609,7 +650,8 @@ export async function queryLLM(id: string,
|
||||
console.log(`Querying ${llm_str}...`)
|
||||
|
||||
// Yield responses for 'llm' for each prompt generated from the root template 'prompt' and template variables in 'properties':
|
||||
for await (const response of prompter.gen_responses(vars, llm_str as LLM, num_generations, temperature, llm_params)) {
|
||||
for await (const response of prompter.gen_responses(vars, llm_str as LLM, num_generations, temperature, llm_params, chat_hists)) {
|
||||
|
||||
// Check for selective failure
|
||||
if (response instanceof LLMResponseError) { // The request failed
|
||||
console.error(`error when fetching response from ${llm_str}: ${response.message}`);
|
||||
@ -836,6 +878,104 @@ export async function executepy(id: string,
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Runs an LLM over responses as a grader/evaluator.
|
||||
*
|
||||
* @param id a unique ID to refer to this information. Used when cache'ing evaluation results.
|
||||
* @param llm the LLM to query (as an LLM specification dict)
|
||||
* @param root_prompt the prompt template to use as the scoring function. Should include exactly one template var, {input}, where input responses will be put.
|
||||
* @param response_ids the cache'd response to run on, which must be a unique ID or list of unique IDs of cache'd data
|
||||
* @param api_keys optional. any api keys to set before running the LLM
|
||||
*/
|
||||
export async function evalWithLLM(id: string,
|
||||
llm: Dict,
|
||||
root_prompt: string,
|
||||
response_ids: string | string[],
|
||||
api_keys?: Dict,
|
||||
progress_listener?: (progress: {[key: symbol]: any}) => void): Promise<Dict> {
|
||||
// Check format of response_ids
|
||||
if (!Array.isArray(response_ids))
|
||||
response_ids = [ response_ids ];
|
||||
response_ids = response_ids as Array<string>;
|
||||
|
||||
if (api_keys) setAPIKeys(api_keys);
|
||||
|
||||
// Load all responses with the given ID:
|
||||
let all_evald_responses: StandardizedLLMResponse[] = [];
|
||||
let all_errors: string[] = [];
|
||||
for (let i = 0; i < response_ids.length; i++) {
|
||||
const cache_id = response_ids[i];
|
||||
const fname = `${cache_id}.json`;
|
||||
if (!StorageCache.has(fname))
|
||||
return {error: `Did not find cache file for id ${cache_id}`};
|
||||
|
||||
// Load the raw responses from the cache + clone them all:
|
||||
const resp_objs = load_cache_responses(fname).map(r => JSON.parse(JSON.stringify(r))) as StandardizedLLMResponse[];
|
||||
if (resp_objs.length === 0)
|
||||
continue;
|
||||
|
||||
// We need to keep track of the index of each response in the response object.
|
||||
// We can generate var dicts with metadata to store the indices:
|
||||
let inputs = resp_objs.map((obj, i) => obj.responses.map((r: string, j: number) => ({text: r, fill_history: {}, metavars: { i, j }}))).flat();
|
||||
|
||||
// Now run all inputs through the LLM grader!:
|
||||
const {responses, errors} = await queryLLM(`eval-${id}-${cache_id}`, [llm], 1, root_prompt, { input: inputs }, undefined, undefined, undefined, progress_listener);
|
||||
|
||||
const err_vals: string[] = Array.from(Object.values(errors)) as string[];
|
||||
if (err_vals.length > 0)
|
||||
all_errors = all_errors.concat(err_vals);
|
||||
|
||||
// Now we need to apply each response as an eval_res (a score) back to each response object,
|
||||
// using the aforementioned mapping metadata:
|
||||
responses.forEach((r: StandardizedLLMResponse) => {
|
||||
let resp_obj = resp_objs[r.metavars.i];
|
||||
if (resp_obj.eval_res !== undefined)
|
||||
resp_obj.eval_res.items[r.metavars.j] = r.responses[0];
|
||||
else {
|
||||
resp_obj.eval_res = {
|
||||
items: [],
|
||||
dtype: 'Categorical',
|
||||
};
|
||||
resp_obj.eval_res.items[r.metavars.j] = r.responses[0];
|
||||
}
|
||||
});
|
||||
|
||||
all_evald_responses = all_evald_responses.concat(resp_objs);
|
||||
}
|
||||
|
||||
// Do additional processing to check if all evaluations are boolean-ish (e.g., 'true' and 'false')
|
||||
let all_eval_res = new Set();
|
||||
for (const resp_obj of all_evald_responses) {
|
||||
if (!resp_obj.eval_res) continue;
|
||||
for (const score of resp_obj.eval_res.items) {
|
||||
if (score !== undefined)
|
||||
all_eval_res.add(score.trim().toLowerCase());
|
||||
}
|
||||
if (all_eval_res.size > 2)
|
||||
break; // it's categorical if size is over 2
|
||||
}
|
||||
if (all_eval_res.size === 2) {
|
||||
// Check if the results are boolean-ish:
|
||||
if ((all_eval_res.has('true') && all_eval_res.has('false')) ||
|
||||
(all_eval_res.has('yes') && all_eval_res.has('no'))) {
|
||||
// Convert all eval results to boolean datatypes:
|
||||
all_evald_responses.forEach(resp_obj => {
|
||||
resp_obj.eval_res.items = resp_obj.eval_res.items.map((i: string) => {
|
||||
const li = i.toLowerCase();
|
||||
return li === 'true' || li === 'yes';
|
||||
});
|
||||
resp_obj.eval_res.dtype = 'Categorical';
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Store the evaluated responses in a new cache json:
|
||||
StorageCache.store(`${id}.json`, all_evald_responses);
|
||||
|
||||
return {responses: all_evald_responses, errors: all_errors};
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Returns all responses with the specified id(s).
|
||||
* @param responses the ids to grab
|
||||
|
@ -51,6 +51,7 @@ export enum LLM {
|
||||
HF_FALCON_7B_INSTRUCT = "tiiuae/falcon-7b-instruct",
|
||||
HF_SANTACODER = "bigcode/santacoder",
|
||||
HF_STARCODER = "bigcode/starcoder",
|
||||
HF_DIALOGPT_LARGE = "microsoft/DialoGPT-large", // chat model
|
||||
// HF_GPTJ_6B = "EleutherAI/gpt-j-6b",
|
||||
// HF_LLAMA_7B = "decapoda-research/llama-7b-hf",
|
||||
|
||||
|
@ -1,16 +1,18 @@
|
||||
import { PromptTemplate, PromptPermutationGenerator } from "./template";
|
||||
import { LLM, RATE_LIMITS } from './models';
|
||||
import { Dict, LLMResponseError, LLMResponseObject, LLMAPICall } from "./typing";
|
||||
import { extract_responses, merge_response_objs, call_llm } from "./utils";
|
||||
import { Dict, LLMResponseError, LLMResponseObject, ChatHistory, isEqualChatHistory, ChatHistoryInfo } from "./typing";
|
||||
import { extract_responses, merge_response_objs, call_llm, mergeDicts } from "./utils";
|
||||
import StorageCache from "./cache";
|
||||
|
||||
const clone = (obj) => JSON.parse(JSON.stringify(obj));
|
||||
|
||||
interface _IntermediateLLMResponseType {
|
||||
prompt: PromptTemplate | string,
|
||||
prompt: PromptTemplate,
|
||||
chat_history?: ChatHistoryInfo,
|
||||
query?: Dict,
|
||||
response?: Dict | LLMResponseError,
|
||||
past_resp_obj?: LLMResponseObject | undefined,
|
||||
past_resp_obj?: LLMResponseObject,
|
||||
past_resp_obj_cache_idx?: number,
|
||||
}
|
||||
|
||||
// From trincot @ SO: https://stackoverflow.com/a/76477994/1911342
|
||||
@ -45,12 +47,62 @@ export class PromptPipeline {
|
||||
this._storageKey = storageKey;
|
||||
}
|
||||
|
||||
*gen_prompts(properties: {[key: string]: any}): Generator<PromptTemplate, boolean, undefined> {
|
||||
*gen_prompts(vars: Dict): Generator<PromptTemplate, boolean, undefined> {
|
||||
const prompt_perm_gen = new PromptPermutationGenerator(this._template);
|
||||
yield* prompt_perm_gen.generate(properties);
|
||||
yield* prompt_perm_gen.generate(vars);
|
||||
return true;
|
||||
}
|
||||
|
||||
private collect_LLM_response(result: _IntermediateLLMResponseType, llm: LLM, cached_responses: Dict): LLMResponseObject | LLMResponseError {
|
||||
let {prompt, chat_history, query, response, past_resp_obj, past_resp_obj_cache_idx} = result;
|
||||
|
||||
// Check for selective failure
|
||||
if (!query && response instanceof LLMResponseError)
|
||||
return response; // yield the LLMResponseException
|
||||
|
||||
// Each prompt has a history of what was filled in from its base template.
|
||||
// This data --like, "class", "language", "library" etc --can be useful when parsing responses.
|
||||
let info = prompt.fill_history;
|
||||
let metavars = prompt.metavars;
|
||||
|
||||
// Create a response obj to represent the response
|
||||
let resp_obj: LLMResponseObject = {
|
||||
"prompt": prompt.toString(),
|
||||
"query": query,
|
||||
"responses": extract_responses(response, llm),
|
||||
"raw_response": response,
|
||||
"llm": llm,
|
||||
"info": mergeDicts(info, chat_history?.fill_history),
|
||||
"metavars": mergeDicts(metavars, chat_history?.metavars),
|
||||
};
|
||||
|
||||
// Carry over the chat history if present:
|
||||
if (chat_history !== undefined)
|
||||
resp_obj.chat_history = chat_history.messages;
|
||||
|
||||
// Merge the response obj with the past one, if necessary
|
||||
if (past_resp_obj)
|
||||
resp_obj = merge_response_objs(resp_obj, past_resp_obj) as LLMResponseObject;
|
||||
|
||||
// Save the current state of cache'd responses to a JSON file
|
||||
// NOTE: We do this to save money --in case something breaks between calls, can ensure we got the data!
|
||||
if (!(resp_obj.prompt in cached_responses))
|
||||
cached_responses[resp_obj.prompt] = [];
|
||||
else if (!Array.isArray(cached_responses[resp_obj.prompt]))
|
||||
cached_responses[resp_obj.prompt] = [ cached_responses[resp_obj.prompt] ];
|
||||
|
||||
if (past_resp_obj_cache_idx > -1)
|
||||
cached_responses[resp_obj.prompt][past_resp_obj_cache_idx] = resp_obj;
|
||||
else
|
||||
cached_responses[resp_obj.prompt].push(resp_obj);
|
||||
this._cache_responses(cached_responses);
|
||||
|
||||
// console.log(` - collected response from ${llm} for prompt: ${resp_obj['prompt']}`);
|
||||
|
||||
// Yield the response
|
||||
return resp_obj;
|
||||
}
|
||||
|
||||
/**
|
||||
Calls LLM 'llm' with all prompts, and yields responses as dicts in format {prompt, query, response, llm, info}.
|
||||
|
||||
@ -65,142 +117,114 @@ export class PromptPipeline {
|
||||
NOTE: The reason we collect, rather than raise, LLMResponseExceptions is because some API calls
|
||||
may still succeed, even if some fail. We don't want to stop listening to pending API calls,
|
||||
because we may lose money. Instead, we fail selectively.
|
||||
|
||||
* @param vars The 'vars' dict to fill variables in the root prompt template. For instance, for 'Who is {person}?', vars might = { person: ['TJ', 'MJ', 'AD'] }.
|
||||
* @param llm The specific LLM model to call. See the LLM enum for supported models.
|
||||
* @param n How many generations per prompt sent to the LLM.
|
||||
* @param temperature The temperature to use when querying the LLM.
|
||||
* @param llm_params Optional. The model-specific settings to pass into the LLM API call. Varies by LLM.
|
||||
* @param chat_histories Optional. A list of chat histories, with messages in OpenAI format. When present, calculates the cross product:
|
||||
* queries = (prompts) X (chat_histories)
|
||||
* to generate individual queries to LLMs. For instance, wish the prompt 'Who is {person}?', 3 values for person,
|
||||
* and 3 different prior chat histories, it will send off 9 queries.
|
||||
* @yields Yields `LLMResponseObject` if API call succeeds, or `LLMResponseError` if API call fails, for all requests.
|
||||
*/
|
||||
async *gen_responses(properties: {[key: string]: any},
|
||||
llm: LLM,
|
||||
n: number = 1,
|
||||
temperature: number = 1.0,
|
||||
llm_params?: Dict): AsyncGenerator<LLMResponseObject | LLMResponseError, boolean, undefined> {
|
||||
// Double-check that properties is the correct type (JSON dict):
|
||||
async *gen_responses(vars: Dict,
|
||||
llm: LLM,
|
||||
n: number = 1,
|
||||
temperature: number = 1.0,
|
||||
llm_params?: Dict,
|
||||
chat_histories?: ChatHistoryInfo[]): AsyncGenerator<LLMResponseObject | LLMResponseError, boolean, undefined> {
|
||||
// Load any cache'd responses
|
||||
let responses = this._load_cached_responses();
|
||||
|
||||
// Normalize the chat history var such that there's always at least one element.
|
||||
const _chat_histories = (chat_histories !== undefined && chat_histories.length > 0) ? chat_histories : [ undefined ];
|
||||
|
||||
// Query LLM with each prompt, yield + cache the responses
|
||||
let tasks: Array<Promise<_IntermediateLLMResponseType>> = [];
|
||||
const rate_limit = RATE_LIMITS[llm] || [1, 0];
|
||||
let [max_req, wait_secs] = rate_limit ? rate_limit : [1, 0];
|
||||
let num_queries_sent = -1;
|
||||
|
||||
for (const prompt of this.gen_prompts(properties)) {
|
||||
// Generate concrete prompts one by one. Yield response from the cache or make async call to LLM.
|
||||
for (const prompt of this.gen_prompts(vars)) {
|
||||
if (!prompt.is_concrete())
|
||||
throw Error(`Cannot send a prompt '${prompt}' to LLM: Prompt is a template.`)
|
||||
|
||||
const prompt_str = prompt.toString();
|
||||
let info = prompt.fill_history;
|
||||
let metavars = prompt.metavars;
|
||||
const info = prompt.fill_history;
|
||||
const metavars = prompt.metavars;
|
||||
|
||||
let cached_resp = prompt_str in responses ? responses[prompt_str] : undefined;
|
||||
let extracted_resps: Array<any> = cached_resp ? cached_resp["responses"] : [];
|
||||
|
||||
// First check if there is already a response for this item under these settings. If so, we can save an LLM call:
|
||||
if (cached_resp && extracted_resps.length >= n) {
|
||||
// console.log(` - Found cache'd response for prompt ${prompt_str}. Using...`);
|
||||
yield ({
|
||||
"prompt": prompt_str,
|
||||
"query": cached_resp["query"],
|
||||
"responses": extracted_resps.slice(0, n),
|
||||
"raw_response": cached_resp["raw_response"],
|
||||
"llm": cached_resp["llm"] || LLM.OpenAI_ChatGPT,
|
||||
// We want to use the new info, since 'vars' could have changed even though
|
||||
// the prompt text is the same (e.g., "this is a tool -> this is a {x} where x='tool'")
|
||||
"info": info,
|
||||
"metavars": metavars,
|
||||
});
|
||||
continue;
|
||||
}
|
||||
// Get the cache of responses with respect to this prompt, + normalize format so it's always an array (of size >= 0)
|
||||
const cache_bucket = responses[prompt_str];
|
||||
let cached_resps: LLMResponseObject[] = Array.isArray(cache_bucket) ? cache_bucket : (cache_bucket === undefined ? [] : [ cache_bucket ]);
|
||||
|
||||
num_queries_sent += 1;
|
||||
// Loop over any present chat histories. (If none, will have a single pass with 'undefined' as chat_history value.)
|
||||
for (const chat_history of _chat_histories) {
|
||||
|
||||
if (max_req > 1) {
|
||||
// Call the LLM asynchronously to generate a response, sending off
|
||||
// requests in batches of size 'max_req' separated by seconds 'wait_secs' to avoid hitting rate limit
|
||||
tasks.push(this._prompt_llm(llm, prompt, n, temperature,
|
||||
cached_resp,
|
||||
num_queries_sent,
|
||||
max_req,
|
||||
wait_secs,
|
||||
llm_params));
|
||||
} else {
|
||||
// Block. Await + yield a single LLM call.
|
||||
let result = await this._prompt_llm(llm, prompt, n, temperature, cached_resp,
|
||||
undefined, undefined, undefined,
|
||||
llm_params);
|
||||
let { query, response, past_resp_obj } = result;
|
||||
|
||||
// Check for selective failure
|
||||
if (!query && response instanceof LLMResponseError) {
|
||||
yield response; // yield the LLMResponseError
|
||||
// Check if there's a cached response with the same prompt + (if present) chat history:
|
||||
let cached_resp: LLMResponseObject | undefined = undefined;
|
||||
let cached_resp_idx: number = -1;
|
||||
// Find an indivdual response obj that matches the chat history:
|
||||
for (let i = 0; i < cached_resps.length; i++) {
|
||||
if (isEqualChatHistory(cached_resps[i].chat_history, chat_history?.messages)) {
|
||||
cached_resp = cached_resps[i];
|
||||
cached_resp_idx = i;
|
||||
break;
|
||||
}
|
||||
}
|
||||
let extracted_resps: Array<any> = cached_resp ? cached_resp["responses"] : [];
|
||||
|
||||
// First check if there is already a response for this item under these settings. If so, we can save an LLM call:
|
||||
if (cached_resp && extracted_resps.length >= n) {
|
||||
// console.log(` - Found cache'd response for prompt ${prompt_str}. Using...`);
|
||||
let resp: LLMResponseObject = {
|
||||
"prompt": prompt_str,
|
||||
"query": cached_resp["query"],
|
||||
"responses": extracted_resps.slice(0, n),
|
||||
"raw_response": cached_resp["raw_response"],
|
||||
"llm": cached_resp["llm"] || LLM.OpenAI_ChatGPT,
|
||||
// We want to use the new info, since 'vars' could have changed even though
|
||||
// the prompt text is the same (e.g., "this is a tool -> this is a {x} where x='tool'")
|
||||
"info": mergeDicts(info, chat_history?.fill_history),
|
||||
"metavars": mergeDicts(metavars, chat_history?.metavars),
|
||||
};
|
||||
if (chat_history !== undefined)
|
||||
resp.chat_history = chat_history.messages;
|
||||
yield resp;
|
||||
continue;
|
||||
}
|
||||
|
||||
// We now know there was a response; type it correctly:
|
||||
query = query as Dict;
|
||||
response = response as Dict;
|
||||
num_queries_sent += 1;
|
||||
|
||||
// Create a response obj to represent the response
|
||||
let resp_obj: LLMResponseObject = {
|
||||
prompt: prompt.toString(),
|
||||
query: query,
|
||||
responses: extract_responses(response, llm),
|
||||
raw_response: response,
|
||||
llm: llm,
|
||||
info: info,
|
||||
metavars: metavars,
|
||||
if (max_req > 1) {
|
||||
// Call the LLM asynchronously to generate a response, sending off
|
||||
// requests in batches of size 'max_req' separated by seconds 'wait_secs' to avoid hitting rate limit
|
||||
tasks.push(this._prompt_llm(llm, prompt, n, temperature,
|
||||
cached_resp,
|
||||
cached_resp_idx,
|
||||
num_queries_sent,
|
||||
max_req,
|
||||
wait_secs,
|
||||
llm_params,
|
||||
chat_history));
|
||||
} else {
|
||||
// Block. Await + yield a single LLM call.
|
||||
let result = await this._prompt_llm(llm, prompt, n, temperature,
|
||||
cached_resp, cached_resp_idx,
|
||||
undefined, undefined, undefined,
|
||||
llm_params, chat_history);
|
||||
yield this.collect_LLM_response(result, llm, responses);
|
||||
}
|
||||
|
||||
// Merge the response obj with the past one, if necessary
|
||||
if (past_resp_obj)
|
||||
resp_obj = merge_response_objs(resp_obj, past_resp_obj) as LLMResponseObject;
|
||||
|
||||
// Save the current state of cache'd responses to a JSON file
|
||||
responses[resp_obj["prompt"]] = resp_obj;
|
||||
this._cache_responses(responses);
|
||||
|
||||
// console.log(` - collected response from ${llm} for prompt: ${resp_obj['prompt']}`);
|
||||
|
||||
// Yield the response
|
||||
yield resp_obj;
|
||||
}
|
||||
}
|
||||
|
||||
// Yield responses as they come in
|
||||
for await (const {prompt, query, response, past_resp_obj} of yield_as_completed(tasks)) {
|
||||
// Check for selective failure
|
||||
if (!query && response instanceof LLMResponseError) {
|
||||
yield response; // yield the LLMResponseException
|
||||
continue;
|
||||
}
|
||||
|
||||
// Each prompt has a history of what was filled in from its base template.
|
||||
// This data --like, "class", "language", "library" etc --can be useful when parsing responses.
|
||||
let info = prompt.fill_history;
|
||||
let metavars = prompt.metavars;
|
||||
|
||||
// Create a response obj to represent the response
|
||||
let resp_obj: LLMResponseObject = {
|
||||
"prompt": prompt.toString(),
|
||||
"query": query,
|
||||
"responses": extract_responses(response, llm),
|
||||
"raw_response": response,
|
||||
"llm": llm,
|
||||
"info": info,
|
||||
"metavars": metavars,
|
||||
};
|
||||
|
||||
// Merge the response obj with the past one, if necessary
|
||||
if (past_resp_obj)
|
||||
resp_obj = merge_response_objs(resp_obj, past_resp_obj) as LLMResponseObject;
|
||||
|
||||
// Save the current state of cache'd responses to a JSON file
|
||||
// NOTE: We do this to save money --in case something breaks between calls, can ensure we got the data!
|
||||
responses[resp_obj["prompt"]] = resp_obj;
|
||||
this._cache_responses(responses);
|
||||
|
||||
// console.log(` - collected response from ${llm} for prompt: ${resp_obj['prompt']}`);
|
||||
|
||||
// Yield the response
|
||||
yield resp_obj;
|
||||
for await (const result of yield_as_completed(tasks)) {
|
||||
yield this.collect_LLM_response(result, llm, responses);
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -208,7 +232,7 @@ export class PromptPipeline {
|
||||
* Loads cache'd responses of JSON.
|
||||
* Useful for continuing if computation was interrupted halfway through.
|
||||
*/
|
||||
_load_cached_responses(): {[key: string]: LLMResponseObject} {
|
||||
_load_cached_responses(): {[key: string]: (LLMResponseObject | LLMResponseObject[])} {
|
||||
return StorageCache.get(this._storageKey) || {};
|
||||
}
|
||||
|
||||
@ -225,10 +249,12 @@ export class PromptPipeline {
|
||||
n: number = 1,
|
||||
temperature: number = 1.0,
|
||||
past_resp_obj?: LLMResponseObject,
|
||||
past_resp_obj_cache_idx?: number,
|
||||
query_number?: number,
|
||||
rate_limit_batch_size?: number,
|
||||
rate_limit_wait_secs?: number,
|
||||
llm_params?: Dict): Promise<_IntermediateLLMResponseType> {
|
||||
llm_params?: Dict,
|
||||
chat_history?: ChatHistoryInfo): Promise<_IntermediateLLMResponseType> {
|
||||
// Detect how many responses we have already (from cache obj past_resp_obj)
|
||||
if (past_resp_obj) {
|
||||
// How many *new* queries we need to send:
|
||||
@ -251,20 +277,25 @@ export class PromptPipeline {
|
||||
|
||||
// Now try to call the API. If it fails for whatever reason, 'soft fail' by returning
|
||||
// an LLMResponseException object as the 'response'.
|
||||
let params = clone(llm_params);
|
||||
if (chat_history !== undefined) params.chat_history = chat_history.messages;
|
||||
let query: Dict | undefined;
|
||||
let response: Dict | LLMResponseError;
|
||||
try {
|
||||
[query, response] = await call_llm(llm, prompt.toString(), n, temperature, clone(llm_params));
|
||||
[query, response] = await call_llm(llm, prompt.toString(), n, temperature, params);
|
||||
} catch(err) {
|
||||
return { prompt: prompt,
|
||||
query: undefined,
|
||||
response: new LLMResponseError(err.message),
|
||||
past_resp_obj: undefined };
|
||||
past_resp_obj: undefined,
|
||||
past_resp_obj_cache_idx: -1 };
|
||||
}
|
||||
|
||||
return { prompt,
|
||||
chat_history,
|
||||
query,
|
||||
response,
|
||||
past_resp_obj };
|
||||
past_resp_obj,
|
||||
past_resp_obj_cache_idx };
|
||||
}
|
||||
}
|
||||
|
@ -11,6 +11,67 @@ export class LLMResponseError extends Error {
|
||||
export interface Dict { [key: string]: any };
|
||||
export interface StringDict { [key: string]: string };
|
||||
|
||||
/** OpenAI function call format */
|
||||
export interface OpenAIFunctionCall {
|
||||
name: string,
|
||||
parameters: Dict,
|
||||
description?: string,
|
||||
}
|
||||
|
||||
/** The outputs of prompt nodes, text fields or other data passed internally in the front-end.
|
||||
* Used to populate prompt templates and carry variables/metavariables along the chain. */
|
||||
export interface TemplateVarInfo {
|
||||
text: string,
|
||||
fill_history: Dict,
|
||||
metavars?: Dict,
|
||||
}
|
||||
|
||||
/** OpenAI chat message format */
|
||||
export interface ChatMessage {
|
||||
role: string,
|
||||
content: string,
|
||||
name?: string,
|
||||
function_call?: OpenAIFunctionCall,
|
||||
}
|
||||
export type ChatHistory = ChatMessage[];
|
||||
|
||||
/** Google PaLM chat message format */
|
||||
export interface PaLMChatMessage {
|
||||
author: string, // usually, 0=user and 1=AI
|
||||
content: string,
|
||||
}
|
||||
export interface PaLMChatContext {
|
||||
messages: PaLMChatMessage[],
|
||||
context?: string,
|
||||
examples?: Dict[],
|
||||
}
|
||||
|
||||
/** HuggingFace conversation models format */
|
||||
export interface HuggingFaceChatHistory {
|
||||
past_user_inputs: string[],
|
||||
generated_responses: string[]
|
||||
}
|
||||
|
||||
// Chat history with 'carried' variable metadata
|
||||
export interface ChatHistoryInfo {
|
||||
messages: ChatHistory,
|
||||
fill_history: Dict,
|
||||
metavars?: Dict,
|
||||
llm?: string,
|
||||
}
|
||||
|
||||
export function isEqualChatHistory(A: ChatHistory | undefined, B: ChatHistory | undefined): boolean {
|
||||
if (A === undefined && B === undefined) return true;
|
||||
if (A === undefined || B === undefined) return false;
|
||||
if (A.length !== B.length) return false;
|
||||
if (A.length === 0) return true; // both empty
|
||||
return A.every((a, i) => {
|
||||
const b = B[i];
|
||||
return (a.role === b.role && a.content === b.content &&
|
||||
a.name === b.name && a.function_call === b.function_call);
|
||||
});
|
||||
}
|
||||
|
||||
/** A JSON object describing an LLM response for the same prompt, with n responses (n>=1) */
|
||||
export interface LLMResponseObject {
|
||||
prompt: string;
|
||||
@ -20,6 +81,7 @@ export interface LLMResponseObject {
|
||||
llm: LLM;
|
||||
info: Dict;
|
||||
metavars: Dict;
|
||||
chat_history?: ChatHistory;
|
||||
}
|
||||
|
||||
/** A standard async function interface for calling an LLM. */
|
||||
@ -40,4 +102,5 @@ export interface StandardizedLLMResponse {
|
||||
metavars: Dict,
|
||||
tokens: Dict,
|
||||
eval_res?: Dict,
|
||||
chat_history?: ChatHistory,
|
||||
}
|
@ -4,7 +4,7 @@
|
||||
|
||||
// from chainforge.promptengine.models import LLM
|
||||
import { LLM, LLMProvider, getProvider } from './models';
|
||||
import { Dict, StringDict, LLMAPICall, LLMResponseObject } from './typing';
|
||||
import { Dict, StringDict, LLMAPICall, LLMResponseObject, ChatHistory, ChatMessage, PaLMChatMessage, PaLMChatContext, HuggingFaceChatHistory } from './typing';
|
||||
import { env as process_env } from 'process';
|
||||
import { StringTemplate } from './template';
|
||||
|
||||
@ -12,8 +12,8 @@ import { StringTemplate } from './template';
|
||||
import { Configuration as OpenAIConfig, OpenAIApi } from "openai";
|
||||
import { OpenAIClient as AzureOpenAIClient, AzureKeyCredential } from "@azure/openai";
|
||||
|
||||
const HUMAN_PROMPT = "\n\nHuman:";
|
||||
const AI_PROMPT = "\n\nAssistant:";
|
||||
const ANTHROPIC_HUMAN_PROMPT = "\n\nHuman:";
|
||||
const ANTHROPIC_AI_PROMPT = "\n\nAssistant:";
|
||||
|
||||
const fetch = require('node-fetch');
|
||||
|
||||
@ -120,6 +120,29 @@ export function set_api_keys(api_keys: StringDict): void {
|
||||
// Soft fail for non-present keys
|
||||
}
|
||||
|
||||
/**
|
||||
* Construct an OpenAI format chat history for sending off to an OpenAI API call.
|
||||
* @param prompt The next prompt (user message) to append.
|
||||
* @param chat_history The prior turns of the chat, ending with the AI assistants' turn.
|
||||
* @param system_msg Optional; the system message to use if none is present in chat_history. (Ignored if chat_history already has a sys message.)
|
||||
*/
|
||||
function construct_openai_chat_history(prompt: string, chat_history: ChatHistory | undefined, system_msg: string): ChatHistory {
|
||||
const prompt_msg: ChatMessage = { role: 'user', content: prompt };
|
||||
if (chat_history !== undefined && chat_history.length > 0) {
|
||||
if (chat_history[0].role === 'system') {
|
||||
// In this case, the system_msg is ignored because the prior history already contains one.
|
||||
return chat_history.concat([prompt_msg]);
|
||||
} else {
|
||||
// In this case, there's no system message that starts the prior history, so inject one:
|
||||
// NOTE: We might reach this scenario if we chain output of a non-OpenAI chat model into an OpenAI model.
|
||||
return [{"role": "system", "content": system_msg}].concat(chat_history).concat([prompt_msg]);
|
||||
}
|
||||
} else return [
|
||||
{"role": "system", "content": system_msg},
|
||||
prompt_msg,
|
||||
];
|
||||
}
|
||||
|
||||
/**
|
||||
* Calls OpenAI models via OpenAI's API.
|
||||
@returns raw query and response JSON dicts.
|
||||
@ -142,13 +165,16 @@ export async function call_chatgpt(prompt: string, model: LLM, n: number = 1, te
|
||||
delete params.stop;
|
||||
if (params?.functions !== undefined && (!Array.isArray(params.functions) || params.functions.length === 0))
|
||||
delete params?.functions;
|
||||
if (params?.function_call !== undefined && ((!(typeof params.function_call === 'string')) || params.function_call.trim().length === 0)) {
|
||||
if (params?.function_call !== undefined && ((!(typeof params.function_call === 'string')) || params.function_call.trim().length === 0))
|
||||
delete params.function_call;
|
||||
}
|
||||
|
||||
console.log(`Querying OpenAI model '${model}' with prompt '${prompt}'...`);
|
||||
|
||||
// Determine the system message and whether there's chat history to continue:
|
||||
const chat_history: ChatHistory | undefined = params?.chat_history;
|
||||
const system_msg: string = params?.system_msg !== undefined ? params.system_msg : "You are a helpful assistant.";
|
||||
delete params?.system_msg;
|
||||
delete params?.chat_history;
|
||||
|
||||
let query: Dict = {
|
||||
model: modelname,
|
||||
@ -159,15 +185,16 @@ export async function call_chatgpt(prompt: string, model: LLM, n: number = 1, te
|
||||
|
||||
// Get the correct function to call
|
||||
let openai_call: any;
|
||||
if (modelname.includes('davinci')) { // text completions model
|
||||
if (modelname.includes('davinci')) {
|
||||
// Create call to text completions model
|
||||
openai_call = openai.createCompletion.bind(openai);
|
||||
query['prompt'] = prompt;
|
||||
} else { // chat model
|
||||
} else {
|
||||
// Create call to chat model
|
||||
openai_call = openai.createChatCompletion.bind(openai);
|
||||
query['messages'] = [
|
||||
{"role": "system", "content": system_msg},
|
||||
{"role": "user", "content": prompt},
|
||||
]
|
||||
|
||||
// Carry over chat history, if present:
|
||||
query['messages'] = construct_openai_chat_history(prompt, chat_history, system_msg);
|
||||
}
|
||||
|
||||
// Try to call OpenAI
|
||||
@ -217,8 +244,9 @@ export async function call_azure_openai(prompt: string, model: LLM, n: number =
|
||||
delete params.function_call;
|
||||
|
||||
console.log(`Querying Azure OpenAI deployed model '${deployment_name}' at endpoint '${AZURE_OPENAI_ENDPOINT}' with prompt '${prompt}'...`)
|
||||
const chat_history: ChatHistory | undefined = params?.chat_history;
|
||||
const system_msg = params?.system_msg !== undefined ? params.system_msg : "You are a helpful assistant.";
|
||||
|
||||
delete params?.chat_history;
|
||||
delete params?.system_msg;
|
||||
delete params?.model_type;
|
||||
delete params?.deployment_name;
|
||||
@ -236,10 +264,7 @@ export async function call_azure_openai(prompt: string, model: LLM, n: number =
|
||||
arg2 = [prompt];
|
||||
} else {
|
||||
openai_call = client.getChatCompletions.bind(client);
|
||||
arg2 = [
|
||||
{"role": "system", "content": system_msg},
|
||||
{"role": "user", "content": prompt},
|
||||
];
|
||||
arg2 = construct_openai_chat_history(prompt, chat_history, system_msg);
|
||||
}
|
||||
|
||||
let response: Dict = {};
|
||||
@ -274,18 +299,34 @@ export async function call_anthropic(prompt: string, model: LLM, n: number = 1,
|
||||
throw Error("Could not find an API key for Anthropic models. Double-check that your API key is set in Settings or in your local environment.");
|
||||
|
||||
// Wrap the prompt in the provided template, or use the default Anthropic one
|
||||
const custom_prompt_wrapper: string = params?.custom_prompt_wrapper || (HUMAN_PROMPT + " {prompt}" + AI_PROMPT);
|
||||
const custom_prompt_wrapper: string = params?.custom_prompt_wrapper || (ANTHROPIC_HUMAN_PROMPT + " {prompt}" + ANTHROPIC_AI_PROMPT);
|
||||
if (!custom_prompt_wrapper.includes('{prompt}'))
|
||||
throw Error("Custom prompt wrapper is missing required {prompt} template variable.");
|
||||
const prompt_wrapper_template = new StringTemplate(custom_prompt_wrapper);
|
||||
const wrapped_prompt = prompt_wrapper_template.safe_substitute({prompt: prompt});
|
||||
let wrapped_prompt = prompt_wrapper_template.safe_substitute({prompt: prompt});
|
||||
|
||||
if (params?.custom_prompt_wrapper !== undefined)
|
||||
delete params.custom_prompt_wrapper;
|
||||
|
||||
// Required non-standard params
|
||||
const max_tokens_to_sample = params?.max_tokens_to_sample || 1024;
|
||||
const stop_sequences = params?.stop_sequences || [HUMAN_PROMPT];
|
||||
const stop_sequences = params?.stop_sequences || [ANTHROPIC_HUMAN_PROMPT];
|
||||
|
||||
// Carry chat history by prepending it to the prompt
|
||||
// :: See https://docs.anthropic.com/claude/docs/human-and-assistant-formatting#use-human-and-assistant-to-put-words-in-claudes-mouth
|
||||
if (params?.chat_history !== undefined) {
|
||||
const chat_history: ChatHistory = params.chat_history as ChatHistory;
|
||||
let anthr_chat_context: string = "";
|
||||
for (const chat_msg of chat_history) {
|
||||
if (chat_msg.role === 'user')
|
||||
anthr_chat_context += ANTHROPIC_HUMAN_PROMPT;
|
||||
else if (chat_msg.role === 'assistant')
|
||||
anthr_chat_context += ANTHROPIC_AI_PROMPT;
|
||||
else continue; // ignore system messages and other roles
|
||||
anthr_chat_context += ' ' + chat_msg.content;
|
||||
}
|
||||
wrapped_prompt = anthr_chat_context + wrapped_prompt; // prepend the chat context
|
||||
}
|
||||
|
||||
// Format query
|
||||
let query = {
|
||||
@ -354,6 +395,8 @@ export async function call_google_palm(prompt: string, model: LLM, n: number = 1
|
||||
|
||||
// Required non-standard params
|
||||
const max_output_tokens = params?.max_output_tokens || 800;
|
||||
const chat_history = params?.chat_history;
|
||||
delete params?.chat_history;
|
||||
|
||||
let query: Dict = {
|
||||
model: `models/${model}`,
|
||||
@ -392,7 +435,25 @@ export async function call_google_palm(prompt: string, model: LLM, n: number = 1
|
||||
|
||||
if (is_chat_model) {
|
||||
// Chat completions
|
||||
query.prompt = { messages: [{content: prompt}] };
|
||||
if (chat_history !== undefined && chat_history.length > 0) {
|
||||
// Carry over any chat history, converting OpenAI formatted chat history to Google PaLM:
|
||||
let palm_chat_context: PaLMChatContext = { messages: [] };
|
||||
let palm_messages: PaLMChatMessage[] = [];
|
||||
for (const chat_msg of chat_history) {
|
||||
if (chat_msg.role === 'system') {
|
||||
// Carry the system message over as PaLM's chat 'context':
|
||||
palm_chat_context.context = chat_msg.content;
|
||||
} else if (chat_msg.role === 'user') {
|
||||
palm_messages.push({ author: '0', content: chat_msg.content });
|
||||
} else
|
||||
palm_messages.push({ author: '1', content: chat_msg.content });
|
||||
}
|
||||
palm_messages.push({ author: '0', content: prompt });
|
||||
palm_chat_context.messages = palm_messages;
|
||||
query.prompt = palm_chat_context;
|
||||
} else {
|
||||
query.prompt = { messages: [{content: prompt}] };
|
||||
}
|
||||
} else {
|
||||
// Text completions
|
||||
query.prompt = { text: prompt };
|
||||
@ -470,20 +531,38 @@ export async function call_huggingface(prompt: string, model: LLM, n: number = 1
|
||||
if (params?.num_continuations !== undefined && typeof params.num_continuations === 'number')
|
||||
num_continuations = params.num_continuations;
|
||||
|
||||
let query = {
|
||||
let query: Dict = {
|
||||
temperature: temperature,
|
||||
return_full_text: false, // we never want it to include the prompt in the response
|
||||
};
|
||||
set_param_if_exists('top_k', query);
|
||||
set_param_if_exists('top_p', query);
|
||||
set_param_if_exists('repetition_penalty', query);
|
||||
set_param_if_exists('max_new_tokens', query);
|
||||
|
||||
let options = {
|
||||
use_cache: false, // we want it generating fresh each time
|
||||
};
|
||||
set_param_if_exists('use_cache', options);
|
||||
set_param_if_exists('do_sample', options);
|
||||
|
||||
// Carry over chat history if (a) we're using a chat model and (b) if it exists, converting to HF format.
|
||||
// :: See https://huggingface.co/docs/api-inference/detailed_parameters#conversational-task
|
||||
const model_type: string = params?.model_type;
|
||||
let hf_chat_hist: HuggingFaceChatHistory = { past_user_inputs: [], generated_responses: [] };
|
||||
if (model_type === 'chat') {
|
||||
if (params?.chat_history !== undefined) {
|
||||
for (const chat_msg of params.chat_history as ChatHistory) {
|
||||
if (chat_msg.role === 'user')
|
||||
hf_chat_hist.past_user_inputs = hf_chat_hist.past_user_inputs.concat( chat_msg.content );
|
||||
else if (chat_msg.role === 'assistant')
|
||||
hf_chat_hist.generated_responses = hf_chat_hist.generated_responses.concat( chat_msg.content );
|
||||
// ignore system messages
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Text generation-only parameters:
|
||||
set_param_if_exists('max_new_tokens', query);
|
||||
set_param_if_exists('do_sample', options);
|
||||
query.return_full_text = false; // we never want it to include the prompt in the response
|
||||
}
|
||||
|
||||
const using_custom_model_endpoint: boolean = param_exists(params?.custom_model);
|
||||
|
||||
@@ -505,11 +584,17 @@ export async function call_huggingface(prompt: string, model: LLM, n: number = 1
   let curr_cont = 0;
   let curr_text = prompt;
   while (curr_cont <= num_continuations) {
+    const inputs = (model_type === 'chat')
+                    ? ({ text: curr_text,
+                         past_user_inputs: hf_chat_hist.past_user_inputs,
+                         generated_responses: hf_chat_hist.generated_responses })
+                    : curr_text;
+
     // Call HuggingFace inference API
     const response = await fetch(url, {
       headers: headers,
       method: "POST",
-      body: JSON.stringify({inputs: curr_text, parameters: query, options: options}),
+      body: JSON.stringify({inputs: inputs, parameters: query, options: options}),
     });
     const result = await response.json();
 
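For clarity, the two request bodies this produces look roughly like the sketches below. Values are illustrative and the parameter lists are not exhaustive; the shapes follow the HuggingFace Inference API docs linked in the hunk above:

```typescript
// Conversational models: 'inputs' is an object carrying the running history.
const chatBody = JSON.stringify({
  inputs: {
    text: 'current user turn',
    past_user_inputs: ['earlier user turn'],
    generated_responses: ['earlier model reply'],
  },
  parameters: { temperature: 1.0 },
  options: { use_cache: false },
});

// Text-generation models: 'inputs' stays a plain string, as before.
const textBody = JSON.stringify({
  inputs: 'current prompt text',
  parameters: { temperature: 1.0, return_full_text: false },
  options: { use_cache: false },
});
```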
@@ -517,11 +602,12 @@ export async function call_huggingface(prompt: string, model: LLM, n: number = 1
     // It returns this as an 'error' key in the response:
     if (result?.error !== undefined)
       throw new Error(result.error);
-    else if (!Array.isArray(result) || result.length !== 1)
+    else if ((model_type !== 'chat' && (!Array.isArray(result) || result.length !== 1)) ||
+             (model_type === 'chat' && (Array.isArray(result) || !result || result?.generated_text === undefined)))
       throw new Error("Result of HuggingFace API call is in unexpected format:" + JSON.stringify(result));
 
     // Merge responses
-    const resp_text: string = result[0].generated_text;
+    const resp_text: string = model_type === 'chat' ? result.generated_text : result[0].generated_text;
 
     continued_response.generated_text += resp_text;
     curr_text += resp_text;
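In other words, the two tasks return differently shaped results and the code now accepts both. A hedged sketch of that shape handling in isolation (response shapes assumed from the HF Inference API docs referenced above, error handling elided):

```typescript
// Pull the generated text out of either accepted response shape.
function extractGeneratedText(result: unknown, model_type: string): string {
  if (model_type === 'chat') {
    // Conversational task: a single object with a 'generated_text' field.
    return (result as { generated_text: string }).generated_text;
  }
  // Text generation: an array containing exactly one { generated_text } object.
  return (result as { generated_text: string }[])[0].generated_text;
}
```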
@@ -684,7 +770,7 @@ export function merge_response_objs(resp_obj_A: LLMResponseObject | undefined, r
     raw_resp_A = [ raw_resp_A ];
   if (!Array.isArray(raw_resp_B))
     raw_resp_B = [ raw_resp_B ];
-  return {
+  let res: LLMResponseObject = {
     responses: resp_obj_A.responses.concat(resp_obj_B.responses),
     raw_response: raw_resp_A.concat(raw_resp_B),
     prompt: resp_obj_B.prompt,
@@ -693,6 +779,19 @@ export function merge_response_objs(resp_obj_A: LLMResponseObject | undefined, r
     info: resp_obj_B.info,
     metavars: resp_obj_B.metavars,
   };
+  if (resp_obj_B.chat_history !== undefined)
+    res.chat_history = resp_obj_B.chat_history;
+  return res;
 }
+
+export function mergeDicts(A?: Dict, B?: Dict): Dict | undefined {
+  if (A === undefined && B === undefined) return undefined;
+  else if (A === undefined) return B;
+  else if (B === undefined) return A;
+  let d: Dict = {};
+  Object.entries(A).forEach(([key, val]) => { d[key] = val; });
+  Object.entries(B).forEach(([key, val]) => { d[key] = val; });
+  return d; // gives priority to B
+}
 
 export const filterDict = (dict: Dict, keyFilterFunc: (key: string) => boolean) => {
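Two small usage notes on the additions above. `merge_response_objs` now carries `chat_history` forward from the newer response object (B), so merged cache entries keep the conversation that produced them. And `mergeDicts` gives its second argument priority on key collisions, as in this quick check (values made up):

```typescript
// B's value wins when both dicts define the same key:
const merged = mergeDicts({ temperature: 0.5, top_p: 1.0 }, { temperature: 1.0 });
// => { temperature: 1.0, top_p: 1.0 }

// Either argument may be undefined:
const onlyB = mergeDicts(undefined, { top_k: 40 });  // => { top_k: 40 }
const none  = mergeDicts(undefined, undefined);      // => undefined
```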
@@ -1,7 +1,8 @@
 import { queryLLM, executejs, executepy,
          fetchExampleFlow, fetchOpenAIEval, importCache,
          exportCache, countQueries, grabResponses,
-         createProgressFile, generatePrompts} from "./backend/backend";
+         generatePrompts,
+         evalWithLLM} from "./backend/backend";
 
 const clone = (obj) => JSON.parse(JSON.stringify(obj));
 
@@ -10,17 +11,17 @@ async function _route_to_js_backend(route, params) {
     case 'grabResponses':
       return grabResponses(params.responses);
     case 'countQueriesRequired':
-      return countQueries(params.prompt, clone(params.vars), clone(params.llms), params.n, params.id);
+      return countQueries(params.prompt, clone(params.vars), clone(params.llms), params.n, params.chat_histories, params.id);
     case 'generatePrompts':
      return generatePrompts(params.prompt, clone(params.vars));
-    case 'createProgressFile':
-      return createProgressFile(params.id);
     case 'queryllm':
-      return queryLLM(params.id, clone(params.llm), params.n, params.prompt, clone(params.vars), params.api_keys, params.no_cache, params.progress_listener);
+      return queryLLM(params.id, clone(params.llm), params.n, params.prompt, clone(params.vars), params.chat_histories, params.api_keys, params.no_cache, params.progress_listener);
     case 'executejs':
       return executejs(params.id, params.code, params.responses, params.scope);
     case 'executepy':
       return executepy(params.id, params.code, params.responses, params.scope, params.script_paths);
+    case 'evalWithLLM':
+      return evalWithLLM(params.id, params.llm, params.root_prompt, params.responses, params.api_keys, params.progress_listener);
     case 'importCache':
       return importCache(params.files);
     case 'exportCache':
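As a usage illustration, a call into the dispatcher above for the chat-aware route might look like the sketch below. Every value is made up, and the exact shapes of `llm`, `vars`, and `chat_histories` are assumptions rather than ChainForge's documented parameter formats; the point is only that `chat_histories` now rides along to `queryLLM`:

```typescript
// (inside an async function)
const result = await _route_to_js_backend('queryllm', {
  id: 'promptNode-1',                 // hypothetical node id
  llm: ['gpt-3.5-turbo'],             // hypothetical LLM spec; the real shape may differ
  n: 1,
  prompt: 'Reply to the user about {topic}.',
  vars: { topic: ['chess'] },
  chat_histories: [[                  // one prior conversation, OpenAI-style
    { role: 'user', content: 'Hello!' },
    { role: 'assistant', content: 'Hi, how can I help?' },
  ]],
  api_keys: {},
  no_cache: false,
  progress_listener: undefined,
});
```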
chainforge/react-server/src/store.js
@@ -25,7 +25,7 @@ export const colorPalettes = {
   var: varColorPalette,
 }
 
-const refreshableOutputNodeTypes = new Set(['evaluator', 'prompt', 'inspect', 'vis', 'textfields']);
+const refreshableOutputNodeTypes = new Set(['evaluator', 'prompt', 'inspect', 'vis', 'llmeval', 'textfields', 'chat']);
 
 // A global store of variables, used for maintaining state
 // across ChainForge and ReactFlow components.
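The set above just registers which node types have refreshable output; the new `llmeval` and `chat` entries add the LLM Scorer and Chat Turn nodes to it. A hypothetical check against that set, purely for illustration (the surrounding store logic is not part of this diff, and `NodeLike`/the callback are placeholders, not store.js identifiers):

```typescript
type NodeLike = { id: string; type: string };

// Mark a node's output stale only if its type is in the refreshable set.
function maybeMarkStale(node: NodeLike, markOutputStale: (id: string) => void): void {
  if (refreshableOutputNodeTypes.has(node.type))
    markOutputStale(node.id);
}
```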
@@ -107,6 +107,21 @@
     transform: rotate(360deg);
   }
 }
 
+.llm-list-container {
+  background-color: #eee;
+  border-radius: 4px;
+  padding: 8px;
+  overflow-y: auto;
+  max-height: 205px;
+}
+.llm-list-backdrop {
+  margin: 6px 0px 6px 6px;
+  padding-bottom: 4px;
+  text-align: left;
+  font-size: 10pt;
+  color: #777;
+}
+
 .input-field {
   display: flex;
@@ -294,6 +309,7 @@
 .inspect-table-llm-resp {
   padding-top: 8px;
   padding-bottom: 20px;
+  min-width: 160px;
   border-right: 1px solid #eee;
 }
 .response-group-component-header:hover {
@@ -685,7 +701,7 @@
   border-radius: 5px;
   border: 1px solid #999;
   font-size: 12px;
-  background: transparent;
+  background: rgb(63, 204, 106, 0.2);
   color: #333;
   overflow: hidden;
   box-shadow: 0 0 0 0 transparent;
@@ -859,6 +875,46 @@
   }
 }
 
+/* Chat history bubbles */
+.chat-bubble {
+  --r: 15px; /* the radius */
+  --t: 25px; /* the size of the tail */
+
+  max-width: 80%;
+  padding: calc(2*var(--r)/3);
+  mask:
+    radial-gradient(var(--t) at var(--_d) 0,#0000 98%,#000 102%)
+      var(--_d) 100%/calc(100% - var(--r)) var(--t) no-repeat,
+    conic-gradient(at var(--r) var(--r),#000 75%,#0000 0)
+      calc(var(--r)/-2) calc(var(--r)/-2) padding-box,
+    radial-gradient(50% 50%,#000 98%,#0000 101%)
+      0 0/var(--r) var(--r) space padding-box;
+  background: #333 border-box;
+  color: #fff;
+}
+.chat-msg-left {
+  --_d: 0%;
+  border-left: var(--t) solid #0000;
+  margin-right: var(--t);
+  place-self: start;
+}
+.chat-msg-right {
+  --_d: 100%;
+  border-right: var(--t) solid #0000;
+  margin-left: var(--t);
+  place-self: end;
+}
+.chat-history {
+  margin: 0;
+  min-height: 10vh;
+  display: grid;
+  place-content: left;
+  width: 100%;
+  gap: 10px;
+  font-family: system-ui, sans-serif;
+  font-size: 14px;
+}
+
 /* Model settings form styling */
 .rjsf fieldset {
   border-style: none;
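These classes give chat history its speech-bubble look. As an illustration of how they compose, the TSX below renders an OpenAI-style history with the new classes; it is not taken from ChainForge's actual Chat Turn component, and which role sits on which side is a guess:

```tsx
import React from 'react';

// Render messages as left/right chat bubbles using the classes defined above.
const ChatHistoryView = ({ messages }: { messages: { role: string; content: string }[] }) => (
  <div className="chat-history">
    {messages.map((m, i) => (
      <div
        key={i}
        className={'chat-bubble ' + (m.role === 'user' ? 'chat-msg-right' : 'chat-msg-left')}
      >
        {m.content}
      </div>
    ))}
  </div>
);
```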
@@ -1,12 +1,9 @@
flask>=2.2.3
flask[async]
flask_cors
flask_socketio
requests
openai
python-socketio
dalaipy==2.0.2
gevent-websocket
urllib3==1.26.6
anthropic
google-generativeai