stealthcode/bookscrape

Bookscrape

A tool for analyzing narrative stories using the OpenAI APIs, the Pinecone vector database, and a LangChain RAG pipeline to retrieve information from long text media.

Goal

Running this program takes the text contents of a book (or other narrative-based text media) and produces a graph data structure of the characters in the story, along with summaries of their interactions. When loaded into a graph database such as Neo4j, this analysis could be used to visualize summaries of character dynamics over time.
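As a rough illustration of what such a character graph might look like, here is a minimal sketch. The type and method names (CharacterNode, InteractionEdge, CharacterGraph, addInteraction, timeline) are hypothetical and are not taken from the bookscrape source; the actual output format may differ.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface CharacterNode {
  name: string;
}

interface InteractionEdge {
  from: string;    // name of the first character
  to: string;      // name of the second character
  chapter: number; // chapter in which the interaction occurs
  summary: string; // short summary of the interaction
}

class CharacterGraph {
  readonly characters = new Map<string, CharacterNode>();
  readonly interactions: InteractionEdge[] = [];

  // Record an interaction, creating character nodes on first sight.
  addInteraction(from: string, to: string, chapter: number, summary: string): void {
    for (const name of [from, to]) {
      if (!this.characters.has(name)) {
        this.characters.set(name, { name });
      }
    }
    this.interactions.push({ from, to, chapter, summary });
  }

  // Interactions between two characters in chapter order, in either direction.
  timeline(a: string, b: string): InteractionEdge[] {
    return this.interactions
      .filter((e) => (e.from === a && e.to === b) || (e.from === b && e.to === a))
      .sort((x, y) => x.chapter - y.chapter);
  }
}
```

Keeping the chapter number on each edge is what would allow a graph database query to show how a relationship changes over the course of the story.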

Usage

To run the program, you must first set the environment variables for the APIs used and the asset files to be analyzed.

Env Var | Value | Description
OPENAI_API_KEY | | OpenAI API key for accessing the Embedding and Chat APIs
LANGCHAIN_API_KEY | | LangChain API key for RAG retrieval
PINECONE_API_KEY | | API key for Pinecone vector datastore
ASSET_FILE_PATH | | File path to the text to be analyzed
ASSET_TITLE | | Name of the work to be analyzed (for metadata)
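
A program like this can fail fast when a required variable is missing. The following is a minimal sketch of such a check, assuming all five variables are mandatory; loadConfig and Config are hypothetical names, not functions from the bookscrape source.

```typescript
// Variable names come from the table above; everything else is illustrative.
const REQUIRED_VARS = [
  "OPENAI_API_KEY",
  "LANGCHAIN_API_KEY",
  "PINECONE_API_KEY",
  "ASSET_FILE_PATH",
  "ASSET_TITLE",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type Config = Record<RequiredVar, string>;

// Hypothetical helper: validates an environment-like object and returns
// a typed config, or throws listing every missing (or empty) variable.
function loadConfig(env: Record<string, string | undefined>): Config {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as Config;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

In a Node.js entry point this could be called as loadConfig(process.env) before any API client is created.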

Currently, the parser assumes that the file has been preprocessed so that each chapter starts with a line matching the regex /CHAPTER [IVX\d]*./.
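
To make the expected input format concrete, here is a minimal sketch of splitting a preprocessed text into chapters with that regex. The splitChapters function and the Chapter type are hypothetical, and anchoring the pattern to the start of the line with ^ is an assumption; the real parser may behave differently.

```typescript
// Regex from the README, with a leading ^ added as an assumption so that
// only lines beginning with "CHAPTER " count as chapter headings.
const CHAPTER_HEADING = /^CHAPTER [IVX\d]*./;

// Hypothetical shape for one parsed chapter.
interface Chapter {
  heading: string;
  body: string;
}

function splitChapters(text: string): Chapter[] {
  const chapters: Chapter[] = [];
  let heading: string | null = null;
  let lines: string[] = [];

  // Push the chapter collected so far, if any.
  const flush = (): void => {
    if (heading !== null) {
      chapters.push({ heading, body: lines.join("\n").trim() });
    }
  };

  for (const line of text.split(/\r?\n/)) {
    if (CHAPTER_HEADING.test(line)) {
      flush();
      heading = line.trim();
      lines = [];
    } else if (heading !== null) {
      lines.push(line);
    }
    // Lines before the first heading (front matter) are ignored.
  }
  flush();
  return chapters;
}
```

Note that the trailing . in the pattern matches any single character, so headings such as "CHAPTER IV." and "CHAPTER 12" both qualify.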

$ yarn run build && yarn run start
