LLM Pipelines with Pinecone and Hugging Face Using Python and Apache NiFi
This is part two of Vector Databases, LLMs, and Apache NiFi.
The NiFi Python processor for Pinecone requires OpenAI to transform the data for vector storage.
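Under the hood, that transformation is just an embedding call: text goes to OpenAI and a vector comes back. Here is a minimal sketch of that step; the model name and the client style (current openai SDK) are assumptions, not the processor's exact code.

```python
# A hedged sketch of the embedding step the processor delegates to OpenAI;
# the model name and client usage are assumptions (current openai SDK style).
EMBED_MODEL = "text-embedding-ada-002"  # assumed default, 1536-dim vectors

def embed(texts, client):
    # One API call returns one embedding per input string, in order
    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return [item.embedding for item in resp.data]

# Usage with a real key (not run here):
# from openai import OpenAI
# vectors = embed(["hello nifi"], OpenAI())
```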
Pinecone Vector Database Free Tier
Chroma was easy to use, but since Pinecone is also an option, I tried it as well: same flow, and everything was equally fast. As with the Chroma processor, I wanted an upgrade, so I upgraded this one too and submitted a pull request; it should land in the mainstream NiFi release soon.
We sign up for a free Pinecone account and create an index (nifi); then we can easily send data there. What is nice with Pinecone is that the console shows records as they are stored, which helps confirm everything is working.
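For reference, the index creation and upsert the processor performs can be sketched roughly as below. The helper and the dimension/region values are illustrative assumptions, not the processor's actual code; the Pinecone calls themselves are shown commented out since they need a live account.

```python
# Hypothetical helper that shapes (id, embedding, metadata) triples into the
# dict format Pinecone's upsert accepts; all names here are illustrative.
def to_vectors(ids, embeddings, metadatas):
    return [
        {"id": i, "values": v, "metadata": m}
        for i, v, m in zip(ids, embeddings, metadatas)
    ]

# With a real account (not run here), roughly:
# from pinecone import Pinecone, ServerlessSpec
# pc = Pinecone(api_key="YOUR_PINECONE_KEY")
# pc.create_index("nifi", dimension=1536, metric="cosine",
#                 spec=ServerlessSpec(cloud="aws", region="us-east-1"))
# pc.Index("nifi").upsert(vectors=to_vectors(ids, embs, metas))
```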
We will need to copy our Pinecone API key into the query and put processors, and also enter our OpenAI key, which is needed for tokenizing and encoding the data.
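The query side ties the two keys together: the question is embedded with OpenAI, and the resulting vector is sent to the Pinecone index. A small sketch of the query parameters is below; the helper is hypothetical, though the keyword names mirror Pinecone's Index.query arguments.

```python
# Hypothetical helper assembling the arguments for a Pinecone similarity
# query; keyword names mirror Index.query, the helper itself is a sketch.
def build_query(vector, top_k=5, namespace=""):
    return {
        "vector": vector,            # embedding of the question text
        "top_k": top_k,              # how many nearest matches to return
        "namespace": namespace,
        "include_metadata": True,    # return stored metadata with matches
    }

# With a live index (not run here):
# index.query(**build_query(question_vector, top_k=3))
```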