Crawl websites & upload documents to Pinecone with OpenAI or Gemini embeddings
Loading...
Per-project rules for recognizing identifying values (SKUs, part numbers, internal codes) on crawled pages. Matched values are written to vector metadata so retrieval can filter by them.
One per line. Leave blank to use built-in defaults.
One per line. Matches "Label: VALUE" patterns in page text.
Path fragments that precede an identifier in the URL.
Each named regex emits its own match_<name> metadata field.
Names are sanitized (lowercase, alphanumeric+underscore).