Pinecone indexes and searches vector representations of data to find the items most similar to a query. You can index billions of items in real time and search for the closest matches with millisecond latency.
Pinecone can be used through its REST API, which allows developers to create an index, insert their vector data, and query the index for vectors similar to an input. Vectors represent your data in an N-dimensional space and may be associated with metadata for ease of interpretation and filtering. Briefly, vector search works by calculating the distance between vectors and applying an algorithm to find the nearest ones, such as k-nearest neighbors (k-NN) or approximate nearest neighbor (ANN) search.
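The idea can be sketched with a tiny exact nearest-neighbor scan in plain Python. This is illustrative only (the item ids and vectors are made up, and Pinecone itself uses ANN indexes precisely to avoid this exhaustive comparison at scale):

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, vectors, k=2):
    """Return the ids of the k vectors closest to the query.

    `vectors` maps an id to its vector. This exhaustive scan is the
    exact k-NN baseline that ANN algorithms approximate at scale.
    """
    ranked = sorted(vectors, key=lambda vid: euclidean(query, vectors[vid]))
    return ranked[:k]

items = {
    "doc-a": [0.0, 1.0],
    "doc-b": [1.0, 0.0],
    "doc-c": [0.9, 0.1],
}
print(knn([1.0, 0.0], items, k=2))  # ['doc-b', 'doc-c']
```

Here "doc-b" is an exact match and "doc-c" is its near neighbor; an ANN index returns (approximately) the same answer without comparing the query against every stored vector.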
Pinecone 2.0 extends filtering capabilities by supporting single-stage filtering, which makes it possible to specify arbitrary filters on metadata and still retrieve the desired number of nearest neighbors that match them. Additionally, Pinecone 2.0 introduces hybrid storage to handle cases where billions of items cannot fit into RAM.
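A minimal sketch of why single-stage filtering matters, under assumed toy data: the metadata predicate is applied during the search, so the top-k always come from matching items, whereas a post-filtering approach (search first, filter after) can return fewer than k results when matches are sparse. This is not Pinecone's implementation, just an illustration of the semantics:

```python
from math import sqrt

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filtered_knn(query, items, predicate, k=2):
    """Single-stage filtering: restrict the candidate set by metadata
    *before* ranking by distance, so k matching neighbors come back
    whenever at least k items satisfy the predicate."""
    candidates = [(iid, vec) for iid, (vec, meta) in items.items() if predicate(meta)]
    candidates.sort(key=lambda pair: euclidean(query, pair[1]))
    return [iid for iid, _ in candidates[:k]]

items = {
    "a": ([0.0, 1.0], {"genre": "news"}),
    "b": ([1.0, 0.0], {"genre": "sports"}),
    "c": ([0.9, 0.1], {"genre": "news"}),
}
# Nearest "news" items to the query, even though "b" is globally closest:
print(filtered_knn([1.0, 0.0], items, lambda m: m["genre"] == "news", k=2))
# ['c', 'a']
```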
Besides new features, Pinecone 2.0 also introduces a new architecture and a new REST API based on OpenAPI.
InfoQ took the opportunity to speak with Edo Liberty, founder and CEO of Pinecone.
InfoQ: What is the importance of vector similarity search for today’s applications?
Liberty: In short, more relevant search results and recommendations lead to more effective applications, whether you measure effectiveness by user engagement, revenue, customer satisfaction, operational efficiency, or anything else.
Information retrieval is a core function in many applications such as search, recommendation, data management, and security systems. Many companies’ growth and revenue are impacted by how quickly, accurately, and reliably they can search through their data.
Vector similarity search — the new method of search which takes advantage of advances in Deep Learning — has proven itself at companies like Google, Microsoft, Facebook, and Amazon.
If you’ve recently marveled at the personalized product recommendations from Amazon, the sublime music recommendations from Spotify, the mystifyingly relevant search results from Google/Bing, or the can’t-look-away activity feeds from Facebook/LinkedIn/Twitter, then you’ve experienced the difference made by vector search. Those companies noticed the difference too, in their revenue and engagement numbers.
The success of vector search at some of the biggest consumer companies has raised the stakes for everyone else. Users now expect better recommendations and search results from all the companies they interact with, from social apps to workplace software.
However, recognizing the importance of vector search only gets you so far. Beyond a handful of tech hyperscalers who already have it, even large enterprise companies can struggle to implement vector search in production. For example: Enterprise software companies can make their users more productive by helping them find what they need quickly, but not if it creates a laggy experience; media platforms want to provide better content recommendations to drive engagement and retention, but only if it works as fast as their users can scroll. This is why it’s equally important to find a path for taking vector search from the lab to production.
InfoQ: What are Pinecone’s major strengths in comparison to alternative solutions?
Liberty: We’re making something for the product and ML teams that want to deploy vector search into production quickly and then scale it without incurring high operational costs. As a result, we differ in a few ways from the alternatives:
The first is production readiness. With one API call, a Pinecone user can spin up a vector search service with critical features such as metadata filtering, CRUD operations, live index updates, and horizontal scaling. There’s no infrastructure to build and maintain, no nearest-neighbor algorithms to tune, and no development work needed.
Next is high performance at scale. Similarity search is a computationally intensive process, but you might not notice that with a small dataset. It’s when you scale to a dataset of 10M, 100M, 1B (and higher) items that you start seeing latencies skyrocket and reliability plummet if you’re not using a system specifically designed to run similarity search at those levels. In Pinecone we spend more time on the architecture and engineering of the distributed system than anything else, so that when users scale from 1M to 100B items they don’t notice a thing.
This system is fully managed by Pinecone, running on our multi-tenant or dedicated environments. We obsess over operations and security so the user doesn’t have to. Every user gets the benefit of an expert team ensuring high performance, high availability, and support.
And finally, this costs up to 10x less than other managed services or the infrastructure costs of self-hosting an open-source solution. Vector searches typically run completely in-memory (RAM). For many companies with over a billion items in their catalog, the memory costs alone could make vector search too expensive to consider. Some vector search libraries have the option to store everything on-disk, but this could come at the expense of search latencies becoming unacceptably high. Pinecone offers a hybrid storage configuration, which provides the same fast and accurate search results yet dramatically cuts infrastructure costs.
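To see why in-memory costs bite at the billion-item scale, a back-of-the-envelope estimate helps (the 768-dimension, float32 choices are illustrative assumptions, not Pinecone specifics, and real indexes add graph and metadata overhead on top of the raw vectors):

```python
def index_ram_gb(num_vectors, dims, bytes_per_float=4):
    """Rough RAM, in GB, to hold raw float32 vectors only."""
    return num_vectors * dims * bytes_per_float / 1e9

# e.g. one billion 768-dimensional embeddings:
print(f"{index_ram_gb(1_000_000_000, 768):.0f} GB")  # 3072 GB
```

Roughly 3 TB of RAM for the raw vectors alone, before any index structures, which is the cost pressure that hybrid RAM/disk storage is designed to relieve.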
InfoQ: Could you provide some glimpses of Pinecone’s roadmap?
Liberty: Sure. We are laser-focused on making it easy for ML teams of any size (and with any data size) to integrate vector search into their search and recommender systems.
In the short term that means continuing to improve scalability, availability, deployment options, affordability, observability, security (and compliance), the REST API and its clients, and additional features related to handling millions or billions of records.
Pinecone 2.0 offers a free trial for evaluating its capabilities and a pay-per-use pricing model for production deployments.