In the evolving landscape of artificial intelligence and natural language processing, Retrieval-Augmented Generation (RAG) has emerged as a powerful technique for generating highly relevant and context-aware responses by combining retrieval mechanisms with generative models. Azure AI Search service, with its advanced capabilities, provides an ideal platform for implementing RAG. This blog will guide you through using Azure AI Search to create, manage, and query indexes for a robust RAG solution, including full-text search, vector search, hybrid search, and BM25 methodologies, along with scoring profiles.
What is Azure AI Search?
As developers using Microsoft Azure, we can use the Azure AI Search service, a cloud-based search-as-a-service solution. This service allows us to integrate advanced search functionalities into our applications without dealing with complex infrastructure or needing specialized expertise. By leveraging AI and natural language processing (NLP) technologies, Azure AI Search helps users find the information they need quickly and intuitively.
Advanced Features of Azure AI Search
- Full-Text Search: Azure AI Search supports full-text search across large volumes of structured and unstructured data, allowing users to search for keywords or phrases within documents, web pages, and databases.
- Natural Language Processing: With built-in NLP capabilities, Azure AI Search understands user queries in natural language and returns relevant results based on semantic understanding, synonyms, and context.
- Customizable Relevance Ranking: Developers can fine-tune search results using customizable relevance ranking algorithms, ensuring that the most relevant content appears at the top of the search results.
- Faceted Navigation: Azure AI Search enables faceted navigation, allowing users to refine search results based on predefined categories, attributes, or metadata.
- Geo-Spatial Search: Geo-spatial search capabilities enable location-based searching, making it ideal for applications that require location-aware functionality, such as mapping services or store locators.
- Analytics and Monitoring: Azure AI Search provides insights into user search behavior, query performance, and search result effectiveness through built-in analytics and monitoring tools.
Implementation Guide
Now that we've covered the basics, let's walk through the implementation process for the Azure AI Search service.
- Create an Azure Search Service: We create an Azure Search service in the Azure portal. We choose a unique name, select the appropriate pricing tier based on our requirements, and specify the location for our search service.
- Define Indexes: We define indexes to set the data structure we want to search. This involves specifying fields, data types, and search behaviors, such as full-text or geo-spatial searches.
- Ingest Data: We ingest data into Azure AI Search using data ingestion pipelines or APIs once the indexes are defined. Azure AI Search supports various data sources, including Azure Blob Storage, Azure SQL Database, Cosmos DB, and more.
- Querying Data: We implement search functionality in our application using the Azure Search REST API, .NET SDK, or other supported SDKs and client libraries. We construct queries based on user input and retrieve relevant search results from Azure AI Search.
- Customize Search Experience: We fine-tune the search experience by customizing relevance ranking, implementing faceted navigation, and leveraging other advanced features of Azure AI Search to meet our application's specific requirements.
- Monitor and Optimize: We continuously monitor search performance, user engagement, and query effectiveness using built-in analytics and monitoring tools. We use the insights gathered to optimize search relevance, improve user experience, and drive business outcomes.
Creating the Azure AI Search Service
To get started, we need to create an Azure AI Search service. This service will act as the foundation for indexing and querying your data.
- Create Search Service:
- Go to the Azure portal.
- Navigate to "Create a resource" and search for "Azure AI Search" (formerly "Azure Cognitive Search").
- Choose the subscription, create a new resource group, and provide a unique name for the search service.
- Select a suitable pricing tier based on our needs and create the service.
Creating and Managing Indexes
Indexes are the core of the Azure AI Search service, defining the schema for your searchable content.
1. Define Index:
First, we install the Azure AI Search SDK for Python and add the required imports to our script. Next, we copy the service endpoint and an admin API key from the search service we created and use them to configure a client. We then define the index schema, which lists each field with its data type and attributes, together with the vector search configuration. Finally, we call the create_index method to create the index in Azure AI Search. Below is the code snippet.
# Install the Azure AI Search SDK (run in a notebook cell)
!pip install azure-search-documents==11.4.0

# Import the necessary libraries.
# Note: the class names below follow the 11.4.0 GA release; earlier 11.4.0 beta
# builds used different names (for example VectorSearchAlgorithmConfiguration).
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration, HnswParameters, SearchableField, SearchField,
    SearchFieldDataType, SearchIndex, SimpleField, VectorSearch, VectorSearchProfile,
)

# Configure dev environment variables (placeholders)
service_endpoint = "https://<your-search-service-name>.search.windows.net"
key = "your-search-service-key"
index_name = "your-index-name"
credential = AzureKeyCredential(key)

# Define the search index client
index_client = SearchIndexClient(endpoint=service_endpoint, credential=credential)

# Index schema. SearchableField is full-text searchable; SimpleField is not.
# Fields are retrievable by default unless hidden=True is set.
fields = [
    SearchableField(name="id", type=SearchFieldDataType.String, key=True, filterable=True),
    SearchableField(name="title", type=SearchFieldDataType.String, filterable=True),
    SimpleField(name="page_no", type=SearchFieldDataType.String, filterable=True),
    SimpleField(name="token_len_section", type=SearchFieldDataType.Double, filterable=True),
    SearchableField(name="section", type=SearchFieldDataType.String),
    SearchableField(name="category", type=SearchFieldDataType.String),
    SearchField(name="section_vector",
                type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True, vector_search_dimensions=1536,
                vector_search_profile_name="my-vector-profile"),
    SearchableField(name="LOAD_FILE_TIME", type=SearchFieldDataType.String, filterable=True),
    SearchableField(name="expiry_date", type=SearchFieldDataType.String, filterable=True),
    SimpleField(name="active_flag", type=SearchFieldDataType.Double, filterable=True),
    SimpleField(name="repeat_flag", type=SearchFieldDataType.Double, filterable=True),
    SimpleField(name="batch_number", type=SearchFieldDataType.Double, filterable=True),
]

# HNSW algorithm settings, plus the profile that the vector field refers to
vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="my-vector-config",
            parameters=HnswParameters(m=4, ef_construction=400, ef_search=500, metric="cosine"),
        )
    ],
    profiles=[
        VectorSearchProfile(name="my-vector-profile",
                            algorithm_configuration_name="my-vector-config")
    ],
)

# Create the search index with the vector search settings
index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
result = index_client.create_index(index)
print(f"{result.name} created")
2. Load Data into Index:
We can use the 'upload_documents' function to load data into the index we created. First, we read a JSON file whose records follow the index schema and load its contents into a Python variable. Next, we initialize a SearchClient for the target index and upload the loaded documents. Finally, a print statement reports the number of uploaded documents. Data can be uploaded with either the Azure Search SDK or the REST API. Below is an example using Python and the Azure SDK.
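import json

# Load documents prepared in the format of the index schema
with open(r"path_of_json_file.json", "r") as file:
    documents = json.load(file)

# index_name is the name of the index created above
search_client = SearchClient(endpoint=service_endpoint, index_name=index_name, credential=credential)
result = search_client.upload_documents(documents=documents)

# Each entry in result reports whether the corresponding document was indexed
uploaded = sum(1 for r in result if r.succeeded)
print(f"Uploaded {uploaded} of {len(documents)} documents")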
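For larger files, documents are usually uploaded in batches rather than in one call. To our knowledge, a single indexing request accepts up to 1,000 documents, so that is the default used here. The helper below is a minimal sketch of such batching; the function name chunked and the sample documents are illustrative assumptions, not part of the SDK.

```python
from typing import Dict, Iterator, List


def chunked(documents: List[Dict], batch_size: int = 1000) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `documents` with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Illustrative documents shaped like the index schema (values are made up)
sample_documents = [
    {"id": str(i), "title": f"Doc {i}", "section": "text", "category": "demo"}
    for i in range(2500)
]

batch_sizes = [len(batch) for batch in chunked(sample_documents)]
print(batch_sizes)  # [1000, 1000, 500]
```

In practice, each batch would be passed to search_client.upload_documents(documents=batch) inside the loop.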
Extracting Data from Index
To enhance retrieval, Azure AI Search offers hybrid search and semantic ranking. Hybrid search runs keyword and vector retrieval together and uses Reciprocal Rank Fusion (RRF) to merge the two result lists into a single ranking. The semantic ranker then performs a secondary ranking on the initial BM25-ranked or RRF-ranked results, using multilingual deep-learning models to promote the most semantically relevant results. Enabling the semantic ranker is simple: set the "query_type" parameter to "semantic" and reference a semantic configuration defined on the index. Combining the semantic ranker with hybrid search is generally the most effective way to improve relevance out of the box within the Azure AI Search stack.
To retrieve data from the index, we can use the various query types provided by Azure AI Search.
1. Full-Text Query Search:
Full-text search is an approach to information retrieval that matches against the plain text stored in an index. A full-text query is passed in the search parameter and consists of terms, quote-enclosed phrases, and operators. Other parameters, such as filter, select, and top, refine the request. We perform a full-text search using the search method. Below is an example using Python and the Azure SDK.
# Full-text search (e.g. 'tools for software development' in Dutch)
query = "tools voor softwareontwikkeling"
search_client = SearchClient(service_endpoint, index_name, AzureKeyCredential(key))
results = search_client.search(
    search_text=query,
    select=["title", "section", "category"],  # fields must exist in the index
)
for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['section']}")
    print(f"Category: {result['category']}\n")
2. Vector Search:
Vector search is a method of searching for information across various data types, including images, audio, text, and video. Results are determined by the similarity of numerical representations of the data, called vector embeddings. How well vector search retrieves appropriate information depends on how well the embedding model captures the meaning of documents and queries in the resulting vectors. This is essential for RAG implementations, where embeddings are used. Below is an example of generating an embedding for the query and performing a vector search using Python and the Azure SDK.
# Function to generate embeddings for the section text; also used for query embeddings.
# Note: openai.Embedding.create is the interface of the openai package before 1.0;
# with Azure OpenAI, `engine` is the name of the embedding model deployment.
import openai
from azure.search.documents.models import VectorizedQuery

def generate_embeddings(text):
    response = openai.Embedding.create(
        input=text, engine="text-embedding-ada-002")  # 1536 dimensions, matching the index field
    embeddings = response['data'][0]['embedding']
    return embeddings

# Pure vector search, multilingual (e.g. 'tools for software development' in Dutch)
query = "tools voor softwareontwikkeling"
search_client = SearchClient(service_endpoint, index_name, AzureKeyCredential(key))
results = search_client.search(
    search_text=None,  # no keyword query, vector only
    vector_queries=[VectorizedQuery(vector=generate_embeddings(query),
                                    k_nearest_neighbors=3, fields="section_vector")],
    select=["title", "section", "category"],
)
for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['section']}")
    print(f"Category: {result['category']}\n")
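To make "similarity of vector embeddings" concrete, here is a small sketch of cosine similarity, the metric configured for the index above. The three-dimensional vectors and document names are made up for illustration; real embeddings have 1536 dimensions, and the service uses an approximate HNSW index rather than the exhaustive comparison shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values)
query_vector = [1.0, 0.0, 1.0]
doc_vectors = {
    "doc_a": [1.0, 0.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}

# Rank documents from most to least similar to the query
ranked = sorted(
    doc_vectors,
    key=lambda doc_id: cosine_similarity(query_vector, doc_vectors[doc_id]),
    reverse=True,
)
print(ranked)  # ['doc_a', 'doc_c', 'doc_b']
```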
3. Hybrid Search (Vector + Full-Text):
In Azure AI Search, hybrid search combines full-text and vector queries into a single search request, which executes the queries in parallel and merges the results in the response. The merged results are ranked with Reciprocal Rank Fusion (RRF), which promotes documents that rank highly in either query, and the response is ordered by the resulting search score. Combining vector and full-text search in this way gives more robust results than either method alone. Below is an example using Python and the Azure SDK.
# Hybrid search: the same query is sent as keywords and as a vector
from azure.search.documents.models import VectorizedQuery

query = "scalable storage solution"
search_client = SearchClient(service_endpoint, index_name, AzureKeyCredential(key))
results = search_client.search(
    search_text=query,
    vector_queries=[VectorizedQuery(vector=generate_embeddings(query),
                                    k_nearest_neighbors=3, fields="section_vector")],
    select=["title", "section", "category"],
    top=3,
)
for result in results:
    print(f"Title: {result['title']}")
    print(f"Content: {result['section']}")
    print(f"Category: {result['category']}\n")
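The merging step can be illustrated with a few lines of Python. The sketch below implements the textbook form of Reciprocal Rank Fusion, where each document receives 1 / (k + rank) from every ranking it appears in. The constant k = 60 is the value commonly used for RRF and is an assumption here, as are the document IDs; the service performs this step internally.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Sum 1 / (k + rank) over every ranking in which a document appears."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


# Hypothetical result lists from the keyword query and the vector query
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
merged = sorted(fused, key=fused.get, reverse=True)
print(merged)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists (doc_a and doc_c) rise above documents found by only one method, which is the behavior hybrid search relies on.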
Implementing BM25 Methodologies
BM25 is a popular ranking function used to rank documents based on the query terms appearing in each document.
1. BM25 Similarity Configuration:
When scoring a document against a query, BM25 takes into account both term frequency (TF), with saturation, and document length. It is based on a probabilistic retrieval framework, which presumes that relevant and non-relevant documents follow different statistical distributions. The similarity algorithm is selected on a per-index basis, and the k1 and b parameters are configured when the index is created.
- k1: Controls non-linear term frequency normalization (saturation).
- b: Controls to what degree document length normalizes TF values.
Both BM25 and Classic are TF-IDF-like retrieval functions that use the term frequency (TF) and the inverse document frequency (IDF) as variables to calculate a relevance score for each document-query pair, which is then used to rank results.
# Create the index through the REST API, with BM25 similarity parameters
import requests

api_version = "2023-11-01"  # assumed REST API version; use one your service supports
api_key = key               # admin API key of the search service
service_url = f"{service_endpoint}/indexes?api-version={api_version}"

# Define the index schema and parameters along with the fields in the format below
json_body = {
    'name': index_name,
    'fields': [
        # Every index needs exactly one key field
        {'name': 'id', 'type': 'Edm.String', 'key': True,
         'searchable': False, 'filterable': True, 'retrievable': True},
        {'name': 'sample_field_name',
         'type': 'Collection(Edm.Single)',
         'searchable': True,
         'filterable': False,
         'retrievable': True,
         'sortable': False,
         'facetable': False,
         'key': False,
         'indexAnalyzer': None,
         'searchAnalyzer': None,
         'analyzer': None,
         'dimensions': 1536,
         'vectorSearchProfile': 'myHnswProfile',
         'synonymMaps': []}],
    'scoringProfiles': [],
    'corsOptions': None,
    'suggesters': [],
    'analyzers': [],
    'tokenizers': [],
    'tokenFilters': [],
    'charFilters': [],
    'encryptionKey': None,
    # BM25 parameters: k1 sets TF saturation, b sets document length normalization
    'similarity': {'@odata.type': '#Microsoft.Azure.Search.BM25Similarity',
                   'k1': 3,
                   'b': 0.75},
    'semantic': None,
    'vectorSearch': {'algorithms': [{'name': 'myHnsw',
                                     'kind': 'hnsw',
                                     'hnswParameters': {'metric': 'cosine',
                                                        'm': 4,
                                                        'efConstruction': 400,
                                                        'efSearch': 500},
                                     'exhaustiveKnnParameters': None}],
                     'profiles': [{'name': 'myHnswProfile', 'algorithm': 'myHnsw'}]}}

# Define headers (the API version is passed in the URL, not as a header)
headers = {
    'api-key': api_key,                  # Pass API key
    'Content-Type': 'application/json'   # Specify content type as JSON
}

# Make the POST request
response = requests.post(url=service_url, headers=headers, json=json_body)
# Print response
print("reason:", response.reason, "status_code:", response.status_code, "text:", response.text)
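To see what k1 and b do, the following sketch scores documents with the textbook BM25 formula. It is an illustration under stated assumptions (whitespace tokenization, a three-document toy corpus, the common logarithmic IDF variant) and not the service's exact implementation.

```python
import math
from collections import Counter
from typing import List


def bm25_score(query_terms: List[str], doc_terms: List[str],
               corpus: List[List[str]], k1: float = 1.2, b: float = 0.75) -> float:
    """Textbook BM25 score of one document for a list of query terms."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    term_freq = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        doc_freq = sum(1 for doc in corpus if term in doc)
        tf = term_freq[term]
        if doc_freq == 0 or tf == 0:
            continue
        idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
        # b controls how strongly document length scales the TF contribution
        length_norm = 1 - b + b * len(doc_terms) / avg_len
        # k1 controls how quickly repeated terms stop adding to the score
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


# Toy corpus (made-up text): a short document and a long one that repeats the term
corpus = [
    "azure search index".split(),
    "azure search search search service with many extra words in the text".split(),
    "vector embeddings".split(),
]
short_doc, long_doc = corpus[0], corpus[1]
query = ["search"]

for b in (0.0, 1.0):
    s = bm25_score(query, short_doc, corpus, b=b)
    l = bm25_score(query, long_doc, corpus, b=b)
    print(f"b={b}: short={s:.3f} long={l:.3f}")
```

With b = 0 the long document wins because it repeats the term; with b = 1 full length normalization lets the short document win. Setting k1 = 0 removes the effect of term frequency entirely.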
Adding Scoring Profiles
A scoring profile is a set of criteria we can use to change the default ranking in Azure AI Search. We use a scoring profile to model relevance based on document values. For example, a weighted field can make a match in a "tags" field more relevant than the same match in a "descriptions" field. A freshness function can boost documents, such as clinical trials, whose lastUpdateCreated date falls within the past year. We can also rank documents higher based on a priority metadata field. In short, scoring profiles allow customization of search result rankings based on various criteria.
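As a rough sketch of the examples above, a scoring profile in the REST index definition could look like the dictionary below. The profile name and the weight and boost values are hypothetical; the field names come from the examples in the text, and a freshness function assumes the field is a filterable Edm.DateTimeOffset field.

```python
import json

# Hypothetical scoring profile combining field weights and a freshness boost
scoring_profile = {
    "name": "boost-tags-and-recent",  # illustrative name
    "text": {
        # A match in "tags" counts five times as much as one in "descriptions"
        "weights": {"tags": 5, "descriptions": 1}
    },
    "functions": [
        {
            "type": "freshness",
            "fieldName": "lastUpdateCreated",  # assumed Edm.DateTimeOffset, filterable
            "boost": 2,
            "interpolation": "linear",
            # ISO 8601 duration: boost documents updated within the past 365 days
            "freshness": {"boostingDuration": "P365D"},
        }
    ],
    "functionAggregation": "sum",
}

# The profile goes into the 'scoringProfiles' list of the index definition
index_fragment = {"scoringProfiles": [scoring_profile]}
print(json.dumps(index_fragment, indent=2))
```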
1. Define Scoring Profile:
We create a scoring profile and add it to the index definition. A scoring profile can also be added to, or updated on, an existing index.
from azure.core import MatchConditions
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import ScoringProfile, TextWeights

# Initialize the SearchIndexClient (it is scoped to the service, not to one index)
index_client = SearchIndexClient(endpoint=service_endpoint, credential=credential)
# Define a scoring profile with text weights
scoring_profile = ScoringProfile(
    name="section-weight-boost",
    text_weights=TextWeights(weights={"title": 1.5, "section": 5})
)
# Retrieve the current index definition, which carries the ETag
index = index_client.get_index(index_name)
# Add or update the scoring profile in the index
index.scoring_profiles = [scoring_profile]  # Replaces any existing scoring profiles
# Update the index only if it has not changed since it was read (ETag check)
index_response = index_client.create_or_update_index(
    index, match_condition=MatchConditions.IfNotModified)
# If you need to see the response, you can print it out
print(index_response)
2. Use Scoring Profile in Query:
We apply the scoring profile to a search query by passing its name in the 'scoring_profile' parameter. In the same request, we can choose which index fields to return, specify the field on which to perform vector search, and supply the text for the full-text search. The 'top' parameter limits the number of retrieved records to the given value.
from azure.search.documents.models import VectorizedQuery

results = search_client.search(
    search_text=query,
    vector_queries=[VectorizedQuery(vector=generate_embeddings(query),
                                    k_nearest_neighbors=20,
                                    fields="section_vector")],  # vector field of the index
    query_type="semantic",
    semantic_configuration_name="my-semantic-config",  # must be defined on the index
    scoring_profile="section-weight-boost",
    filter="active_flag eq 1",
    top=20,
    select=["title", "section", "category"],
)
Conclusion
Azure AI Search service empowers developers to deliver fast, relevant, and personalized application search experiences. By leveraging the power of AI and NLP, Azure AI Search enables users to find the information they need quickly and intuitively, driving user engagement and business success. With this comprehensive implementation guide, you're equipped to harness the full potential of Azure AI Search and unlock new possibilities for your applications.
Author
Akshaykumar Torangatti
Consultant, Data Science