Implementing Tool Calling in Large Language Models (LLMs)
Implementing tool calling in Large Language Models (LLMs) significantly enhances their capabilities by enabling interaction with external tools and APIs. This integration allows LLMs to perform tasks beyond text generation, such as executing code, retrieving real-time data, and automating complex workflows.
Understanding Tool Calling in LLMs
Tool calling refers to the ability of LLMs to invoke external functions or APIs during their operation. This functionality extends the model’s utility by allowing it to:
- Access External Data: Retrieve up-to-date information from the internet or databases.
- Execute Code: Perform computations or run scripts to solve specific problems.
- Automate Tasks: Interact with other software systems to complete tasks like scheduling or data entry.
Core Components of Tool Calling
- Tool Definition: Tools are defined as functions with specified input and output schemas. In frameworks like LangChain, the @tool decorator facilitates this process.
- Tool Binding: Integrate these tools with the LLM to make the model aware of the available functions and their schemas. This integration allows the model to generate responses that align with each tool's input requirements.
- Tool Invocation: When appropriate, the LLM can decide to call a tool, ensuring its response conforms to the tool’s input schema. This enables the model to perform specific tasks, such as calculations or data retrieval, by leveraging external functions.
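Before diving into the full example, here is a minimal sketch of all three components together, using LangChain's @tool decorator and bind_tools. The add tool, the query, and the model name are illustrative choices, not requirements:

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b

# Tool binding: the model now knows the tool's name, description, and schema.
llm_with_tools = ChatOpenAI(model="gpt-3.5-turbo").bind_tools([add])

# Tool invocation: instead of plain text, the model emits a structured tool call.
ai_msg = llm_with_tools.invoke("What is 2 + 3?")
print(ai_msg.tool_calls)
# e.g. [{'name': 'add', 'args': {'a': 2, 'b': 3}, 'id': '...', 'type': 'tool_call'}]

Note that the docstring matters: it becomes the tool's description, which is what the model uses to decide when to call it.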
Implementing Tool Calling: A Step-by-Step Guide
1. Define the Tool: Create a function that performs a specific task, such as fetching real-time data or executing a calculation. Use appropriate decorators or definitions to specify the input and output schemas.
from langchain_core.tools import tool
from langchain_community.tools import DuckDuckGoSearchResults
import tempfile

# `pipeline` is the project's local text-to-image pipeline (for example, a
# Diffusers pipeline) exposed from a `models` module.
from models import pipeline

@tool
def make_images_api(image_text_prompt: str) -> str:
    """
    Generate an image from the text prompt using the pipeline, save it to a
    temporary path, and return the path of the saved image.

    Args:
        image_text_prompt (str): The text prompt for image generation.

    Returns:
        str: Path to the saved image.
    """
    image = pipeline(prompt=image_text_prompt).images[0]
    # delete=False keeps the file around after the handle is closed.
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as temp_file:
        temp_path = temp_file.name
    image.save(temp_path, format="PNG")
    return temp_path

@tool
def search_in_web(users_query: str) -> str:
    """
    Perform a web search using DuckDuckGo and return the search results.

    This function uses the DuckDuckGoSearchResults tool to execute a web
    search based on the provided query and returns the results for further
    processing.

    Args:
        users_query (str): The search query to be performed on the web.

    Returns:
        str: The search results from DuckDuckGo as a formatted string.

    Example:
        >>> search_results = search_in_web.invoke("Python programming")
    """
    search = DuckDuckGoSearchResults()
    return search.invoke(users_query)

# Map tool names to tool objects so the model's tool calls can be dispatched.
tool_mapping = {
    "make_images_api": make_images_api,
    "search_in_web": search_in_web,
}
2. Bind the Tool to the LLM: Use the binding methods provided by your AI framework to attach the tools to the model, enabling the LLM to recognize and utilize each function when generating responses.
3. Model Invocation: When the LLM encounters a query requiring the tool’s functionality, it generates a response that includes a call to the tool with the necessary arguments.
4. Tool Execution: The specified tool executes with the provided arguments, and the result is integrated into the LLM’s final response to the user.
from dataclasses import dataclass
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(model="gpt-3.5-turbo")
llm_with_tools = llm.bind_tools([make_images_api, search_in_web])

@dataclass
class ChatResponse:
    question: str
    answer: str
    document_path: str

system_prompt = SystemMessage(content="""
You are an intelligent assistant capable of handling a variety of tasks. Your main goal is to understand the user's query and act accordingly.
If the user asks to generate an image, make sure to call the appropriate image generation tools. If the user provides a file path for image extraction or processing, follow these steps for efficient extraction:
1. **Detect image generation requests:** If the user asks to generate or create an image, check whether a description is provided. Use the appropriate tool to generate the image, and return the generated file path as:
STRICTLY use the format FILE_PATH: "<path_to_generated_image>" and no other format.
2. **Handle image file paths:** If the user provides a file path (e.g., text responses, image paths, or specific directories for processing), extract and respond with the file path efficiently. Ensure that the path is clear and formatted correctly for further processing.
3. **Understand user intent:** Make sure you correctly interpret whether the query is about generating an image, querying an existing one, or dealing with file paths. Always ask for clarification if necessary.
""")

def answer_question(user_question: str, is_web_needed: bool = False) -> ChatResponse:
    current_tool = ""
    messages = [system_prompt, HumanMessage(user_question)]
    ai_msg = llm_with_tools.invoke(messages)
    messages.append(ai_msg)
    # tool_calls is an empty list when the model answers directly, so test
    # truthiness rather than comparing against None.
    if ai_msg.tool_calls:
        for tool_call in ai_msg.tool_calls:
            selected_tool = tool_mapping[tool_call["name"].lower()]
            current_tool = tool_call["name"].lower()
            # Invoking a tool with the full tool_call dict returns a ToolMessage.
            tool_msg = selected_tool.invoke(tool_call)
            messages.append(tool_msg)
        res = llm.invoke(messages)
        if current_tool == "make_images_api":
            # handle_image_response (defined elsewhere in the project) parses the
            # FILE_PATH from the reply and builds a ChatResponse with document_path set.
            return handle_image_response(res=res.content, user_question=user_question)
        elif current_tool == "search_in_web":
            return ChatResponse(question=user_question, answer=res.content, document_path="")
    else:
        return ChatResponse(question=user_question, answer=ai_msg.content, document_path="")
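Assuming an OpenAI API key is configured in the environment, the function can then be called directly; the query below is just an example:

response = answer_question("Search the web for the latest LangChain release notes.")
print(response.answer)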
Benefits of Tool Calling in LLMs
- Enhanced Capabilities: LLMs can perform tasks beyond their training data, such as real-time information retrieval and complex computations.
- Improved Accuracy: By leveraging external tools, LLMs can provide more precise and reliable responses, especially in domains requiring up-to-date information.
- Task Automation: Enables the automation of complex workflows by allowing LLMs to interact with various software systems and APIs.
Challenges and Considerations
- Security: Ensure that tool-calling mechanisms are secure to prevent unauthorized access or execution of malicious code.
- Error Handling: Implement robust error handling to manage cases where tool execution fails or returns unexpected results; a minimal sketch follows this list.
- Performance: Monitor the performance impact of tool calling, as frequent external calls may introduce latency.
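On the error-handling point, a minimal defensive wrapper around the earlier tool_mapping dispatch might look like the sketch below; the fallback messages are illustrative, not part of any library API:

from langchain_core.messages import ToolMessage

def run_tool_safely(tool_call: dict) -> ToolMessage:
    """Execute a requested tool without letting a failure crash the chat loop."""
    # tool_mapping doubles as an allowlist: unknown tool names are rejected here,
    # which also addresses part of the security concern above.
    selected_tool = tool_mapping.get(tool_call["name"].lower())
    if selected_tool is None:
        return ToolMessage(content=f"Unknown tool: {tool_call['name']}",
                           tool_call_id=tool_call["id"])
    try:
        return selected_tool.invoke(tool_call)
    except Exception as exc:
        # Report the failure to the model so it can explain or retry instead
        # of the application raising an unhandled exception.
        return ToolMessage(content=f"Tool '{tool_call['name']}' failed: {exc}",
                           tool_call_id=tool_call["id"])

Returning the failure as a ToolMessage keeps the conversation history consistent, so the model can tell the user what went wrong rather than the whole request failing.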
Conclusion
Implementing tool calling in LLMs significantly enhances their functionality, enabling them to perform a broader range of tasks and provide more accurate responses. By following the outlined steps and considering the associated challenges, developers can effectively integrate tool calling into their AI applications, leading to more versatile and powerful language models.
Resources
Here are the resources I used while writing this article; I hope they help:
- https://python.langchain.com/v0.1/docs/modules/tools/
- https://python.langchain.com/v0.1/docs/use_cases/tool_use/
- https://github.com/langchain-ai/langchain/blob/master/docs/docs/how_to/tool_calling.ipynb
Thanks for reading! My name is Abhishek, and I have a passion for building apps and learning new technologies.