🤗Hugging Face Transformers Agent


Comparisons with 🦜🔗LangChain Agent

Just two days ago, 🤗Hugging Face released Transformers Agent — an agent that uses natural language to choose a tool from a curated collection of tools and accomplish various tasks. Does it sound familiar? It should, because it’s a lot like 🦜🔗LangChain Tools and Agents. In this blog post, I will cover what Transformers Agent is and how it compares with 🦜🔗LangChain Agent.

You can try out the code in this colab (provided by Hugging Face).

In short, it provides a natural language API on top of transformers: we define a set of curated tools and design an agent to interpret natural language and to use these tools.

I can imagine the engineers at Hugging Face thinking: we have so many amazing models hosted on the Hub. Can we integrate those with LLMs? Can we use LLMs to decide which model to use, write the code, run the code, and generate results? Essentially, nobody needs to learn all the complicated task-specific models anymore. Just give it a task, and the LLM (agent) will do everything for us.

Here are the steps:

Source: https://huggingface.co/docs/transformers/transformers_agents
  • Instruction: the prompt the user provides.
  • Prompt: a prompt template into which the specific instruction is inserted; it lists the tools available to the agent.
  • Tools: a curated list of transformers models, e.g., Flan-T5 for question answering.
  • Agent: an LLM that interprets the question, decides which tools to use, and generates code to perform the task with those tools.
  • Restricted Python interpreter: executes the generated Python code.

Step 1: Instantiate an agent.

Step 1 is to instantiate an agent. An agent is just an LLM, which can be an OpenAI model, a StarCoder model, or an OpenAssistant model.

The OpenAI model needs an OpenAI API key, and its usage is not free. We load the StarCoder model and the OpenAssistant model from the Hugging Face Hub, which requires a Hugging Face Hub API token and is free to use.

from transformers import OpenAiAgent, HfAgent
from huggingface_hub import login

# OpenAI (requires a paid OpenAI API key)
agent = OpenAiAgent(model="text-davinci-003", api_key="<your_api_key>")

# Log in to the Hugging Face Hub (free) to use the hosted inference endpoints below
login("<YOUR_TOKEN>")

# StarCoder
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# OpenAssistant
agent = HfAgent(url_endpoint="https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")

Step 2: Run the agent.

agent.run is a single-execution method that selects the tool for the task automatically, e.g., selecting the image-generator tool to create an image.

agent.chat keeps the chat history across calls. For example, it knows we generated a picture earlier, so it can transform that image in a follow-up request.
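For concreteness, here is a minimal sketch of both methods, modeled on the examples in the colab (the exact prompts are illustrative):

# agent.run: a standalone call with no memory of previous calls
picture = agent.run("Generate an image of a boat in the water")

# agent.chat: keeps state across calls, so follow-up requests can refer
# back to what was generated earlier
agent.chat("Generate a picture of rivers and lakes")
agent.chat("Transform the picture so that there is a rock in there")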

Transformers Agent is still experimental. Its scope is a lot smaller and it is less flexible. The main focus of Transformers Agent right now is using Transformer models and executing Python code, whereas LangChain Agent does “almost” everything. Let me break it down to compare the different components of Transformers and LangChain Agents:

Tools

  • 🤗Hugging Face Transformers Agent has an amazing list of tools, each powered by transformer models. These tools offer three significant advantages: 1) Even though Transformers Agent can only interact with a few tools currently, it has the potential to communicate with over 100,000 Hugging Face models, and it possesses full multimodal capabilities, encompassing text, images, video, audio, and documents; 2) since these models are purpose-built for specific tasks, using them can be more straightforward and yield more accurate results than relying solely on LLMs. For example, instead of designing prompts for the LLM to perform text classification, we can simply deploy BART, which is designed for text classification; 3) these tools unlock capabilities that LLMs alone can’t accomplish. Take BLIP, for example, which enables us to generate captivating image captions — a task beyond the scope of LLMs.
  • 🦜🔗LangChain tools are all external APIs, such as Google Search and the Python REPL. In fact, LangChain supports Hugging Face Tools via the load_huggingface_tool function (see the sketch after this list), so LangChain can potentially already do a lot of what Transformers Agent can do. On the other hand, Transformers Agent could potentially incorporate all the LangChain tools as well.
  • In both cases, each tool is just a Python file. You can find the files of the 🤗Hugging Face Transformers Agent tools here and the 🦜🔗LangChain tools here. As you can see, each Python file contains one class defining one tool.
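As a quick illustration of that crossover, here is a minimal sketch of pulling a Hugging Face tool into LangChain with load_huggingface_tool; the tool repo name is the example from the LangChain docs, and the import path reflects the LangChain version current at the time of writing:

from langchain.agents import load_huggingface_tool

# Load a tool hosted on the Hugging Face Hub
tool = load_huggingface_tool("lysandre/hf-model-downloads")
print(tool.name, "-", tool.description)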

Agent

  • 🤗Hugging Face Transformers Agent uses this prompt template to determine which tool to use based on the tool’s description. It asks the LLM to provide an explanation, and the prompt includes a few few-shot learning examples.
  • 🦜🔗LangChain by default uses the ReAct framework to determine which tool to use based on the tool’s description. The ReAct framework is described in this paper. It not only acts on a decision but also provides thoughts and reasoning, similar to the explanations Transformers Agent uses. In addition, 🦜🔗LangChain has four agent types.
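For reference, a minimal sketch of a default ReAct-style LangChain agent looks something like this (assuming the OpenAI and SerpAPI keys are set in your environment; import paths reflect the LangChain version at the time of writing):

from langchain.agents import initialize_agent, load_tools, AgentType
from langchain.llms import OpenAI

llm = OpenAI(temperature=0)
tools = load_tools(["serpapi"], llm=llm)

# ZERO_SHOT_REACT_DESCRIPTION chooses tools from their descriptions
# using the ReAct thought/action/observation loop
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("Who founded Hugging Face?")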

Custom Agent

Creating a custom agent is not too difficult in both cases (see the sketch after the list below):

  • See the HuggingFace Transformer Agent example towards the end of this colab.
  • See the LangChain example here.
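On the Transformers Agent side, a custom tool is just a class with a name, a description, and a __call__ method, which you can hand to the agent via additional_tools. Here is a minimal sketch; the text_reverser tool is hypothetical, purely for illustration:

from transformers import Tool, HfAgent

class TextReverser(Tool):
    # Hypothetical tool: reverses the characters of a string
    name = "text_reverser"
    description = ("This tool reverses a piece of text. It takes `text` as input "
                   "and returns the reversed text.")

    def __call__(self, text: str) -> str:
        return text[::-1]

# Register the custom tool alongside the default ones
agent = HfAgent(
    "https://api-inference.huggingface.co/models/bigcode/starcoder",
    additional_tools=[TextReverser()],
)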

“Code-execution”

  • 🤗Hugging Face Transformers Agent includes “code-execution” as one of the steps after the LLM selects the tools and generates the code. This restricts the Transformers Agent’s goal to executing Python code (see the sketch below).
  • 🦜🔗LangChain includes “code-execution” as one of its tools, which means that executing code is not necessarily the last step of the whole process. This provides a lot more flexibility in what the task goal is: it could be executing Python code, or it could be something else, like doing a Google Search and returning the search results.
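A minimal sketch of the contrast (return_code is part of the documented agent.run API; the LangChain import path reflects the version at the time of writing):

# Transformers Agent: the LLM writes code, and executing it is the final step.
# return_code=True returns the generated code instead of running it.
code = agent.run("Caption the following `image`", return_code=True)
print(code)

# LangChain: the Python REPL is just one tool; the agent may call it mid-chain
from langchain.agents import initialize_agent, AgentType
from langchain.llms import OpenAI
from langchain.tools.python.tool import PythonREPLTool

repl_agent = initialize_agent(
    [PythonREPLTool()], OpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
repl_agent.run("What is 2 ** 16?")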

In this blog post, we explored the functionality of 🤗Hugging Face Transformers Agents and compared it to 🦜🔗LangChain Agents. I look forward to witnessing further developments and advancements in Transformers Agent.

. . .

By Sophia Yang on May 12, 2023

Sophia Yang is a Senior Data Scientist. Connect with me on LinkedIn, Twitter, and YouTube and join the DS/ML Book Club ❤️



