Microsoft open-sources TaskWeaver, a ChatGPT-based code-first agent

Author: Not bald programmer
With the emergence of generative AI products such as ChatGPT, large language models have made great progress in application and commercialization. However, they do not perform well on data analysis tasks: complex data structures such as a DataFrame are difficult to represent directly in text, and a text-only agent cannot flexibly meet the needs of different users.

To solve these problems, researchers at Microsoft proposed a "code-first" approach and developed TaskWeaver, a code-first agent built on ChatGPT models (GPT-3.5 and above; GPT-4 is recommended).

TaskWeaver converts a user's natural-language request into Python code that runs in the background. The generated code can call plug-in functions as needed to complete specialized tasks such as data reading, analysis, and model training.

Source code: https://github.com/microsoft/TaskWeaver

Paper: https://arxiv.org/abs/2311.17541

To put it simply, TaskWeaver lets people without programming skills carry out tasks that normally require writing code. Take a data analysis project as an example: it would normally mean writing a program that retrieves the data from a database and checks it for outliers.

Even if you don't know how to program at all, with the TaskWeaver framework you only need to describe your intent in plain text, and the agent writes the tedious code and generates the visualizations for you.
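To make the example concrete, here is a minimal sketch of the kind of Python an agent might produce for a request like "pull the sales data and flag outliers". It is not output from TaskWeaver: the `sales` table, the `find_outliers` helper, and the z-score threshold are all assumptions made for illustration, and an in-memory SQLite database stands in for the user's real database.

```python
import sqlite3
import statistics


def find_outliers(conn, table, column, z_threshold=3.0):
    """Return (id, value) rows whose z-score exceeds z_threshold.

    Table and column names are interpolated directly, so this sketch
    assumes they come from trusted code, not from user input.
    """
    rows = conn.execute(f"SELECT id, {column} FROM {table}").fetchall()
    values = [value for _, value in rows]
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing can be an outlier
    return [(row_id, value) for row_id, value in rows
            if abs(value - mean) / stdev > z_threshold]


# Hypothetical in-memory table standing in for the user's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO sales (amount) VALUES (?)",
    [(v,) for v in [10, 11, 9, 10, 12, 10, 11, 9, 10, 500]],
)

outliers = find_outliers(conn, "sales", "amount", z_threshold=2.5)
print(outliers)
```

With only ten rows, a single extreme value cannot reach a z-score above 3, which is why the sketch passes a lower threshold of 2.5.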

Planner

First, the user's data analysis request is sent to TaskWeaver's planner module, which decomposes it. The planner acts like a commander-in-chief: its main job is to draw up an execution plan for the entire task.

Depending on how complex the request is, the planner breaks the task down into simple, straightforward sub-steps, such as fetching data from the database or drawing result charts. It then analyzes the logical dependencies between the steps and marks the execution order. The main process is as follows:

1) Receive the user's text query, use the model's own knowledge or supplied examples to generate an initial execution plan, and list the necessary subtask steps.

2) Optimize the initial plan: merge subtasks that depend on each other to reduce the number of calls and improve efficiency, then finalize the execution plan.

3) Iterate through the subtasks in the plan, send each one as a query to the code generator, and get back the code to be executed.

4) Observe the code execution result and, if necessary, revise the original plan or ask the user to provide more information.

5) Repeat steps 3 and 4 until all subtasks are completed. Finally, respond to the user's query in natural language.
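The five steps above can be sketched as a small control loop. This is only an illustration of the flow, not TaskWeaver's implementation: `make_plan`, `merge_dependent`, `generate_code`, and `execute` are hypothetical stand-ins, and the places where a real planner would call the language model are replaced with fixed return values.

```python
def make_plan(query):
    # Step 1: stand-in for the model call that drafts an initial plan.
    # Each entry is (description, depends_on_previous_step).
    return [
        ("load data from the database", False),
        ("check the data for outliers", True),
        ("report the findings", False),
    ]


def merge_dependent(steps):
    # Step 2: fold a step into its predecessor when it depends on it,
    # so both are handled in one code-generation call.
    merged = []
    for description, depends_on_previous in steps:
        if depends_on_previous and merged:
            merged[-1] = merged[-1] + "; then " + description
        else:
            merged.append(description)
    return merged


def generate_code(step):
    # Stand-in for the code generator.
    return f"# code for: {step}"


def execute(code):
    # Stand-in for the code executor; always succeeds in this sketch.
    return {"ok": True, "output": f"ran {code!r}"}


def run_planner(query, max_rounds=10):
    pending = merge_dependent(make_plan(query))
    done = []
    rounds = 0
    # Steps 3-5: loop until every subtask is complete (or we give up).
    while pending and rounds < max_rounds:
        rounds += 1
        result = execute(generate_code(pending[0]))
        if result["ok"]:
            done.append(pending.pop(0))
        # On failure, a real planner would revise the plan or ask the
        # user for more information; this sketch simply retries.
    return f"Completed {len(done)} step(s): " + " | ".join(done)


answer = run_planner("find outliers in the sales data")
print(answer)
```

The `max_rounds` cap is a design choice of the sketch: it keeps the loop from running forever if a step keeps failing.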

Code generator

Once the plan is complete, each sub-step is sent to the code generator in turn, and the code generator returns the code that carries out that step. The code generator is like an all-purpose "programmer".

Following the instructions issued by the planner, it automatically designs the execution logic and writes the code. To avoid reinventing the wheel, it also has built-in support for plug-ins, examples, code validation, and automatic error correction.

It also wraps common operations such as data reading and model training as plug-ins, so the generated code can call them directly.
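One way to picture this is a registry of named functions whose descriptions can be shown to the model and whose implementations the generated code can call. The sketch below is a plain-Python illustration of that idea and deliberately does not use TaskWeaver's own plug-in API; the `plugin` decorator, `PLUGINS` dictionary, and both example plug-ins are hypothetical.

```python
import csv
import io

# Hypothetical registry: plug-in name -> function and description.
PLUGINS = {}


def plugin(name, description):
    """Register a function so generated code can look it up by name."""
    def decorator(func):
        PLUGINS[name] = {"func": func, "description": description}
        return func
    return decorator


@plugin("read_csv_rows", "Parse CSV text into a list of dict rows.")
def read_csv_rows(text):
    return list(csv.DictReader(io.StringIO(text)))


@plugin("column_mean", "Compute the mean of a numeric column.")
def column_mean(rows, column):
    return sum(float(row[column]) for row in rows) / len(rows)


def describe_plugins():
    # Text like this could be placed in the prompt so the code
    # generator knows which functions are available.
    return "\n".join(
        f"{name}: {meta['description']}"
        for name, meta in sorted(PLUGINS.items())
    )


# What a generated snippet might do with the registered plug-ins.
rows = PLUGINS["read_csv_rows"]["func"]("id,amount\n1,10\n2,20\n")
mean = PLUGINS["column_mean"]["func"](rows, "amount")
print(describe_plugins())
print(mean)
```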

Code executor

Once the code is written, it is passed to the code executor module, which is primarily responsible for loading and running it. Plug-ins also come into play at this step, allowing external functions to be connected. The executor records the state of the process in detail, such as variable values, execution logs, and intermediate results, so that later rounds of the conversation can build on earlier ones.

If the code fails to execute, the error is reported back to the code generator, which automatically generates a corrected version.
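The two behaviors described here, state that persists between rounds and errors that are fed back for correction, can be sketched as follows. This is an assumption-laden illustration rather than TaskWeaver's executor: `CodeExecutor`, `fake_code_generator`, and `run_with_correction` are hypothetical names, and the generator is a stub that returns a typo first and the fixed code once it sees an error. Running untrusted code with `exec` is unsafe outside a sandbox.

```python
class CodeExecutor:
    """Runs code snippets in one shared namespace and keeps a log."""

    def __init__(self):
        self.namespace = {}  # variables survive across rounds
        self.log = []        # (status, detail) for every run

    def run(self, code):
        try:
            exec(code, self.namespace)
        except Exception as exc:
            error = f"{type(exc).__name__}: {exc}"
            self.log.append(("error", error))
            return False, error
        self.log.append(("ok", code))
        return True, None


def fake_code_generator(task, error=None):
    # Stand-in for the model: first attempt has a typo, and the
    # second attempt (made after seeing an error) is corrected.
    if error is None:
        return "total = sum(valeus)"
    return "total = sum(values)"


def run_with_correction(executor, task, max_attempts=3):
    error = None
    for attempt in range(1, max_attempts + 1):
        code = fake_code_generator(task, error)
        ok, error = executor.run(code)
        if ok:
            return attempt
    raise RuntimeError(f"gave up after {max_attempts} attempts: {error}")


executor = CodeExecutor()
executor.run("values = [1, 2, 3]")  # an earlier round defines state
attempts = run_with_correction(executor, "sum the values")
print(attempts, executor.namespace["total"])
```

Because `values` was defined in an earlier round and the namespace is shared, the corrected snippet can use it without reloading anything.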

After the code executor finishes a round, the results are sent back to the planner, which marks that sub-step as complete. The planner then triggers execution of the next sub-step, and the process repeats.
