xlog-Archivist: Making it easier to add articles to the knowledge base

May 20, 2023 #Ai #Tutorial #Technology

This post was translated from Chinese into English by AI.

The previous article showed how to add blog content to the knowledge base so that the ChatGPT bot can answer based on the articles. However, copying article content by hand is slow and inconvenient.

Previous article: Embedding ChatGPT on a blog to read articles

To solve this problem, I developed a project called xlog-Archivist.
xlog-Archivist is a tool that automatically crawls article content and URLs from xlog-based blogs. With it, we no longer need to copy articles by hand: it retrieves the article content and exports it in JSON format, which makes it easy to migrate blog content into the ChatGPT knowledge base. This greatly simplifies knowledge acquisition, so the ChatGPT bot can learn from fresh content more quickly.
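
To give a rough idea of what "export it in JSON format" can look like, here is a minimal sketch. It is not the actual xlog-Archivist code. The function name `export_articles`, the dictionary keys `title`, `url`, and `content`, and the numbered file names are all assumptions made for illustration.

```python
import json
from pathlib import Path


def export_articles(articles, out_dir="articles"):
    """Write each article to its own JSON file and collect the URLs in url.txt.

    `articles` is a list of dicts with the hypothetical keys
    "title", "url" and "content".
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    urls = []
    for index, article in enumerate(articles, start=1):
        # Numbered file names sidestep characters that are not valid in paths.
        path = out / f"{index:03d}.json"
        path.write_text(
            json.dumps(article, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        urls.append(article["url"])
    # One URL per line, in the same order as the numbered files.
    (out / "url.txt").write_text("".join(f"{u}\n" for u in urls), encoding="utf-8")
    return len(urls)
```

Passing `ensure_ascii=False` keeps non-English article text readable in the exported files instead of turning it into escape sequences.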

Usage

Clone the project to your local machine or download the zip file
```
git clone https://github.com/endercatone/xlog-Archivist.git
```
Install dependencies
```
pip install requests
```
Run the program
```
python main.py
```

You should now find the exported articles and url.txt in the articles directory.
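
To check the result before uploading it to the knowledge base, a short script can read the output back. This is only a sketch under assumptions: it assumes each article is a standalone `.json` file and that url.txt holds one URL per line. The real output layout may differ, and `load_export` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_export(out_dir="articles"):
    """Return (articles, urls) read back from the export directory."""
    out = Path(out_dir)
    # Assumption: every exported article is a standalone .json file.
    articles = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(out.glob("*.json"))
    ]
    url_file = out / "url.txt"
    # Assumption: url.txt holds one URL per line.
    urls = url_file.read_text(encoding="utf-8").split() if url_file.exists() else []
    return articles, urls
```

Calling `load_export()` from the project directory and comparing `len(articles)` with `len(urls)` is a quick way to spot a missing article.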

After the first run, your blog URL is saved in the configuration file, so you won't need to enter it again.
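
The "ask once, then remember" behavior can be sketched roughly as follows. The file name `config.json`, the key `blog_url`, and the function `get_blog_url` are assumptions for illustration. The project's real configuration file may use a different name and format.

```python
import json
from pathlib import Path


def get_blog_url(config_path="config.json", ask=input):
    """Return the saved blog URL, asking for it only when none is stored yet."""
    config_file = Path(config_path)
    if config_file.exists():
        config = json.loads(config_file.read_text(encoding="utf-8"))
        if config.get("blog_url"):
            return config["blog_url"]
    # First run: ask once, then persist the answer for later runs.
    blog_url = ask("Blog URL: ").strip()
    config_file.write_text(
        json.dumps({"blog_url": blog_url}, indent=2),
        encoding="utf-8",
    )
    return blog_url
```

The `ask` parameter defaults to `input`, so the function prompts on the command line in normal use but can be driven by a stub when testing.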