From 4211747fa409ae413e39f3b77aaee3058675c80b Mon Sep 17 00:00:00 2001
From: DarkCat09 <50486086+DarkCat09@users.noreply.github.com>
Date: Thu, 31 Aug 2023 07:14:19 +0000
Subject: [PATCH] =?UTF-8?q?Deploying=20to=20gh-pages=20from=20@=20TxtDot/d?=
 =?UTF-8?q?ocumentation@eb689aeb7b600715083ec74b8f58a99e45223e8c=20?=
 =?UTF-8?q?=F0=9F=9A=80?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 404.html                 |  43 ++++
 index.html               |  50 +++-
 search/search_index.json |   2 +-
 selfhost/index.html      | 536 +++++++++++++++++++++++++++++++++++++++
 sitemap.xml.gz           | Bin 127 -> 127 bytes
 5 files changed, 629 insertions(+), 2 deletions(-)
 create mode 100644 selfhost/index.html

diff --git a/404.html b/404.html
index 7a9e424..1f5a9e4 100644
--- a/404.html
+++ b/404.html
@@ -175,6 +175,18 @@
+
+ +
+ + +
+
+ txtdot/txtdot +
+
+
+ @@ -209,6 +221,18 @@ txtdot +
+ +
+ + +
+
+ txtdot/txtdot +
+
+
+
diff --git a/index.html b/index.html
index 768f153..1d85b96 100644
--- a/index.html
+++ b/index.html
@@ -10,6 +10,8 @@
+
+
@@ -180,6 +182,18 @@
+
+ +
+ + +
+
+ txtdot/txtdot +
+
+
+ @@ -214,6 +228,18 @@ txtdot +
+ +
+ + +
+
+ txtdot/txtdot +
+
+
+ @@ -357,6 +402,9 @@ + + +

Getting Started

What is this

@@ -365,7 +413,7 @@ extracts only useful data including text, links, pictures and tables, and returns it as an HTML page with a minimalistic design optimized for text reading.

txtdot increases the loading speed and reduces client's bandwidth usage -since no unnecessary code and no scripts are transfered. +since no unnecessary code and no scripts are transferred. Also, you won't see any advertisement (unless it's a static picture that is hard to detect as ads). There are no trackers too.

How to use it

diff --git a/search/search_index.json b/search/search_index.json
index e8188e8..0c4fede 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Getting Started","text":""},{"location":"#what-is-this","title":"What is this","text":"

txtdot is a proxy that requests the page by the given URL, extracts only useful data including text, links, pictures and tables, and returns it as an HTML page with a minimalistic design optimized for text reading.

txtdot increases the loading speed and reduces client's bandwidth usage since no unnecessary code and no scripts are transfered. Also, you won't see any advertisement (unless it's a static picture that is hard to detect as ads). There are no trackers too.

"},{"location":"#how-to-use-it","title":"How to use it","text":"

txtdot is an open source software, so everyone can host it on his own server. The official instance is txt.dc09.ru, the list of all instances is here.

On the main page, there's a handy form where you can specify a URL, choose an engine and a format for parsed data. On the /get page, \"Home\" button returns you to /, \"Original page\" opens the entered URL in the same window without txtdot proxy.

The latest docs for API endpoints can be found here. For handy JSON API, use /api/parse returning an engine result object (see below). For pure HTML response, use /api/raw-html. Note that both API and browser endpoints on txt.dc09.ru are ratelimited to 2 requests per second.

"},{"location":"#how-it-works","title":"How it works","text":"

This project exists thanks to great Mozilla's Readability.js library. The initial idea was to process HTML with it on the server so the client does not need to download and execute heavy JS, doesn't need to use an adblock.

Readability performs its work very well in most cases. But not always. For example, check any StackOverflow page or Google search results.

So artegoser wrote the basis of the code keeping in mind that we'll extend txtdot with other engines. For now, engines are functions taking a URL as a parameter, returning an object that contains extracted HTML and plain text, page title and language. The object is rendered with ejs template (or, in /api/parse, just sent as JSON).

If an ?engine= parameter wasn't passed, but txtdot found that a specific engine is assigned to the requested domain, for example, \"stackoverflow.com\": stackoverflow, it uses that engine to process the URL. Otherwise, the page is parsed with the engine assigned to * (it's Readability).

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Getting Started","text":""},{"location":"#what-is-this","title":"What is this","text":"

txtdot is a proxy that requests the page by the given URL, extracts only useful data including text, links, pictures and tables, and returns it as an HTML page with a minimalistic design optimized for text reading.

txtdot increases the loading speed and reduces client's bandwidth usage since no unnecessary code and no scripts are transferred. Also, you won't see any advertisement (unless it's a static picture that is hard to detect as ads). There are no trackers too.

"},{"location":"#how-to-use-it","title":"How to use it","text":"

txtdot is open-source software, so anyone can host it on their own server. The official instance is txt.dc09.ru; the list of all instances is here.

On the main page, there's a handy form where you can specify a URL and choose an engine and a format for the parsed data. On the /get page, the \"Home\" button returns you to /, and \"Original page\" opens the entered URL in the same window without the txtdot proxy.

The latest docs for API endpoints can be found here. For a JSON API, use /api/parse, which returns an engine result object (see below). For a pure HTML response, use /api/raw-html. Note that both the API and browser endpoints on txt.dc09.ru are rate-limited to 2 requests per second.

"},{"location":"#how-it-works","title":"How it works","text":"

This project exists thanks to Mozilla's great Readability.js library. The initial idea was to process HTML with it on the server, so the client does not need to download and execute heavy JS or use an ad blocker.

Readability performs its work very well in most cases. But not always. For example, check any StackOverflow page or Google search results.

So artegoser wrote the basis of the code keeping in mind that we'll extend txtdot with other engines. For now, engines are functions that take a URL as a parameter and return an object containing the extracted HTML and plain text, the page title and the language. The object is rendered with an ejs template (or, in /api/parse, just sent as JSON).

If an ?engine= parameter wasn't passed, but txtdot found that a specific engine is assigned to the requested domain, for example, \"stackoverflow.com\": stackoverflow, it uses that engine to process the URL. Otherwise, the page is parsed with the engine assigned to * (it's Readability).

"},{"location":"selfhost/","title":"Self-Hosting","text":""},{"location":"selfhost/#without-docker","title":"Without Docker","text":"

Install Node and NPM:

# Debian, Ubuntu\nsudo apt install nodejs npm\n# CentOS\nsudo yum install nodejs\n# Arch\nsudo pacman -S nodejs npm\n# Alpine\ndoas apk add nodejs npm\n

Create a user for txtdot, log in:

# Not Alpine (coreutils)\nsudo useradd -r -m -s /sbin/nologin -U txtdot\nsudo -u txtdot -i\n\n# Alpine (busybox)\ndoas addgroup -S txtdot\ndoas adduser -h /home/txtdot -s /sbin/nologin -G txtdot -S -D txtdot\ndoas -u txtdot bash\n

Clone the repo:

git clone https://github.com/txtdot/txtdot.git src\n

Install packages, compile TS:

cd src\nnpm install\nnpm run build\n

Manually start the server to check if it works (Ctrl+C to exit):

npm run start\n

Log out from txtdot account: exit

"},{"location":"selfhost/#add-txtdot-to-autostart","title":"Add txtdot to autostart","text":"

Either using a systemd unit file:

wget https://github.com/TxtDot/txtdot/raw/main/txtdot.service\nsudo chown root:root txtdot.service\nsudo chmod 755 txtdot.service\nsudo mv txtdot.service /etc/systemd/system/\nsudo systemctl daemon-reload\nsudo systemctl enable txtdot\nsudo systemctl start txtdot\n

Or using an OpenRC script:

wget -O txtdot https://github.com/TxtDot/txtdot/raw/main/txtdot.init\ndoas chown root:root txtdot\ndoas chmod 755 txtdot\ndoas mv txtdot /etc/init.d/\ndoas rc-update add txtdot\ndoas rc-service txtdot start\n

Or using crontab:

sudo crontab -u txtdot -e\n# The command will open an editor\n# Add this line to the end of the file:\n@reboot sleep 10 && cd /home/txtdot/src && npm run start\n# Save the file and exit\n
"},{"location":"selfhost/#with-docker","title":"With Docker","text":"

Docker Engine and Docker Compose are required.

Note that built images are not provided via Docker Hub. If you can't or don't want to build them on your server and don't want to set up a CI/CD system, let us know, and we'll consider setting up a GitHub Actions workflow.

git clone https://github.com/txtdot/txtdot.git\ncd txtdot\ndocker compose build\ndocker compose up -d\n
"}]} \ No newline at end of file diff --git a/selfhost/index.html b/selfhost/index.html new file mode 100644 index 0000000..b8fe167 --- /dev/null +++ b/selfhost/index.html @@ -0,0 +1,536 @@ + + + + + + + + + + + + + + + + + + + + + Self-Hosting - txtdot + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + Skip to content + + +
+
+ +
+ + + + + + +
+ + +
+ +
+ + + + + + +
+
+ + + +
+
+
+ + + + + +
+
+
+ + + +
+
+
+ + + +
+
+
+ + + +
+
+ + + + + + + +

Self-Hosting

+

Without Docker

+

Install Node and NPM:

+
# Debian, Ubuntu
+sudo apt install nodejs npm
+# CentOS
+sudo yum install nodejs
+# Arch
+sudo pacman -S nodejs npm
+# Alpine
+doas apk add nodejs npm
+
+

Create a user for txtdot, log in:

+
# Not Alpine (coreutils)
+sudo useradd -r -m -s /sbin/nologin -U txtdot
+sudo -u txtdot -i
+
+# Alpine (busybox)
+doas addgroup -S txtdot
+doas adduser -h /home/txtdot -s /sbin/nologin -G txtdot -S -D txtdot
+doas -u txtdot bash
+
+

Clone the repo:

+
git clone https://github.com/txtdot/txtdot.git src
+
+

Install packages, compile TS:

+
cd src
+npm install
+npm run build
+
+

Manually start the server to check if it works (Ctrl+C to exit):

+
npm run start
+
+

Log out from txtdot account: exit

+
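To confirm that a manually started instance is answering, a small check can be run from a second terminal before stopping the server. This is only a sketch: the default address `127.0.0.1:8080` and the helper names `txtdot_base_url` and `txtdot_is_up` are assumptions, not part of txtdot, so adjust the host and port to your configuration.

```sh
#!/bin/sh
# Hypothetical helpers, not shipped with txtdot.

# Print the base URL of a local instance.
# Host and port default to 127.0.0.1:8080 (an assumption; adjust as needed).
txtdot_base_url() {
    printf 'http://%s:%s/\n' "${1:-127.0.0.1}" "${2:-8080}"
}

# Succeed only if the main page answers without an HTTP error within 5 seconds.
txtdot_is_up() {
    curl --silent --fail --output /dev/null --max-time 5 "$(txtdot_base_url "$@")"
}

# Usage (while `npm run start` is running in another terminal):
#   txtdot_is_up && echo "txtdot is reachable"
```

Passing a host and port, as in `txtdot_is_up localhost 3000`, overrides the assumed defaults.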

Add txtdot to autostart

+

Either using a systemd unit file:

+
wget https://github.com/TxtDot/txtdot/raw/main/txtdot.service
+sudo chown root:root txtdot.service
+sudo chmod 755 txtdot.service
+sudo mv txtdot.service /etc/systemd/system/
+sudo systemctl daemon-reload
+sudo systemctl enable txtdot
+sudo systemctl start txtdot
+
+

Or using an OpenRC script:

+
wget -O txtdot https://github.com/TxtDot/txtdot/raw/main/txtdot.init
+doas chown root:root txtdot
+doas chmod 755 txtdot
+doas mv txtdot /etc/init.d/
+doas rc-update add txtdot
+doas rc-service txtdot start
+
+

Or using crontab:

+
sudo crontab -u txtdot -e
+# The command will open an editor
+# Add this line to the end of the file:
+@reboot sleep 10 && cd /home/txtdot/src && npm run start
+# Save the file and exit
+
+
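The `wget` commands above download a file from GitHub. If a URL points at the web view of a file rather than its raw contents, the result is an HTML page that neither systemd nor OpenRC can use. A quick sanity check before moving the downloaded file into place might look like the sketch below. The function name `looks_like_html` is hypothetical, and inspecting only the first 20 lines is an arbitrary choice.

```sh
#!/bin/sh
# Hypothetical helper, not part of txtdot.

# Succeed if the first 20 lines of the file contain an HTML doctype or <html> tag,
# which suggests a web page was saved instead of the raw file.
looks_like_html() {
    head -n 20 "$1" | grep -qiE '<!doctype html|<html'
}

# Usage:
#   looks_like_html txtdot.service && echo "Downloaded an HTML page, check the URL"
```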

With Docker

+

Docker Engine and Docker Compose are required.

+

Note that built images are not provided via Docker Hub. +If you can't or don't want to build them on your server +and don't want to set up a CI/CD system, +let us know, +and we'll consider setting up a GitHub Actions workflow.

+
git clone https://github.com/txtdot/txtdot.git
+cd txtdot
+docker compose build
+docker compose up -d
+
+ + + + + + +
+
+ + +
+ +
+ + + +
+
+
+
+ + + + + + + + +
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 922e62edf904901d6cd8484ab0b6e4d2f57feb90..d6e5a0d3d0ccc699cf5ae55a03d67cc62a2cd4e6 100644
GIT binary patch
delta 13
Ucmb=gXP58h;5cRbVIq4403SF6#{d8T

delta 13
Ucmb=gXP58h;85azKasrx02)dI>;M1&