update env and plugins info

Artemy 2024-05-27 12:34:37 +03:00
parent ea47bafb84
commit 97f52649e6
5 changed files with 93 additions and 13 deletions

@ -4,7 +4,9 @@ txtdot can be configured either with environment variables
or with the `.env` file in the working directory which has higher priority.
For sample config, see [`.env.example`](https://github.com/TxtDot/txtdot/blob/main/.env.example).
## HOST ## Server Settings
### HOST
Default: `0.0.0.0`
@ -12,28 +14,54 @@ Host where HTTP server should listen for connections.
Set it to `127.0.0.1` if your txtdot instance is behind reverse proxy,
`0.0.0.0` otherwise.
## PORT ### PORT
Default: `8080`
Port where HTTP server should listen for connections.
## REVERSE_PROXY ### Timeout
Default: `0`
Max response time in milliseconds. If it's reached, the request is aborted. If set to `0`, the timeout is disabled.
### REVERSE_PROXY
Default: `false`
Set it to `true` only if your txtdot instance runs behind reverse proxy.
Needed for processing X-Forwarded headers.
## PROXY_RES ## Proxy
### PROXY_RES
Default: `true`
Whether to allow proxying images, video, audio
and everything else through your txtdot instance.
## SWAGGER ### IMG_COMPRESS
Default: `true`
Whether to compress images through your txtdot instance.
## Documentation
### SWAGGER
Default: `false`
Whether to add `/doc` route for Swagger API docs.
## Third-party
### SEARX_URL
SearXNG base URL. If set, txtdot uses it for searching and adds a search form to the page, together with a `/search` route.
### WEBDER_URL
Webder base URL. If set, txtdot uses it to render web pages.
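
To make the defaults above concrete, here is a minimal sketch of how these settings could be read into a typed object. It is not txtdot's actual loader: `TxtdotConfig`, `loadConfig`, `parseBool` and `parseNumber` are hypothetical names, and the `TIMEOUT` variable name is assumed from the "Timeout" heading. The spread at the end illustrates the documented rule that `.env` values take priority over environment variables.

```ts
// Hypothetical config shape; field names are illustrative, not txtdot's own.
interface TxtdotConfig {
  host: string;
  port: number;
  timeout: number;
  reverseProxy: boolean;
  proxyRes: boolean;
  imgCompress: boolean;
  swagger: boolean;
  searxUrl: string | undefined;
  webderUrl: string | undefined;
}

type Env = Record<string, string | undefined>;

// Treat only the literal "true" (any case) as true; fall back when unset.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value === "") return fallback;
  return value.toLowerCase() === "true";
}

// Fall back when the value is unset or not a number.
function parseNumber(value: string | undefined, fallback: number): number {
  if (value === undefined || value === "") return fallback;
  const parsed = Number(value);
  return isNaN(parsed) ? fallback : parsed;
}

function loadConfig(env: Env): TxtdotConfig {
  return {
    host: env.HOST || "0.0.0.0",
    port: parseNumber(env.PORT, 8080),
    timeout: parseNumber(env.TIMEOUT, 0), // name assumed from the "Timeout" heading
    reverseProxy: parseBool(env.REVERSE_PROXY, false),
    proxyRes: parseBool(env.PROXY_RES, true),
    imgCompress: parseBool(env.IMG_COMPRESS, true),
    swagger: parseBool(env.SWAGGER, false),
    searxUrl: env.SEARX_URL || undefined,
    webderUrl: env.WEBDER_URL || undefined,
  };
}

// Spread order makes the `.env` values win over the process environment.
const fromEnvironment: Env = { PORT: "3000", SWAGGER: "true" };
const fromDotEnvFile: Env = { PORT: "8081", REVERSE_PROXY: "true" };
const config = loadConfig({ ...fromEnvironment, ...fromDotEnvFile });
```

In this sketch `config.port` comes from the `.env` value, while `SWAGGER` is still picked up from the environment because the file does not override it.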

@ -38,16 +38,68 @@ so the client does not need to download and execute heavy JS,
doesn't need to use an adblock.
Readability performs its work very well in most cases.
But not always. For example, check any StackOverflow page or Google search results.
So [artegoser](https://github.com/artegoser) wrote the basis of the code
keeping in mind that we'll extend txtdot with other _engines_.
For now, engines are functions taking a URL as a parameter,
returning an object that contains extracted HTML and plain text, page title and language.
The object is rendered with ejs template (or, in `/api/parse`, just sent as JSON).
If an `?engine=` parameter wasn't passed, but txtdot found
that a specific engine is assigned to the requested domain,
for example, `"stackoverflow.com": stackoverflow`, for example, `"stackoverflow.com": engines.StackOverflow`,
it uses that engine to process the URL.
Otherwise, the page is parsed with the engine assigned to `*` (it's Readability).
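
The selection rule described above can be sketched roughly as follows. This is an illustration, not txtdot's implementation: `engineByDomain` and `selectEngine` are hypothetical names, engines are represented by plain strings, and matching on the exact hostname is an assumption.

```ts
type EngineName = string;

// Hypothetical domain-to-engine table; "*" is the catch-all entry.
const engineByDomain: Record<string, EngineName> = {
  "stackoverflow.com": "StackOverflow",
  "*": "Readability",
};

function selectEngine(url: string, requested?: EngineName): EngineName {
  // An explicit `?engine=` value takes precedence over the domain table.
  if (requested) return requested;

  // Otherwise use the engine assigned to the domain, falling back to "*".
  const host = new URL(url).hostname;
  return engineByDomain[host] ?? engineByDomain["*"];
}

const forStackOverflow = selectEngine("https://stackoverflow.com/questions/1");
const forOtherSite = selectEngine("https://example.org/article");
const forced = selectEngine("https://stackoverflow.com/questions/1", "Readability");
```

Here `forStackOverflow` resolves through the domain entry, `forOtherSite` through the `*` entry, and `forced` skips the table entirely.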
### Plugins
Readability is good, but not always, so [artegoser](https://github.com/artegoser) wrote the basis of the code
keeping in mind that we'll extend txtdot with other _engines_.
Back then, engines were functions taking a URL as a parameter and
returning an object that contains the extracted HTML and plain text, page title and language.
The object is rendered with an ejs template (or, in `/api/parse`, just sent as JSON).
But after a while this became unwieldy, so we decided to create a monorepo. We created the `Engine` and `Middleware` classes that handle the necessary parts. Now you can create such functions for different domains and routes. We also added JSX support to simplify plugin code.
## Engines
Creating an engine is easy.
```ts
import { Engine, Route } from "@txtdot/sdk";

const Readability = new Engine(
  "Readability", // Name of the engine
  "Engine for parsing content with Readability", // Description
  ["*"] // Domains that use this engine
);

Readability.route("*path", async (input, ro: Route<{ path: string }>) => {
  // ...

  // If any of the parameters except content is empty, txtdot will try to extract it from the page automatically
  return {
    content: parsed.content,
    title: parsed.title,
    lang: parsed.lang,
  };
});
```
## Middlewares
Creating a middleware is similar to creating an engine.
```tsx
import { Middleware } from "@txtdot/sdk";

const Highlight = new Middleware(
  "Highlight",
  "Highlights code with highlight.js only when needed",
  ["*"]
);

Highlight.use(async (input, ro, out) => {
  if (out.content.indexOf("<code") !== -1)
    return {
      ...out,
      content: <Highlighter content={out.content} />,
    };

  return out;
});
```
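
As a rough mental model of what a middleware does to an engine's output, here is a self-contained sketch that does not use `@txtdot/sdk`. Everything in it is an assumption made for illustration: the `Output` shape, the `MiddlewareFn` type, the `applyMiddlewares` helper, and the idea that middlewares run one after another, each receiving the previous result. A plain string wrapper stands in for the JSX `Highlighter` component.

```ts
// Hypothetical output shape, modelled on the fields an engine returns.
interface Output {
  content: string;
  title: string;
  lang: string;
}

// Hypothetical middleware signature: takes the current output, returns a new one.
type MiddlewareFn = (out: Output) => Promise<Output>;

// Only touches the content when it contains a <code> tag, like Highlight above.
const markCode: MiddlewareFn = async (out) => {
  if (out.content.indexOf("<code") !== -1)
    return {
      ...out,
      content: `<div class="highlighted">${out.content}</div>`,
    };

  return out;
};

// Assumed ordering: each middleware receives the result of the previous one.
async function applyMiddlewares(
  out: Output,
  middlewares: MiddlewareFn[]
): Promise<Output> {
  let current = out;
  for (const middleware of middlewares) current = await middleware(current);
  return current;
}

const withCode: Output = { content: "<code>x</code>", title: "t", lang: "en" };
const withoutCode: Output = { content: "<p>x</p>", title: "t", lang: "en" };
const processed = Promise.all([
  applyMiddlewares(withCode, [markCode]),
  applyMiddlewares(withoutCode, [markCode]),
]);
```

Returning the untouched `out` when there is nothing to do keeps pages without code free of extra markup, which matches the "only when needed" description of `Highlight`.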