Add Entity Parser plugin docs #1121

Merged
merged 53 commits into from
Jan 16, 2025
Commits
1c0f555
Add grammy-entity plugin
quadratz Sep 9, 2024
e8abe7d
Fix fmt
quadratz Sep 9, 2024
8a46283
Fix typo
quadratz Sep 9, 2024
8670569
Merge branch 'main' into grammy-entity
KnorpelSenf Sep 11, 2024
97fb68b
update docs
quadratz Sep 17, 2024
ec4a786
Merge branch 'main' into grammy-entity
quadratz Sep 18, 2024
af7be6d
update plugin list
quadratz Sep 18, 2024
56a6e47
remove the old file
quadratz Sep 18, 2024
ab91a75
fix formatting
quadratz Sep 18, 2024
1f4bbda
Rename plugin name to `entity-parser`
quadratz Sep 22, 2024
63b72f5
Fix typo entity type `expandableBlockquote`
quadratz Sep 22, 2024
26e163f
Apply suggestions from code review
quadratz Sep 23, 2024
47ae41a
fix fmt
quadratz Sep 23, 2024
e6014ef
fix typo
quadratz Sep 23, 2024
1f889b9
Apply suggestions from code review
quadratz Sep 23, 2024
cd83a90
Apply suggestions from code review
quadratz Sep 23, 2024
8170e59
Sync changes to Indonesian
quadratz Oct 14, 2024
af2832f
Revert "Sync changes to Indonesian"
quadratz Oct 14, 2024
92942ee
Merge branch 'main' into grammy-entity
KnorpelSenf Oct 14, 2024
df69cc5
Merge branch 'main' into grammy-entity
rojvv Oct 15, 2024
17c66a7
Merge branch 'main' into grammy-entity
KnorpelSenf Oct 15, 2024
c69a098
Update site/docs/plugins/entity-parser.md
quadratz Oct 16, 2024
300c5b6
Sync deno 2.0
quadratz Oct 16, 2024
d91166a
Merge branch 'main' into grammy-entity
KnorpelSenf Nov 3, 2024
e9a5025
Merge branch 'main' into grammy-entity
LWJerri Nov 4, 2024
bff39af
Merge branch 'main' into grammy-entity
KnorpelSenf Nov 21, 2024
572984f
Merge branch 'main' into grammy-entity
KnorpelSenf Nov 21, 2024
f0e2f62
sync Russian
MasedMSD Nov 22, 2024
e29a5aa
sync to Ukrainian
niusia-ua Nov 22, 2024
43fd83f
fix fmt errors
MasedMSD Nov 22, 2024
e09e84f
again
MasedMSD Nov 22, 2024
5aaab5d
Update site/docs/uk/plugins/entity-parser.md
LWJerri Nov 22, 2024
4c41245
en: fix formatting
quadratz Nov 22, 2024
d7377d1
sync commit 5740978 to all languages
quadratz Nov 22, 2024
24c5dec
sync changes to indonesian
quadratz Nov 23, 2024
fd584ec
id: fix fmt
quadratz Nov 23, 2024
b1faf0e
little fix
MasedMSD Nov 23, 2024
eb6bc11
fix fmt
MasedMSD Nov 23, 2024
b0b1300
ru: changed order
MasedMSD Dec 3, 2024
ad8ae5c
Add spanish
habemuscode Dec 6, 2024
56a846f
Apply formatter
habemuscode Dec 6, 2024
124f161
Merge branch 'main' into grammy-entity
habemuscode Dec 6, 2024
3e4d055
Apply suggestions from code review
niusia-ua Dec 16, 2024
465daa2
Update site/docs/uk/plugins/entity-parser.md
LWJerri Dec 17, 2024
73cc355
Merge branch 'main' into grammy-entity
KnorpelSenf Dec 21, 2024
b6ff110
Merge branch 'main' into grammy-entity
LWJerri Dec 24, 2024
ce3b123
Merge branch 'main' into grammy-entity
KnorpelSenf Dec 24, 2024
b685ae1
Merge branch 'main' into grammy-entity
MasedMSD Jan 5, 2025
9aa45c9
Merge branch 'main' into grammy-entity
KnorpelSenf Jan 12, 2025
79e5b07
Merge branch 'main' into grammy-entity
KnorpelSenf Jan 13, 2025
b80ac35
Grammy entity zh (#1178)
agoudbg Jan 15, 2025
f9982d7
[Zh] fix "or" translation (#1179)
agoudbg Jan 16, 2025
4c556b7
duzen statt siezen
Jan 16, 2025
4 changes: 4 additions & 0 deletions site/docs/.vitepress/configs/locales/en.ts
@@ -261,6 +261,10 @@ const pluginThirdparty = {
text: "Autoquote",
link: "/plugins/autoquote",
},
{
text: "Entity Parser",
link: "/plugins/entity-parser",
},
{
text: "[Submit your PR!]",
link: "/plugins/#create-your-own-plugins",
249 changes: 249 additions & 0 deletions site/docs/plugins/entity-parser.md
@@ -0,0 +1,249 @@
---
prev: false
next: false
---

# Entity Parser (`entity-parser`)

Converts [Telegram entities](https://core.telegram.org/bots/api#messageentity)
to semantic HTML.

## When Should I Use This?

Probably NEVER!

While this plugin can generate HTML, it's generally best to send the text and entities back to Telegram.

Converting them to HTML is only necessary in rare cases where you need to use Telegram-formatted text **outside** of Telegram itself, such as displaying Telegram messages on a website.

See the [_Cases When It's Better to Not Use This Package_](#cases-when-it-s-better-to-not-use-this-package) section to determine if you have a similar problem to solve.

If you're unsure whether this plugin is the right fit for your use case, please don't hesitate to ask in our [Telegram group](https://t.me/grammyjs).
In most cases, people find they don't actually need this plugin to solve their problems!

## Installation

Run the following command in your terminal based on your runtime or package manager:

::: code-group

```sh:no-line-numbers [Deno]
deno add @qz/telegram-entities-parser
```

```sh:no-line-numbers [Bun]
bunx jsr add @qz/telegram-entities-parser
```

```sh:no-line-numbers [pnpm]
pnpm dlx jsr add @qz/telegram-entities-parser
```

```sh:no-line-numbers [Yarn]
yarn dlx jsr add @qz/telegram-entities-parser
```

```sh:no-line-numbers [npm]
npx jsr add @qz/telegram-entities-parser
```

:::

## Simple Usage

Using this plugin is straightforward.
Here's a quick example:

```ts
import { EntitiesParser } from "@qz/telegram-entities-parser";
import type { Message } from "@qz/telegram-entities-parser/types";

// For better performance, create the instance outside the function.
const entitiesParser = new EntitiesParser();
const parse = (message: Message) => entitiesParser.parse({ message });

bot.on(":text", (ctx) => {
  const html = parse(ctx.msg); // Convert text to HTML string
});

bot.on(":photo", (ctx) => {
  const html = parse(ctx.msg); // Convert caption to HTML string
});
```

Review comment (Collaborator) on the `:photo` handler:

ref: https://t.me/grammyjs/283562

```ts
const bot = new Bot("");

const entitiesParser = new EntitiesParser();
const parse = (message: Message) => entitiesParser.parse({ message });

bot.on(":photo", (ctx) => {
  console.log(parse(ctx.msg)); // Convert text to an HTML string.
});

bot.start();
```

This code will fail if I try to send an image without a caption.

Reply (Contributor Author):

Ouch! It shouldn’t error out if there’s no text or caption. I’ll check it out. Thanks for the heads-up.
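
A photo sent without a caption carries neither `text` nor `caption`, so there is nothing to convert. One way to guard against that before calling the parser is sketched below. `MinimalEntity`, `MinimalMessage`, and `pickTextAndEntities` are hypothetical names introduced for this illustration; they are not part of the plugin, and the fields only mirror the Bot API `Message` object.

```ts
// Hypothetical, trimmed-down view of a Telegram message (not the plugin's own type).
interface MinimalEntity {
  type: string;
  offset: number;
  length: number;
}

interface MinimalMessage {
  text?: string;
  entities?: MinimalEntity[];
  caption?: string;
  caption_entities?: MinimalEntity[];
}

// Returns the text and entities the message actually carries:
// `text`/`entities` for text messages, `caption`/`caption_entities` for media.
// Returns undefined when there is nothing to convert,
// for example a photo without a caption.
function pickTextAndEntities(
  message: MinimalMessage,
): { text: string; entities: MinimalEntity[] } | undefined {
  if (message.text !== undefined) {
    return { text: message.text, entities: message.entities ?? [] };
  }
  if (message.caption !== undefined) {
    return { text: message.caption, entities: message.caption_entities ?? [] };
  }
  return undefined;
}
```

In a handler, this would mean calling `parse(ctx.msg)` only when `pickTextAndEntities(ctx.msg)` does not return `undefined`.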

## Advanced Usage

### Customize the Output HTML Tag

This package converts entities into semantic HTML, adhering to best practices and standards as closely as possible.
However, you might find that the provided output is not exactly what you expected.

To address this, you can use your own `renderer` to customize the HTML elements surrounding the text according to your rules.
You can modify specific rules by extending the default [`RendererHtml`](https://github.com/quadratz/telegram-entities-parser/blob/main/src/renderers/renderer_html.ts), or override all the rules by implementing the [`Renderer`](https://github.com/quadratz/telegram-entities-parser/blob/main/src/renderers/renderer.ts) yourself.

To extend the existing `renderer`, do the following:

```ts
import { EntitiesParser, RendererHtml } from "@qz/telegram-entities-parser";
import type {
  CommonEntity,
  RendererOutput,
} from "@qz/telegram-entities-parser/types";

// Change the rule for bold type entity,
// but leave the rest of the types as defined by `RendererHtml`.
class MyRenderer extends RendererHtml {
  // `override` marks this as replacing the inherited rule
  // (required when `noImplicitOverride` is enabled, as in Deno's defaults).
  override bold(
    options: { text: string; entity: CommonEntity },
  ): RendererOutput {
    return {
      prefix: '<strong class="tg-bold">',
      suffix: "</strong>",
    };
  }
}

const entitiesParser = new EntitiesParser({ renderer: new MyRenderer() });
```

The `options` parameter accepts an object with `text` and `entity`.

- `text`: The specific text that the current entity refers to.
- `entity`: This may be represented by various interfaces depending on the entity type, such as `CommonEntity`, `CustomEmojiEntity`, `PreEntity`, `TextLinkEntity`, or `TextMentionEntity`.
For instance, the `bold` type has an entity with the `CommonEntity` interface, while the `text_link` type may have an entity with the `TextLinkEntity` interface, as it includes additional properties like `url`.

Here is the full list of interfaces and the output for each entity type:

| Entity Type | Interface | Result |
| ---------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `blockquote` | `CommonEntity` | `<blockquote class="tg-blockquote"> ... </blockquote>` |
| `bold` | `CommonEntity` | `<b class="tg-bold"> ... </b>` |
| `bot_command` | `CommonEntity` | `<span class="tg-bot-command"> ... </span>` |
| `cashtag` | `CommonEntity` | `<span class="tg-cashtag"> ... </span>` |
| `code` | `CommonEntity` | `<code class="tg-code"> ... </code>` |
| `custom_emoji` | `CustomEmojiEntity` | `<span class="tg-custom-emoji" data-custom-emoji-id="${options.entity.custom_emoji_id}"> ... </span>` |
| `email` | `CommonEntity` | `<a class="tg-email" href="mailto:${options.text}"> ... </a>` |
| `expandable_blockquote` | `CommonEntity` | `<blockquote class="tg-expandable-blockquote"> ... </blockquote>` |
| `hashtag` | `CommonEntity` | `<span class="tg-hashtag"> ... </span>` |
| `italic` | `CommonEntity` | `<i class="tg-italic"> ... </i>` |
| `mention` | `CommonEntity` | `<a class="tg-mention" href="https://t.me/${username}"> ... </a>` |
| `phone_number` | `CommonEntity` | `<a class="tg-phone-number" href="tel:${options.text}"> ... </a>` |
| `pre` | `PreEntity` | `<pre class="tg-pre-code"><code class="language-${options.entity.language}"> ... </code></pre>` or `<pre class="tg-pre"> ... </pre>` |
| `spoiler` | `CommonEntity` | `<span class="tg-spoiler"> ... </span>` |
| `strikethrough` | `CommonEntity` | `<del class="tg-strikethrough"> ... </del>` |
| `text_link` | `TextLinkEntity` | `<a class="tg-text-link" href="${options.entity.url}"> ... </a>` |
| `text_mention` | `TextMentionEntity` | `<a class="tg-text-mention" href="https://t.me/${options.entity.user.username}"> ... </a>` or `<a class="tg-text-mention" href="tg://user?id=${options.entity.user.id}"> ... </a>` |
| `underline` | `CommonEntity` | `<span class="tg-underline"> ... </span>` |
| `url` | `CommonEntity` | `<a class="tg-url" href="${options.text}"> ... </a>` |

If you are unsure which interface is correct, refer to how the [Renderer](https://github.com/quadratz/telegram-entities-parser/blob/main/src/renderers/renderer.ts) or [RendererHtml](https://github.com/quadratz/telegram-entities-parser/blob/main/src/renderers/renderer_html.ts) is implemented.
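
To make the `prefix`/`suffix` contract more concrete, here is a minimal, self-contained sketch of how such a pair could wrap the text an entity refers to. The names `Output`, `LinkEntity`, `textLink`, and `wrap` are hypothetical and only mimic the shapes described above. The real parser also deals with nested entities and sanitization, which this sketch leaves out.

```ts
// Hypothetical stand-ins for the shapes described above (not imported from the plugin).
interface Output {
  prefix: string;
  suffix: string;
}

interface LinkEntity {
  type: "text_link";
  offset: number;
  length: number;
  url: string;
}

// A rule written in the style of the `text_link` row of the table.
function textLink(options: { text: string; entity: LinkEntity }): Output {
  return {
    prefix: `<a class="tg-text-link" href="${options.entity.url}">`,
    suffix: "</a>",
  };
}

// Wraps the slice of `text` covered by the entity with the rule's prefix and suffix.
// No escaping is done here; this only shows where prefix and suffix end up.
function wrap(text: string, entity: LinkEntity): string {
  const start = entity.offset;
  const end = entity.offset + entity.length;
  const covered = text.slice(start, end);
  const { prefix, suffix } = textLink({ text: covered, entity });
  return text.slice(0, start) + prefix + covered + suffix + text.slice(end);
}

const linkHtml = wrap("Visit grammY today", {
  type: "text_link",
  offset: 6,
  length: 6,
  url: "https://grammy.dev",
});
console.log(linkHtml);
// Visit <a class="tg-text-link" href="https://grammy.dev">grammY</a> today
```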

### Customize the Text Sanitizer

The output text is sanitized by default to ensure proper HTML rendering and prevent XSS vulnerabilities.

| Input | Output |
| ----- | -------- |
| `&` | `&amp;` |
| `<` | `&lt;` |
| `>` | `&gt;` |
| `"` | `&quot;` |
| `'` | `&#x27;` |

For example, the result `<b>Bold</b> & <i>Italic</i>` will be sanitized to `<b>Bold</b> &amp; <i>Italic</i>`.

You can override this behavior by specifying a `textSanitizer` when instantiating the [`EntitiesParser`](https://github.com/quadratz/telegram-entities-parser/blob/main/src/mod.ts):

- If you do not specify `textSanitizer`, it will default to using [`sanitizerHtml`](https://github.com/quadratz/telegram-entities-parser/blob/main/src/utils/sanitizer_html.ts) as the sanitizer.
- Setting the value to `false` will skip sanitization, keeping the output text as the original.
This is not recommended, as it may result in incorrect rendering and make your application vulnerable to XSS attacks.
Ensure proper handling if you choose this option.
- If you provide a function, it will be used instead of the default sanitizer.

Example:

```ts
import { EntitiesParser } from "@qz/telegram-entities-parser";
// Assumption: these types are exported from the `types` entry point,
// like the other types used in this guide.
import type {
  TextSanitizer,
  TextSanitizerOption,
} from "@qz/telegram-entities-parser/types";

const myTextSanitizer: TextSanitizer = (options: TextSanitizerOption): string =>
  // Replace dangerous characters.
  options.text.replace(/[&<>"']/g, (match) => {
    switch (match) {
      case "&":
        return "&amp;";
      case "<":
        return "&lt;";
      case ">":
        return "&gt;";
      case '"':
        return "&quot;";
      case "'":
        return "&#x27;";
      default:
        return match;
    }
  });

// Use the custom sanitizer.
const entitiesParser = new EntitiesParser({ textSanitizer: myTextSanitizer });
```
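
The three `textSanitizer` settings described in the list above (omitted, `false`, or a function) can be pictured with the following self-contained sketch. `Sanitizer`, `defaultSanitizer`, and `resolveSanitizer` are hypothetical names for illustration. This is only a model of the documented behavior, not the plugin's internal implementation.

```ts
// Hypothetical type: a sanitizer receives the text inside an options object.
type Sanitizer = (options: { text: string }) => string;

// Escapes the five characters from the table above.
// `&` is replaced first so the entities produced later are not escaped twice.
const defaultSanitizer: Sanitizer = ({ text }) =>
  text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");

// Models the three documented settings:
// undefined -> default sanitizer, false -> no sanitization, function -> custom sanitizer.
function resolveSanitizer(setting?: Sanitizer | false): Sanitizer {
  if (setting === undefined) return defaultSanitizer;
  if (setting === false) return ({ text }) => text;
  return setting;
}

const sampleInput = `Tom & "Jerry" <3`;
console.log(resolveSanitizer()({ text: sampleInput }));
// Tom &amp; &quot;Jerry&quot; &lt;3
console.log(resolveSanitizer(false)({ text: sampleInput }));
// Tom & "Jerry" <3
console.log(
  resolveSanitizer(({ text }) => text.toUpperCase())({ text: sampleInput }),
);
// TOM & "JERRY" <3
```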

## Cases When It's Better to Not Use This Package

If you face problems similar to those listed below, you might be able to resolve them without using this package.

### Copy and Forward the Same Message

Use [`forwardMessage`](https://core.telegram.org/bots/api#forwardmessage) to forward messages of any kind.

You can also use the [`copyMessage`](https://core.telegram.org/bots/api#copymessage) API, which performs the same action but does not include a link to the original message.
[`copyMessage`](https://core.telegram.org/bots/api#copymessage) behaves like copying the message and sending it back to Telegram, making it appear as a regular message rather than a forwarded one.

Example:

```ts
bot.on(":text", async (ctx) => {
  // The ID of the target chat.
  const chatId = -946659600;
  // Copy the current message; the copy has no link to the original message.
  await ctx.copyMessage(chatId);
  // Forward the current message with a link to the original message.
  await ctx.forwardMessage(chatId);
});
```

### Reply to Messages with Modified Text Format

You can easily reply to incoming messages using HTML, Markdown, or entities.

```ts
bot.on(":text", async (ctx) => {
  // Reply using HTML
  await ctx.reply("<b>bold</b> <i>italic</i>", { parse_mode: "HTML" });
  // Reply using Telegram Markdown V2
  await ctx.reply("*bold* _italic_", { parse_mode: "MarkdownV2" });
  // Reply with entities
  await ctx.reply("bold italic", {
    entities: [
      // "bold" covers 4 characters starting at offset 0.
      { offset: 0, length: 4, type: "bold" },
      // "italic" covers 6 characters starting at offset 5.
      { offset: 5, length: 6, type: "italic" },
    ],
  });
});
```
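
Entity offsets and lengths are counted in UTF-16 code units, which is also how JavaScript strings measure `length` and `indexOf`. A small helper can therefore compute them instead of counting by hand. `entityFor` is a hypothetical helper written for this illustration; it is not part of grammY.

```ts
// Hypothetical helper: builds an entity covering the first occurrence of `part` in `text`.
// JavaScript string indices are UTF-16 code units, matching Telegram's offsets.
function entityFor<T extends string>(text: string, part: string, type: T) {
  const offset = text.indexOf(part);
  if (offset === -1) {
    throw new Error(`"${part}" does not occur in "${text}"`);
  }
  return { offset, length: part.length, type };
}

const messageText = "bold italic";
const computedEntities = [
  entityFor(messageText, "bold", "bold"),
  entityFor(messageText, "italic", "italic"),
];
console.log(computedEntities);
// [
//   { offset: 0, length: 4, type: "bold" },
//   { offset: 5, length: 6, type: "italic" },
// ]
```

The resulting array has the same shape as the `entities` option passed to `ctx.reply`.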

::: tip Use parse-mode for a Better Formatting Experience

grammY also provides a useful plugin called [`parse-mode`](./parse-mode) for better message formatting.
You can format messages like this:

```ts
ctx.replyFmt(fmt`${bold("bold")} ${italic("italic")}`);
```

[Check it out](./parse-mode) if you're interested.
:::

## Plugin Summary

- Name: entity-parser
- Package: <https://jsr.io/@qz/telegram-entities-parser>
- Source: <https://github.com/quadratz/telegram-entities-parser>