
markdown escaping: Fix injection of escapes #1822

Merged: andersonhc merged 2 commits into py-pdf:master from sthibaul:escape on May 19, 2026

sthibaul commented Apr 16, 2026

c437630 ("Fix markdown character escaping bug (issue #1236)")

fixed the escaping of multiple markdown markers, but it no longer halved the number of escape characters being injected:

-                        txt_frag[: -((num_escape_chars + 1) // 2)]

+                    for _ in range(escape_run - 1):

This restores halving the number of escapes and restores the test results accordingly, bringing the behavior closer to https://spec.commonmark.org/dingus/
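The halving rule can be illustrated with a small standalone sketch. It is not the fpdf2 implementation: the helper names `split_escape_run` and `trim_trailing_escapes` are hypothetical, and the second helper assumes that the fragment ends with the whole run of backslashes, which is only a guess at the context of the removed line.

```python
def split_escape_run(run_length: int) -> tuple[int, bool]:
    """Split a run of backslashes that precedes a markdown marker.

    Each pair of backslashes yields one literal backslash; a leftover
    odd backslash escapes the marker that follows.
    (Hypothetical helper, for illustration only.)
    """
    literal_backslashes = run_length // 2
    marker_is_escaped = run_length % 2 == 1
    return literal_backslashes, marker_is_escaped


def trim_trailing_escapes(fragment: str, run_length: int) -> str:
    """Drop (run_length + 1) // 2 trailing characters, as in the removed slice.

    Assumes `fragment` ends with `run_length` backslashes, so that
    run_length // 2 of them remain afterwards.
    """
    if run_length == 0:
        # fragment[:-0] would be empty, so return the fragment untouched.
        return fragment
    return fragment[: -((run_length + 1) // 2)]
```

Under these assumptions, a run of three backslashes in front of `**` leaves one literal backslash and an escaped (literal) `**`, while a run of two leaves one literal backslash and an active marker.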

Fixes: #1215

  • A unit test is covering the code added / modified by this PR

  • In case of a new feature, docstrings have been added, with also some documentation in the docs/ folder

  • A mention of the change is present in CHANGELOG.md

  • This PR is ready to be merged

By submitting this pull request, I confirm that my contribution is made under the terms of the GNU LGPL 3.0 license.

andersonhc (Collaborator) commented

Thanks for opening this PR @sthibaul , and sorry it has taken me so long to respond.

What I’m trying to evaluate is that this PR introduces new behavior for backslashes that are not escaping markdown markers. For example, \\abc renders as \abc with this PR, while in every older fpdf2 version I tested it renders as \\abc.

I understand this moves us closer to CommonMark-style escaping, but it also changes existing behavior for users of markdown=True, so I’m not sure yet how to move forward.

I am currently leaning towards merging it. @Lucas-C or @CoLa5, if you have any input, I would appreciate it.

CoLa5 commented May 16, 2026

  1. To make the discussion clear:

    • Everything in "…" (backticked quotes) means a python string:
      "Python string"

    • Everything shown as a > block quote with a [PDF] prefix means "as displayed in any PDF reader" (and as shown when previewing Markdown text rendered to HTML), not the raw Markdown:

      [PDF]
      Text as displayed in PDF

  2. Since it is about markdown, for FPDF.cell and FPDF.multi_cell, markdown=True is assumed!

  3. In a Python string literal, a single backslash "\" is the escape character: combined with the character(s) that follow it, it forms a single character:
    len("\a") == len("\n") == len("\\") == 1 (cf. definitions).

  4. To complicate matters:
    For undefined escape sequences, the single backslash is kept as an ordinary character:

    >>> len("\*") == 2
    <stdin>:1: SyntaxWarning: invalid escape sequence '\*'
    True

    So the combination is treated as two separate characters. The expression is accompanied by a SyntaxWarning, signalling that "\*" is not a valid escape sequence in Python, even though the literal is still accepted.

  5. This issue is closely related to the discussion in "Markdown not escaping MD special characters" (#1215). To avoid mixing the two, I would keep the question of whether FPDF should comply with CommonMark there, because it is a design decision with no right or wrong answer.
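Points 3 and 4 can be checked directly in Python. The following sketch uses only the standard language; the variable name `escaped_bold` is chosen here for illustration. It also shows that a raw string is one way to write a backslash followed by a marker without triggering a SyntaxWarning.

```python
# A defined escape sequence collapses to a single character.
assert len("\\") == 1
assert len("\n") == 1

# A raw string keeps the backslash as an ordinary character, without a warning.
assert len(r"\*") == 2

# Escaping the backslash explicitly gives the same two characters.
assert "\\*" == r"\*"

# The markdown text \**not bold\** written as a Python literal:
escaped_bold = "\\**not bold\\**"
assert escaped_bold == r"\**not bold\**"
```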

Consequently, here I am assuming the current FPDF-specified markdown syntax as being defined by:

  • double character-based style markers ("**", "--", "__", "~~") and
  • a single backslash character ("\\") as markdown escape marker.

Coming to the issue:

Starting with the simple escape case:

Python                PDF
"\\**not bold\\**"    [PDF] **not bold**

The case defines clearly that a leading backslash character "\\" escapes the
double character-based style markers.

The arising questions are:

  • How to show a single backslash character \ in FPDF?

  • How to escape the backslash character directly in front of a style marker to
    print:

    [PDF]
    \bold text

    but not escape the style marker.

For me, both questions lead to the same answer: escape the escape character to print a backslash:

Question  Python                 PDF
1         "\\\\"                 [PDF] \
2         "\\\\**bold** text"    [PDF] \bold text

Currently (before this pull request), escaping the markdown escape character (case 2) is handled incorrectly, because the escaped backslash is printed as:

[PDF]
\\bold text

making it impossible to show a single backslash in front of a markdown style marker.

This pull request seems to fix this issue, so I would accept it. All versions between c437630 and now contain a bug introduced by that commit, which this pull request fixes.
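The three table cases above can be reproduced with a toy model of the escaping rule. This is a minimal sketch, not fpdf2's parser: `visible_text` is a hypothetical function that only computes the characters that would be displayed, drops unescaped style markers, and keeps a lone backslash as-is (an assumption, since that case is still open in this thread).

```python
MARKERS = ("**", "--", "__", "~~")  # double-character style markers


def visible_text(text: str) -> str:
    """Return the characters a reader would see, ignoring styling.

    Toy model: a backslash escapes a following backslash or style marker.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            if text[i + 1 : i + 2] == "\\":
                out.append("\\")  # escaped backslash prints one backslash
                i += 2
            elif text[i + 1 : i + 3] in MARKERS:
                out.append(text[i + 1 : i + 3])  # escaped marker is literal
                i += 3
            else:
                out.append("\\")  # lone backslash kept (assumption)
                i += 1
        elif text[i : i + 2] in MARKERS:
            i += 2  # unescaped marker toggles a style and is not printed
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this model, `"\\\\**bold** text"` yields a single backslash followed by `bold text`, which matches case 2 above.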


The only remaining question is:
If markdown=True, how should a lone markdown escape character that is not followed by a markdown marker be handled?

I would suggest following Python's approach: accept them and print them as if they were escaped, emit a SyntaxWarning, and always return them escaped when using FPDF.multi_cell(..., dry_run=True, output="LINES"):

>>> pdf.multi_cell(
...     w=pdf.epw,
...     text="\\ example \\**text \\\\__case__",
...     dry_run=True,
...     output="LINES",
... )
["\\\\ example \\**text \\\\__case__"]

but I will open another issue for this!
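As a rough illustration of that suggestion, the sketch below doubles every lone backslash and emits a SyntaxWarning for it. The function `normalize_lone_escapes` is hypothetical and not part of fpdf2; it only models the proposed normalization of the text, not the rendering.

```python
import warnings

MARKERS = ("**", "--", "__", "~~")  # double-character style markers


def normalize_lone_escapes(text: str) -> str:
    """Return `text` with lone backslashes doubled, warning for each one.

    Hypothetical helper: backslashes that already escape a backslash or a
    style marker are left untouched.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(text[i])
            i += 1
        elif text[i + 1 : i + 2] == "\\":
            out.append("\\\\")  # already an escaped backslash
            i += 2
        elif text[i + 1 : i + 3] in MARKERS:
            out.append(text[i : i + 3])  # already an escaped marker
            i += 3
        else:
            warnings.warn(
                f"lone markdown escape character at index {i}",
                SyntaxWarning,
                stacklevel=2,
            )
            out.append("\\\\")  # return it in escaped form
            i += 1
    return "".join(out)
```

Applied to the text from the example above, this returns the same escaped string as the proposed dry-run output and emits one warning, for the first backslash.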

andersonhc merged commit a25e56f into py-pdf:master on May 19, 2026
23 checks passed