panettone: some unicode codepoints are converted into gibberish

#107
Opened by sterni at 2021-04-05T00·16+00

I tried to use the set union operator U+222A in b/104, but it went horribly wrong, as you can see.

Pasting it here again: .

  1. Interestingly, the email came through perfectly fine.

    grfn at 2021-04-06T02·01+00

  2. Seems to be caused by RENDER-MARKDOWN. Issue titles, for example, are completely fine, but I can reproduce the issue on the REPL:

    * (in-package :panettone)
    #<PACKAGE "PANETTONE">
    * (render-markdown "⊂∫∪")
    "<p>ââ«âª</p>
    "
    

    sterni at 2021-04-06T08·38+00

  3. Turns out the problem is caused by drakma, or more specifically flexi-streams, which changes read-char in such a way that it chokes on UTF-8 (even though the external format is set to utf-8):

    (let
      ((s (drakma:http-request
           "http://localhost:4238/markdown"
           :method :post :content-type "application/json"
           :accept "application/json"
           :content "{ \"markdown\": \"∪nion\" }"
           :external-format-out :utf-8 :external-format-in :utf-8
           :want-stream t)))
      (loop for x = (read-char s nil nil) until (not x) collect x))
    

    Results in:

    (#\{ #\" #\m #\a #\r #\k #\d #\o #\w #\n #\" #\: #\" #\< #\p #\>
     #\LATIN_SMALL_LETTER_A_WITH_CIRCUMFLEX #\Character-Tabulation-Set
     #\FEMININE_ORDINAL_INDICATOR #\n #\i #\o #\n #\< #\/ #\p #\> #\\ #\n #\" #\})
    

    sterni at 2021-04-06T11·32+00

  4. sterni closed this issue at 2021-04-09T19·12+00
  5. This still happens in a different case: a string consisting of #\MANTELPIECE_CLOCK and #\VARIATION_SELECTOR-16, i.e. ὗ0️ (it comes out as gibberish here, but is saved correctly, as can be observed in the text editor):

    CL-USER> (in-package :panettone)
    #<PACKAGE "PANETTONE">
    PANETTONE> (render-markdown "`ὗ0️`")
    "<p><code>ὗ0️</code></p>
    "
    

    The variation selector doesn't seem to be the problem (in fact it is preserved correctly in the example if you check carefully), but rather #\MANTELPIECE_CLOCK in itself.

    sterni at 2021-07-25T22·15+00

  6. sterni reopened this issue at 2021-07-25T22·15+00
  7. The partial fix with some additional context was 435b883f5cf1839cb2f4089e2bfa6e2e5427aced.

    sterni at 2021-07-25T22·18+00

  8. b/145

    sterni at 2021-09-08T15·30+00

  9. sterni closed this issue at 2021-09-08T15·30+00
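
A note on the output in comment 3: the three characters read back in place of ∪ (U+00E2, U+0088, U+00AA) are what one gets when the UTF-8 bytes of U+222A are decoded as Latin-1, one byte per character. The Python sketch below only reproduces that mapping. It does not use drakma or flexi-streams, the function name is made up for illustration, and it makes no claim about where in the stack the misdecoding actually happened.

```python
def misdecode_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode the bytes as Latin-1.

    Hypothetical helper for illustration only; not part of panettone.
    """
    return text.encode("utf-8").decode("latin-1")


# U+222A (the union sign from the report) is three bytes in UTF-8: E2 88 AA.
garbled = misdecode_utf8_as_latin1("∪")

# Each byte becomes its own character when read as Latin-1.
print([hex(ord(c)) for c in garbled])  # ['0xe2', '0x88', '0xaa']

# Plain ASCII is unaffected, which fits "nion" surviving in comment 3.
print(misdecode_utf8_as_latin1("nion"))  # nion
```

U+0088 is a C1 control character with no visible glyph, which would explain why the rendered `<p>` in comment 2 appears to contain fewer characters than the nine that three 3-byte sequences produce.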