annotate src/plugins/plugin_misc_text_syntaxes.py @ 1955:633b5c21aefd

backend, frontend: messages refactoring (huge commit, not finished): /!\ database schema has been modified, do a backup before updating message have been refactored, here are the main changes: - languages are now handled - all messages have an uid (internal to SàT) - message updating is anticipated - subject is now first class - new naming scheme is used newMessage => messageNew, getHistory => historyGet, sendMessage => messageSend - minimal compatibility refactoring in quick_frontend/Primitivus, better refactoring should follow - threads handling - delayed messages are saved into history - info messages may also be saved in history (e.g. to keep track of people joining/leaving a room) - duplicate messages should be avoided - historyGet return messages in right order, no need to sort again - plugins have been updated to follow new features, some of them need to be reworked (e.g. OTR) - XEP-0203 (Delayed Delivery) is now fully handled in core, the plugin just handle disco and creation of a delay element - /!\ jp and Libervia are currently broken, as some features of Primitivus It has been put in one huge commit to avoid breaking messaging between changes. This is the main part of message refactoring, other commits will follow to take profit of the new features/behaviour.
author Goffi <goffi@goffi.org>
date Tue, 24 May 2016 22:11:04 +0200
parents 2daf7b4c6756
children 5874da3811b7
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
1934
2daf7b4c6756 use of /usr/bin/env instead of /usr/bin/python in shebang
Goffi <goffi@goffi.org>
parents: 1867
diff changeset
1 #!/usr/bin/env python2
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
2 # -*- coding: utf-8 -*-
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
3
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
4 # SAT plugin for managing various text syntaxes
1766
d17772b0fe22 copyright update
Goffi <goffi@goffi.org>
parents: 1542
diff changeset
5 # Copyright (C) 2009-2016 Jérôme Poisson (goffi@goffi.org)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
6
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
7 # This program is free software: you can redistribute it and/or modify
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
8 # it under the terms of the GNU Affero General Public License as published by
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
9 # the Free Software Foundation, either version 3 of the License, or
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
10 # (at your option) any later version.
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
11
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
12 # This program is distributed in the hope that it will be useful,
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
13 # but WITHOUT ANY WARRANTY; without even the implied warranty of
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
14 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
15 # GNU Affero General Public License for more details.
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
16
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
17 # You should have received a copy of the GNU Affero General Public License
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
18 # along with this program. If not, see <http://www.gnu.org/licenses/>.
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
19
771
bfabeedbf32e core: i18n refactoring:
Goffi <goffi@goffi.org>
parents: 744
diff changeset
20 from sat.core.i18n import _, D_
993
301b342c697a core: use of the new core.log module:
Goffi <goffi@goffi.org>
parents: 968
diff changeset
21 from sat.core.log import getLogger
301b342c697a core: use of the new core.log module:
Goffi <goffi@goffi.org>
parents: 968
diff changeset
22 log = getLogger(__name__)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
23
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
24 from twisted.internet import defer
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
25 from twisted.internet.threads import deferToThread
705
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
26 from sat.core import exceptions
1542
94901070478e plugins: added new MissingModule exceptions to plugins using third party modules
Goffi <goffi@goffi.org>
parents: 1458
diff changeset
27 try:
94901070478e plugins: added new MissingModule exceptions to plugins using third party modules
Goffi <goffi@goffi.org>
parents: 1458
diff changeset
28 from lxml import html
94901070478e plugins: added new MissingModule exceptions to plugins using third party modules
Goffi <goffi@goffi.org>
parents: 1458
diff changeset
29 from lxml.html import clean
94901070478e plugins: added new MissingModule exceptions to plugins using third party modules
Goffi <goffi@goffi.org>
parents: 1458
diff changeset
30 except ImportError:
94901070478e plugins: added new MissingModule exceptions to plugins using third party modules
Goffi <goffi@goffi.org>
parents: 1458
diff changeset
31 raise exceptions.MissingModule(u"Missing module lxml, please download/install it from http://lxml.de/")
832
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
32 from cgi import escape
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
33 import re
674
fb0b1100c908 plugin text_syntaxes: fixed clean_xhml (it now return XHTML instead of HTML)
Goffi <goffi@goffi.org>
parents: 665
diff changeset
34
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
35
771
bfabeedbf32e core: i18n refactoring:
Goffi <goffi@goffi.org>
parents: 744
diff changeset
36 CATEGORY = D_("Composition")
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
37 NAME = "Syntax"
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
38 _SYNTAX_XHTML = "XHTML"
744
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
39 _SYNTAX_CURRENT = "@CURRENT@"
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
40
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
41 # TODO: check/adapt following list
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
42 # list initialy based on feedparser list (http://pythonhosted.org/feedparser/html-sanitization.html)
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
43 STYLES_WHITELIST = ("azimuth", "background-color", "border-bottom-color", "border-collapse", "border-color", "border-left-color", "border-right-color", "border-top-color", "clear", "color", "cursor", "direction", "display", "elevation", "float", "font", "font-family", "font-size", "font-style", "font-variant", "font-weight", "height", "letter-spacing", "line-height", "overflow", "pause", "pause-after", "pause-before", "pitch", "pitch-range", "richness", "speak", "speak-header", "speak-numeral", "speak-punctuation", "speech-rate", "stress", "text-align", "text-decoration", "text-indent", "unicode-bidi", "vertical-align", "voice-family", "volume", "white-space", "width")
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
44
1867
47108a4f3a70 plugin text syntaxes: added 'poster' and 'controls' in allowed attributes for cleanXHTML
Goffi <goffi@goffi.org>
parents: 1847
diff changeset
45 SAFE_ATTRS = html.defs.safe_attrs.union(('style', 'poster', 'controls'))
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
46 STYLES_VALUES_REGEX = r'^(' + '|'.join(['([a-z-]+)', # alphabetical names
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
47 '(#[0-9a-f]+)', # hex value
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
48 '(\d+(.\d+)? *(|%|em|ex|px|in|cm|mm|pt|pc))', # values with units (or not)
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
49 'rgb\( *((\d+(.\d+)?), *){2}(\d+(.\d+)?) *\)', # rgb function
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
50 'rgba\( *((\d+(.\d+)?), *){3}(\d+(.\d+)?) *\)', # rgba function
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
51 ]) + ') *(!important)?$' # we accept "!important" at the end
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
52 STYLES_ACCEPTED_VALUE = re.compile(STYLES_VALUES_REGEX)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
53
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
54 PLUGIN_INFO = {
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
55 "name": "Text syntaxes",
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
56 "import_name": "TEXT-SYNTAXES",
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
57 "type": "MISC",
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
58 "protocols": [],
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
59 "dependencies": [],
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
60 "main": "TextSyntaxes",
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
61 "handler": "no",
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
62 "description": _("""Management of various text syntaxes (XHTML-IM, Markdown, etc)""")
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
63 }
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
64
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
65 class UnknownSyntax(Exception):
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
66 pass
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
67
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
68 class TextSyntaxes(object):
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
69 """ Text conversion class
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
70 XHTML utf-8 is used as intermediate language for conversions
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
71 """
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
72
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
73 OPT_DEFAULT = "DEFAULT"
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
74 OPT_HIDDEN = "HIDDEN"
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
75 OPT_NO_THREAD = "NO_THREAD"
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
76 SYNTAX_XHTML = _SYNTAX_XHTML
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
77 SYNTAX_MARKDOWN = "markdown"
832
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
78 SYNTAX_TEXT = "text"
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
79
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
80 params = """
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
81 <params>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
82 <individual>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
83 <category name="%(category_name)s" label="%(category_label)s">
968
75f3b3b430ff tools, frontends, memory: param definition and XMLUI handle multi-selection for list widgets:
souliane <souliane@mailoo.org>
parents: 852
diff changeset
84 <param name="%(name)s" label="%(label)s" type="list" security="0">
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
85 %(options)s
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
86 </param>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
87 </category>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
88 </individual>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
89 </params>
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
90 """
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
91
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
92 params_data = {
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
93 'category_name': CATEGORY,
771
bfabeedbf32e core: i18n refactoring:
Goffi <goffi@goffi.org>
parents: 744
diff changeset
94 'category_label': _(CATEGORY),
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
95 'name': NAME,
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
96 'label': _(NAME),
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
97 'syntaxes': {},
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
98 }
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
99
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
100 def __init__(self, host):
993
301b342c697a core: use of the new core.log module:
Goffi <goffi@goffi.org>
parents: 968
diff changeset
101 log.info(_("Text syntaxes plugin initialization"))
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
102 self.host = host
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
103 self.syntaxes = {}
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
104 self.addSyntax(self.SYNTAX_XHTML, lambda xhtml: defer.succeed(xhtml), lambda xhtml: defer.succeed(xhtml),
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
105 TextSyntaxes.OPT_NO_THREAD)
1826
d80ccf4bf201 plugin blog import dotclear: this plugin import Dotclear 2 backups
Goffi <goffi@goffi.org>
parents: 1811
diff changeset
106 # TODO: text => XHTML should add <a/> to url like in frontends
d80ccf4bf201 plugin blog import dotclear: this plugin import Dotclear 2 backups
Goffi <goffi@goffi.org>
parents: 1811
diff changeset
107 # it's probably best to move sat_frontends.tools.strings to sat.tools.common or similar
832
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
108 self.addSyntax(self.SYNTAX_TEXT, lambda text: escape(text), lambda xhtml: self._removeMarkups(xhtml), [TextSyntaxes.OPT_HIDDEN])
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
109 try:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
110 import markdown, html2text
841
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
111
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
112 def _html2text(html, baseurl=''):
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
113 h = html2text.HTML2Text(baseurl=baseurl)
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
114 h.body_width = 0 # do not truncate the lines, it breaks the long URLs
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
115 return h.handle(html)
831f208b4ea3 plugin text_syntaxes: html2text was breaking the long URLs
souliane <souliane@mailoo.org>
parents: 836
diff changeset
116 self.addSyntax(self.SYNTAX_MARKDOWN, markdown.markdown, _html2text, [TextSyntaxes.OPT_DEFAULT])
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
117 except ImportError:
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
118 log.warning(u"markdown or html2text not found, can't use Markdown syntax")
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
119 log.info(u"You can download/install them from https://pythonhosted.org/Markdown/ and https://github.com/Alir3z4/html2text/")
744
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
120 host.bridge.addMethod("syntaxConvert", ".plugin", in_sign='sssbs', out_sign='s',
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
121 async=True, method=self.convert)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
122
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
123 def _updateParamOptions(self):
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
124 data_synt = TextSyntaxes.params_data['syntaxes']
1847
d8c6c55aafd7 plugin text syntaxes: fixed defaut syntax handling
Goffi <goffi@goffi.org>
parents: 1826
diff changeset
125 default_synt = TextSyntaxes.params_data.get('default', _SYNTAX_XHTML)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
126 syntaxes = []
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
127
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
128 for syntax in data_synt.keys():
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
129 flags = data_synt[syntax]["flags"]
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
130 if TextSyntaxes.OPT_HIDDEN not in flags:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
131 syntaxes.append(syntax)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
132
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
133 syntaxes.sort(key=lambda synt: synt.lower())
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
134 options = []
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
135
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
136 for syntax in syntaxes:
1847
d8c6c55aafd7 plugin text syntaxes: fixed defaut syntax handling
Goffi <goffi@goffi.org>
parents: 1826
diff changeset
137 selected = 'selected="true"' if syntax == default_synt else ''
968
75f3b3b430ff tools, frontends, memory: param definition and XMLUI handle multi-selection for list widgets:
souliane <souliane@mailoo.org>
parents: 852
diff changeset
138 options.append(u'<option value="%s" %s/>' % (syntax, selected))
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
139
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
140 TextSyntaxes.params_data["options"] = u'\n'.join(options)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
141 self.host.memory.updateParams(TextSyntaxes.params % TextSyntaxes.params_data)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
142
702
a25db3fe3959 plugin XEP-0071: rich messages management for sendMessage
Goffi <goffi@goffi.org>
parents: 699
diff changeset
143 def getCurrentSyntax(self, profile):
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
144 """ Return the selected syntax for the given profile
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
145
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
146 @param profile: %(doc_profile)s
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
147 @return: profile selected syntax
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
148 """
699
e819630c6737 plugin text syntaxes: fixed bad parameter order for getParamA
Goffi <goffi@goffi.org>
parents: 695
diff changeset
149 return self.host.memory.getParamA(NAME, CATEGORY , profile_key=profile)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
150
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
151 def cleanXHTML(self, xhtml):
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
152 """ Clean XHTML text by removing potentially dangerous/malicious parts
705
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
153 @param xhtml: raw xhtml text to clean (or lxml's HtmlElement)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
154 """
674
fb0b1100c908 plugin text_syntaxes: fixed clean_xhml (it now return XHTML instead of HTML)
Goffi <goffi@goffi.org>
parents: 665
diff changeset
155 def blocking_cleaning(xhtml):
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
156 """ Clean XHTML and style attributes """
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
157
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
158 def clean_style(styles_raw):
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
159 """" Remove styles not in the whitelist,
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
160 or where the value doesn't match the regex """
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
161 styles = styles_raw.split(";")
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
162 cleaned_styles = []
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
163 for style in styles:
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
164 try:
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
165 key, value = style.split(':')
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
166 except ValueError:
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
167 continue
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
168 key = key.lower().strip()
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
169 if key not in STYLES_WHITELIST:
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
170 continue
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
171 value = value.lower().strip()
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
172 if not STYLES_ACCEPTED_VALUE.match(value):
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
173 continue
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
174 if value == "none":
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
175 continue
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
176 cleaned_styles.append((key, value))
1458
832846fefe85 plugin text-syntaxes: minor variable renaming
Goffi <goffi@goffi.org>
parents: 1396
diff changeset
177 return "; ".join(["%s: %s" % (key_, value_) for key_, value_ in cleaned_styles])
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
178
705
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
179 if isinstance(xhtml, basestring):
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
180 xhtml_elt = html.fromstring(xhtml)
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
181 elif isinstance(xhtml, html.HtmlElement):
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
182 xhtml_elt = xhtml
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
183 else:
993
301b342c697a core: use of the new core.log module:
Goffi <goffi@goffi.org>
parents: 968
diff changeset
184 log.error("Only strings and HtmlElements can be cleaned")
705
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
185 raise exceptions.DataError
674
fb0b1100c908 plugin text_syntaxes: fixed clean_xhml (it now return XHTML instead of HTML)
Goffi <goffi@goffi.org>
parents: 665
diff changeset
186 cleaner = clean.Cleaner(style=False,
fb0b1100c908 plugin text_syntaxes: fixed clean_xhml (it now return XHTML instead of HTML)
Goffi <goffi@goffi.org>
parents: 665
diff changeset
187 add_nofollow=False,
692
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
188 safe_attrs=SAFE_ATTRS)
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
189 xhtml_elt = cleaner.clean_html(xhtml_elt)
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
190 for elt in xhtml_elt.xpath("//*[@style]"):
e98db42cd78c plugin text syntaxes: styles sanitisation
Goffi <goffi@goffi.org>
parents: 674
diff changeset
191 elt.set("style", clean_style(elt.get('style')))
852
4cc55e05266d plugin text syntaxes: fixed cleaners encoding
Goffi <goffi@goffi.org>
parents: 841
diff changeset
192 return html.tostring(xhtml_elt, encoding=unicode, method='xml')
674
fb0b1100c908 plugin text_syntaxes: fixed clean_xhml (it now return XHTML instead of HTML)
Goffi <goffi@goffi.org>
parents: 665
diff changeset
193
705
6c8a119dcc94 plugin text syntaxes: clean_xhtml now accept lxml's HtmlElement to avoid parsing two times the same xml
Goffi <goffi@goffi.org>
parents: 702
diff changeset
194 return deferToThread(blocking_cleaning, xhtml)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
195
744
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
196 def convert(self, text, syntax_from, syntax_to=_SYNTAX_XHTML, safe=True, profile=None):
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
197 """Convert a text between two syntaxes
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
198
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
199 @param text: text to convert
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
200 @param syntax_from: source syntax (e.g. "markdown")
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
201 @param syntax_to: dest syntax (e.g.: "XHTML")
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
202 @param safe: clean resulting XHTML to avoid malicious code if True
744
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
203 @param profile: needed only when syntax_from or syntax_to is set to _SYNTAX_CURRENT
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
204 @return(unicode): converted text
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
205 """
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
206 # FIXME: convert should be abled to handle domish.Element directly
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
207 # when dealing with XHTML
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
208 # TODO: a way for parser to return parsing errors/warnings
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
209
744
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
210 if syntax_from == _SYNTAX_CURRENT:
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
211 syntax_from = self.getCurrentSyntax(profile)
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
212 if syntax_to == _SYNTAX_CURRENT:
312a2842b2b8 plugins text-syntaxes: added a default value to use the current user syntax in convert
souliane <souliane@mailoo.org>
parents: 705
diff changeset
213 syntax_to = self.getCurrentSyntax(profile)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
214 syntaxes = TextSyntaxes.params_data['syntaxes']
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
215 if syntax_from not in syntaxes:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
216 raise UnknownSyntax(syntax_from)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
217 if syntax_to not in syntaxes:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
218 raise UnknownSyntax(syntax_to)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
219 d = None
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
220
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
221 if TextSyntaxes.OPT_NO_THREAD in syntaxes[syntax_from]["flags"]:
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
222 d = defer.maybeDeferred(syntaxes[syntax_from]["to"], text)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
223 else:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
224 d = deferToThread(syntaxes[syntax_from]["to"], text)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
225
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
226 #TODO: keep only body element and change it to a div here ?
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
227
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
228 if safe:
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
229 d.addCallback(self.cleanXHTML)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
230
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
231 if TextSyntaxes.OPT_NO_THREAD in syntaxes[syntax_to]["flags"]:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
232 d.addCallback(syntaxes[syntax_to]["from"])
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
233 else:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
234 d.addCallback(lambda xhtml: deferToThread(syntaxes[syntax_to]["from"], xhtml))
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
235
836
2cc0201b4613 plugin text_syntaxes: rstrip the conversion result to avoid new lines systematically added by converters (e.g. html2text do this)
souliane <souliane@mailoo.org>
parents: 832
diff changeset
236 # converters can add new lines that disturb the microblog change detection
2cc0201b4613 plugin text_syntaxes: rstrip the conversion result to avoid new lines systematically added by converters (e.g. html2text do this)
souliane <souliane@mailoo.org>
parents: 832
diff changeset
237 d.addCallback(lambda text: text.rstrip())
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
238 return d
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
239
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
240 def addSyntax(self, name, to_xhtml_cb, from_xhtml_cb, flags = None):
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
241 """Add a new syntax to the manager
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
242
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
243 @param name: unique name of the syntax
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
244 @param to_xhtml_cb: callback to convert from syntax to XHTML
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
245 @param from_xhtml_cb: callback to convert from XHTML to syntax
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
246 @param flags: set of optional flags, can be:
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
247 TextSyntaxes.OPT_DEFAULT: use as the default syntax (replace former one)
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
248 TextSyntaxes.OPT_HIDDEN: do not show in parameters
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
249 TextSyntaxes.OPT_NO_THREAD: do not defer to thread when converting (the callback may then return a deferred)
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
250 """
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
251 flags = flags if flags is not None else []
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
252 if TextSyntaxes.OPT_HIDDEN in flags and TextSyntaxes.OPT_DEFAULT in flags:
1803
14a97a5fe1c0 plugin text syntaxes: a non blocking syntax callback can now return a unicode directly instead of a Deferred
Goffi <goffi@goffi.org>
parents: 1766
diff changeset
253 raise ValueError(u"{} and {} are mutually exclusive".format(TextSyntaxes.OPT_HIDDEN, TextSyntaxes.OPT_DEFAULT))
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
254
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
255 syntaxes = TextSyntaxes.params_data['syntaxes']
1805
3c40fa0dcd7a pluging text syntaxes: various minor improvments:
Goffi <goffi@goffi.org>
parents: 1803
diff changeset
256 if name in syntaxes:
1811
90ae9a2462c2 plugin text syntaxes: fixes typo
souliane <souliane@mailoo.org>
parents: 1805
diff changeset
257 raise exceptions.ConflictError(u"This syntax name already exists: {}".format(name))
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
258 syntaxes[name] = {"to": to_xhtml_cb, "from": from_xhtml_cb, "flags": flags}
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
259 if TextSyntaxes.OPT_DEFAULT in flags:
1847
d8c6c55aafd7 plugin text syntaxes: fixed defaut syntax handling
Goffi <goffi@goffi.org>
parents: 1826
diff changeset
260 TextSyntaxes.params_data['default'] = name
665
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
261
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
262 self._updateParamOptions()
6a64e0a759e6 plugin text syntaxes: this plugin manage rich text syntaxes conversions and cleaning.
Goffi <goffi@goffi.org>
parents:
diff changeset
263
832
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
264 def _removeMarkups(self, xhtml):
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
265 """
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
266 Remove XHTML markups from the given string.
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
267 @param xhtml: the XHTML string to be cleaned
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
268 @return: the cleaned string
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
269 """
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
270 cleaner = clean.Cleaner(kill_tags=['style'])
c4b22aedb7d7 plugin groupblog, XEP-0071, XEP-0277, text_syntaxes: manage raw/rich/xhtml data for content/title:
souliane <souliane@mailoo.org>
parents: 811
diff changeset
271 cleaned = cleaner.clean_html(html.fromstring(xhtml))
852
4cc55e05266d plugin text syntaxes: fixed cleaners encoding
Goffi <goffi@goffi.org>
parents: 841
diff changeset
272 return html.tostring(cleaned, encoding=unicode, method="text")