annotate src/plugins/plugin_blog_import.py @ 2160:e67e8cd24141

core (tools/common): data objects first draft: this module aims is to help manipulate complex data from bridge, mainly for the template system. It is in common and not only in frontends as it may be used in some case by backend, if it needs to use template system in the future.
author Goffi <goffi@goffi.org>
date Tue, 21 Feb 2017 21:01:40 +0100
parents 33c8c4973743
children cdaa58e14553
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
1934
2daf7b4c6756 use of /usr/bin/env instead of /usr/bin/python in shebang
Goffi <goffi@goffi.org>
parents: 1852
diff changeset
1 #!/usr/bin/env python2
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
2 # -*- coding: utf-8 -*-
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
3
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
4 # SàT plugin for import external blogs
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
5 # Copyright (C) 2009-2016 Jérôme Poisson (goffi@goffi.org)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
6
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
7 # This program is free software: you can redistribute it and/or modify
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
8 # it under the terms of the GNU Affero General Public License as published by
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
9 # the Free Software Foundation, either version 3 of the License, or
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
10 # (at your option) any later version.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
11
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
12 # This program is distributed in the hope that it will be useful,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
13 # but WITHOUT ANY WARRANTY; without even the implied warranty of
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
14 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
15 # GNU Affero General Public License for more details.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
16
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
17 # You should have received a copy of the GNU Affero General Public License
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
18 # along with this program. If not, see <http://www.gnu.org/licenses/>.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
19
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
20 from sat.core.i18n import _
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
21 from sat.core.constants import Const as C
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
22 from sat.core.log import getLogger
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
23 log = getLogger(__name__)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
24 from twisted.internet import defer
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
25 from twisted.web import client as web_client
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
26 from twisted.words.xish import domish
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
27 from sat.core import exceptions
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
28 from sat.tools import xml_tools
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
29 import collections
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
30 import os
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
31 import os.path
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
32 import tempfile
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
33 import urlparse
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
34 import uuid
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
35
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
36
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
37 PLUGIN_INFO = {
2145
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
38 C.PI_NAME: "blog import",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
39 C.PI_IMPORT_NAME: "BLOG_IMPORT",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
40 C.PI_TYPE: C.PLUG_TYPE_BLOG,
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
41 C.PI_DEPENDENCIES: ["XEP-0060", "XEP-0277", "TEXT-SYNTAXES", "UPLOAD"],
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
42 C.PI_MAIN: "BlogImportPlugin",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
43 C.PI_HANDLER: "no",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 2108
diff changeset
44 C.PI_DESCRIPTION: _(u"""Blog import management:
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
45 This plugin manage the different blog importers which can register to it, and handler generic importing tasks.""")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
46 }
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
47
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
48 OPT_HOST = 'host'
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
49 OPT_UPLOAD_IMAGES = 'upload_images'
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
50 OPT_UPLOAD_IGNORE_HOST = 'upload_ignore_host'
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
51 OPT_IGNORE_TLS = 'ignore_tls_errors'
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
52 URL_REDIRECT_PREFIX = 'url_redirect_'
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
53 BOOL_OPTIONS = (OPT_UPLOAD_IMAGES, OPT_IGNORE_TLS)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
54
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
55
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
56 BlogImporter = collections.namedtuple('BlogImporter', ('callback', 'short_desc', 'long_desc'))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
57
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
58
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
59 class BlogImportPlugin(object):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
60
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
61 def __init__(self, host):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
62 log.info(_("plugin Blog Import initialization"))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
63 self.host = host
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
64 self._importers = {}
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
65 self._u = host.plugins['UPLOAD']
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
66 self._p = host.plugins['XEP-0060']
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
67 self._m = host.plugins['XEP-0277']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
68 self._s = self.host.plugins['TEXT-SYNTAXES']
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
69 host.bridge.addMethod("blogImport", ".plugin", in_sign='ssa{ss}ss', out_sign='s', method=self._blogImport, async=True)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
70 host.bridge.addMethod("blogImportList", ".plugin", in_sign='', out_sign='a(ss)', method=self.listImporters)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
71 host.bridge.addMethod("blogImportDesc", ".plugin", in_sign='s', out_sign='(ss)', method=self.getDescription)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
72
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
73 def getProgress(self, progress_id, profile):
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
74 client = self.host.getClient(profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
75 return client._blogImport_progress[progress_id]
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
76
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
77 def listImporters(self):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
78 importers = self._importers.keys()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
79 importers.sort()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
80 return [(name, self._importers[name].short_desc) for name in self._importers]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
81
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
82 def getDescription(self, name):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
83 """Return import short and long descriptions
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
84
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
85 @param name(unicode): blog importer name
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
86 @return (tuple[unicode,unicode]): short and long description
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
87 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
88 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
89 importer = self._importers[name]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
90 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
91 raise exceptions.NotFound(u"Blog importer not found [{}]".format(name))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
92 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
93 return importer.short_desc, importer.long_desc
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
94
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
95 def _blogImport(self, name, location, options, pubsub_service='', profile=C.PROF_KEY_DEFAULT):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
96 client = self.host.getClient(profile)
1839
cdecf553e051 frontends (jp/blog), plugin blog_import: fixes:
souliane <souliane@mailoo.org>
parents: 1831
diff changeset
97 options = {key: unicode(value) for key, value in options.iteritems()}
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
98 for option in BOOL_OPTIONS:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
99 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
100 options[option] = C.bool(options[option])
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
101 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
102 pass
1839
cdecf553e051 frontends (jp/blog), plugin blog_import: fixes:
souliane <souliane@mailoo.org>
parents: 1831
diff changeset
103 return self.blogImport(client, unicode(name), unicode(location), options)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
104
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
105 @defer.inlineCallbacks
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
106 def blogImport(self, client, name, location, options=None, pubsub_service=None):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
107 """Import a blog
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
108
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
109 @param name(unicode): name of the blog importer
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
110 @param location(unicode): location of the blog data to import
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
111 can be an url, a file path, or anything which make sense
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
112 check importer description for more details
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
113 @param options(dict, None): extra options. Below are the generic options,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
114 blog importer can have specific ones. All options have unicode values
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
115 generic options:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
116 - OPT_HOST (unicode): original host
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
117 - OPT_UPLOAD_IMAGES (bool): upload images to XMPP server if True
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
118 see OPT_UPLOAD_IGNORE_HOST.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
119 Default: True
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
120 - OPT_UPLOAD_IGNORE_HOST (unicode): don't upload images from this host
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
121 - OPT_IGNORE_TLS (bool): ignore TLS error for image upload.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
122 Default: False
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
123 @param pubsub_service(jid.JID, None): jid of the PubSub service where blog must be imported
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
124 None to use profile's server
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
125 @return (unicode): progress id
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
126 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
127 if options is None:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
128 options = {}
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
129 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
130 for opt_name, opt_default in ((OPT_UPLOAD_IMAGES, True),
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
131 (OPT_IGNORE_TLS, False)):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
132 # we want an filled options dict, with all empty or False values removed
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
133 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
134 value =options[opt_name]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
135 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
136 if opt_default:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
137 options[opt_name] = opt_default
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
138 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
139 if not value:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
140 del options[opt_name]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
141 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
142 importer = self._importers[name]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
143 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
144 raise exceptions.NotFound(u"Importer [{}] not found".format(name))
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
145 posts_data, posts_count = yield importer.callback(client, location, options)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
146 url_redirect = {}
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
147 progress_id = unicode(uuid.uuid4())
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
148 try:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
149 progress_data = client._blogImport_progress
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
150 except AttributeError:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
151 progress_data = client._blogImport_progress = {}
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
152 progress_data[progress_id] = {u'position': '0'}
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
153 if posts_count is not None:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
154 progress_data[progress_id]['size'] = unicode(posts_count)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
155 metadata = {'name': u'{}: {}'.format(name, location),
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
156 'direction': 'out',
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
157 'type': 'BLOG_IMPORT'
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
158 }
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
159 self.host.registerProgressCb(progress_id, self.getProgress, metadata, profile=client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
160 self.host.bridge.progressStarted(progress_id, metadata, client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
161 self._recursiveImport(client, posts_data, progress_id, options, url_redirect)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
162 defer.returnValue(progress_id)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
163
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
164 @defer.inlineCallbacks
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
165 def _recursiveImport(self, client, posts_data, progress_id, options, url_redirect, service=None, node=None, depth=0):
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
166 """Do the upload recursively
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
167
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
168 @param posts_data(list): list of data as specified in [register]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
169 @param options(dict): import options
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
170 @param url_redirect(dict): link between former posts and new items
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
171 @param service(jid.JID, None): PubSub service to use
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
172 @param node(unicode, None): PubSub node to use
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
173 @param depth(int): level of recursion
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
174 """
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
175 for idx, data in enumerate(posts_data):
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
176 # data checks/filters
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
177 mb_data = data['blog']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
178 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
179 item_id = mb_data['id']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
180 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
181 item_id = mb_data['id'] = unicode(uuid.uuid4())
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
182
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
183 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
184 # we keep the link between old url and new blog item
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
185 # so the user can redirect its former blog urls
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
186 old_uri = data['url']
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
187 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
188 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
189 else:
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
190 new_uri = url_redirect[old_uri] = self._p.getNodeURI(
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
191 service if service is not None else client.jid.userhostJID(),
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
192 node or self._m.namespace,
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
193 item_id)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
194 log.info(u"url link from {old} to {new}".format(
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
195 old=old_uri, new=new_uri))
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
196
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
197 yield self.blogFilters(client, mb_data, options)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
198
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
199 # comments data
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
200 if len(data['comments']) != 1:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
201 raise NotImplementedError(u"can't manage multiple comment links")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
202 allow_comments = C.bool(mb_data.get('allow_comments', C.BOOL_FALSE))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
203 if allow_comments:
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
204 comments_service, comments_node = self._m.getCommentsService(client), self._m.getCommentsNode(item_id)
1852
569c48e4cf0d plugin blog import: mb_data handle unicode only, so comments_service must be a unicode and not a JID
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
205 mb_data['comments_service'] = comments_service.full()
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
206 mb_data['comments_node'] = comments_node
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
207 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
208 if data['comments'][0]:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
209 raise exceptions.DataError(u"allow_comments set to False, but comments are there")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
210
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
211 # post upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
212 depth or log.debug(u"uploading item [{id}]: {title}".format(id=mb_data['id'], title=mb_data.get('title','')))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
213 yield self._m.send(mb_data, service, node, profile=client.profile)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
214
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
215 # comments upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
216 depth or log.debug(u"uploading comments")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
217 if allow_comments:
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
218 yield self._recursiveImport(client, data['comments'][0], progress_id, options, url_redirect, service=comments_service, node=comments_node, depth=depth+1)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
219 if depth == 0:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
220 client._blogImport_progress[progress_id]['position'] = unicode(idx+1)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
221
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
222 if depth == 0:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
223 self.host.bridge.progressFinished(progress_id,
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
224 {u'{}{}'.format(URL_REDIRECT_PREFIX, old): new for old, new in url_redirect.iteritems()},
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
225 client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
226 self.host.removeProgressCb(progress_id, client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
227 del client._blogImport_progress[progress_id]
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
228
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
229 @defer.inlineCallbacks
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
230 def blogFilters(self, client, mb_data, options):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
231 """Apply filters according to options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
232
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
233 modify mb_data in place
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
234 @param posts_data(list[dict]): data as returned by importer callback
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
235 @param options(dict): dict as given in [blogImport]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
236 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
237 # FIXME: blog filters don't work on text content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
238 # TODO: text => XHTML conversion should handler links with <a/>
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
239 # filters can then be used by converting text to XHTML
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
240 if not options:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
241 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
242
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
243 # we want only XHTML content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
244 for prefix in ('content',): # a tuple is use, if title need to be added in the future
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
245 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
246 rich = mb_data['{}_rich'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
247 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
248 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
249 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
250 if '{}_xhtml'.format(prefix) in mb_data:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
251 raise exceptions.DataError(u"importer gave {prefix}_rich and {prefix}_xhtml at the same time, this is not allowed".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
252 # we convert rich syntax to XHTML here, so we can handle filters easily
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
253 converted = yield self._s.convert(rich, self._s.getCurrentSyntax(client.profile), safe=False)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
254 mb_data['{}_xhtml'.format(prefix)] = converted
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
255 del mb_data['{}_rich'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
256
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
257 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
258 mb_data['txt']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
259 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
260 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
261 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
262 if '{}_xhtml'.format(prefix) in mb_data:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
263 log.warning(u"{prefix}_text will be replaced by converted {prefix}_xhtml, so filters can be handled".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
264 del mb_data['{}_text'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
265 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
266 log.warning(u"importer gave a text {prefix}, blog filters don't work on text {prefix}".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
267 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
268
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
269 # at this point, we have only XHTML version of content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
270 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
271 top_elt = xml_tools.ElementParser()(mb_data['content_xhtml'], namespace=C.NS_XHTML)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
272 except domish.ParserError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
273 # we clean the xml and try again our luck
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
274 cleaned = yield self._s.cleanXHTML(mb_data['content_xhtml'])
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
275 top_elt = xml_tools.ElementParser()(cleaned, namespace=C.NS_XHTML)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
276 opt_host = options.get(OPT_HOST)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
277 if opt_host:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
278 # we normalise the domain
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
279 parsed_host = urlparse.urlsplit(opt_host)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
280 opt_host = urlparse.urlunsplit((parsed_host.scheme or 'http', parsed_host.netloc or parsed_host.path, '', '', ''))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
281
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
282 tmp_dir = tempfile.mkdtemp()
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
283 try:
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1839
diff changeset
284 # TODO: would be nice to also update the hyperlinks to these images, e.g. when you have <a href="{url}"><img src="{url}"></a>
2108
70f23bc7859b core (xml_tools): fixed findAll
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
285 for img_elt in xml_tools.findAll(top_elt, names=[u'img']):
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
286 yield self.imgFilters(client, img_elt, options, opt_host, tmp_dir)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
287 finally:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
288 os.rmdir(tmp_dir) # XXX: tmp_dir should be empty, or something went wrong
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
289
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
290 # we now replace the content with filtered one
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
291 mb_data['content_xhtml'] = top_elt.toXml()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
292
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
293 @defer.inlineCallbacks
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
294 def imgFilters(self, client, img_elt, options, opt_host, tmp_dir):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
295 """Filters handling images
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
296
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
297 url without host are fixed (if possible)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
298 according to options, images are uploaded to XMPP server
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
299 @param img_elt(domish.Element): <img/> element to handle
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
300 @param options(dict): filters options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
301 @param opt_host(unicode): normalised host given in options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
302 @param tmp_dir(str): path to temp directory
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
303 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
304 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
305 url = img_elt['src']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
306 if url[0] == u'/':
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
307 if not opt_host:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
308 log.warning(u"host was not specified, we can't deal with src without host ({url}) and have to ignore the following <img/>:\n{xml}"
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
309 .format(url=url, xml=img_elt.toXml()))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
310 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
311 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
312 url = urlparse.urljoin(opt_host, url)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
313 filename = url.rsplit('/',1)[-1].strip()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
314 if not filename:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
315 raise KeyError
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
316 except (KeyError, IndexError):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
317 log.warning(u"ignoring invalid img element: {}".format(img_elt.toXml()))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
318 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
319
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
320 # we change the url for the normalized one
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
321 img_elt['src'] = url
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
322
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
323 if options.get(OPT_UPLOAD_IMAGES, False):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
324 # upload is requested
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
325 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
326 ignore_host = options[OPT_UPLOAD_IGNORE_HOST]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
327 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
328 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
329 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
330 # host is the ignored one, we skip
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
331 parsed_url = urlparse.urlsplit(url)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
332 if ignore_host in parsed_url.hostname:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
333 log.info(u"Don't upload image at {url} because of {opt} option".format(
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
334 url=url, opt=OPT_UPLOAD_IGNORE_HOST))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
335 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
336
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
337 # we download images and re-upload them via XMPP
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
338 tmp_file = os.path.join(tmp_dir, filename).encode('utf-8')
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
339 upload_options = {'ignore_tls_errors': options.get(OPT_IGNORE_TLS, False)}
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
340
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
341 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
342 yield web_client.downloadPage(url.encode('utf-8'), tmp_file)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
343 filename = filename.replace(u'%', u'_') # FIXME: tmp workaround for a bug in prosody http upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
344 dummy, download_d = yield self._u.upload(client, tmp_file, filename, options=upload_options)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
345 download_url = yield download_d
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
346 except Exception as e:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
347 log.warning(u"can't download image at {url}: {reason}".format(url=url, reason=e))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
348 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
349 img_elt['src'] = download_url
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
350
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
351 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
352 os.unlink(tmp_file)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
353 except OSError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
354 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
355
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
356 def register(self, name, callback, short_desc='', long_desc=''):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
357 """Register a blogImport method
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
358
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
359 @param name(unicode): unique importer name, should indicate the blogging software it handler and always lowercase
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
360 @param callback(callable): method to call:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
361 the signature must be (client, location, options) (cf. [blogImport])
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
362 the importer must return a tuple with (posts_data, posts_count)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
363
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
364 posts_data is an iterable of dict which must have the following keys:
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
365 'blog' (dict): microblog data of the blog post (cf. http://wiki.goffi.org/wiki/Bridge_API_-_Microblogging/en)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
366 the importer MUST NOT create node or call XEP-0277 plugin itself
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
367 'comments*' key MUST NOT be used in this microblog_data, see bellow for comments
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
368 It is recommanded to use a unique id in the "id" key which is constant per blog item,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
369 so if the import fail, a new import will overwrite the failed items and avoid duplicates.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
370
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
371 'comments' (list[list[dict]],None): Dictionaries must have the same keys as main item (i.e. 'blog' and 'comments')
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
372 a list of list is used because XEP-0277 can handler several comments nodes,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
373 but in most cases, there will we only one item it the first list (something like [[{comment1_data},{comment2_data}, ...]])
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
374 blog['allow_comments'] must be True if there is any comment, and False (or not present) if comments are not allowed.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
375 If allow_comments is False and some comments are present, a exceptions.DataError will be raised
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
376 the import MAY optionally have the following keys:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
377 'url' (unicode): former url of the post (only the path, without host part)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
378 if present the association to the new path will be displayed to user, so it can make redirections if necessary
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
379
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
380 posts_count (int, None) indicate the total number of posts (without comments)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
381 useful to display a progress indicator when the iterator is a generator
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
382 use None if you can't guess the total number of blog posts
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
383 @param short_desc(unicode): one line description of the importer
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
384 @param long_desc(unicode): long description of the importer, its options, etc.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
385 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
386 name = name.lower()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
387 if name in self._importers:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
388 raise exceptions.ConflictError(u"A blog importer with the name {} already exsit".format(name))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
389 self._importers[name] = BlogImporter(callback, short_desc, long_desc)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
390
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
391 def unregister(self, name):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
392 del self._importers[name]