annotate src/plugins/plugin_blog_import.py @ 1831:68c0dc13d821

plugin blog import, XEP-0277: progress + redirect:
- progression is now handled
- url redirections are handled with PubSub URIs, and returned as metadata with progressFinished
- tmp_dir is cleaned in a finally clause
author Goffi <goffi@goffi.org>
date Sat, 23 Jan 2016 20:01:28 +0100
parents 4e51f21c687f
children cdecf553e051
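
The progress bookkeeping mentioned in the changeset description can be sketched outside of SàT as follows. This is a minimal standalone illustration under stated assumptions, not the plugin's code: `FakeClient`, `start_progress`, `advance` and `get_progress` are hypothetical names. Only the shape of the stored data (a per-client dict keyed by a UUID progress id, holding a string `position` and an optional string `size`) is taken from the source in this file.

```python
import uuid


class FakeClient(object):
    """Hypothetical stand-in for the client object (assumption)."""


def start_progress(client, posts_count=None):
    """Create a progress entry on the client and return its id."""
    progress_id = str(uuid.uuid4())
    try:
        progress_data = client._blogImport_progress
    except AttributeError:
        # first import for this client: create the storage lazily
        progress_data = client._blogImport_progress = {}
    progress_data[progress_id] = {'position': '0'}
    if posts_count is not None:
        # the total is only stored when the importer can count its posts
        progress_data[progress_id]['size'] = str(posts_count)
    return progress_id


def advance(client, progress_id):
    """Hypothetical helper: record that one more post was imported."""
    entry = client._blogImport_progress[progress_id]
    entry['position'] = str(int(entry['position']) + 1)


def get_progress(client, progress_id):
    """Return the stored progress dict for the given id."""
    return client._blogImport_progress[progress_id]


client = FakeClient()
pid = start_progress(client, posts_count=3)
advance(client, pid)
print(get_progress(client, pid))
```

Values are kept as strings because the progress data is meant to cross the bridge as a string-to-string mapping; how the plugin itself advances `position` is not visible in this part of the file.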
The source below combines lines from rev 1825 (4e51f21c687f, "plugin blog import: this plugin is the base handling blog importers:") and rev 1831 (68c0dc13d821), both by Goffi <goffi@goffi.org>.

#!/usr/bin/python
# -*- coding: utf-8 -*-

# SàT plugin for importing external blogs
# Copyright (C) 2009-2016 Jérôme Poisson (goffi@goffi.org)

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.

# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

from sat.core.i18n import _
from sat.core.constants import Const as C
from sat.core.log import getLogger
log = getLogger(__name__)
from twisted.internet import defer
from twisted.web import client as web_client
from twisted.words.xish import domish
from sat.core import exceptions
from sat.tools import xml_tools
import collections
import os
import os.path
import tempfile
import urlparse
import uuid


PLUGIN_INFO = {
    "name": "blog import",
    "import_name": "BLOG_IMPORT",
    "type": C.PLUG_TYPE_BLOG,
    "dependencies": ["XEP-0060", "XEP-0277", "TEXT-SYNTAXES", "UPLOAD"],
    "main": "BlogImportPlugin",
    "handler": "no",
    "description": _(u"""Blog import management:
This plugin manages the different blog importers which can register to it, and handles generic importing tasks.""")
}

OPT_HOST = 'host'
OPT_UPLOAD_IMAGES = 'upload_images'
OPT_UPLOAD_IGNORE_HOST = 'upload_ignore_host'
OPT_IGNORE_TLS = 'ignore_tls_errors'
URL_REDIRECT_PREFIX = 'url_redirect_'
BOOL_OPTIONS = (OPT_UPLOAD_IMAGES, OPT_IGNORE_TLS)


BlogImporter = collections.namedtuple('BlogImporter', ('callback', 'short_desc', 'long_desc'))


class BlogImportPlugin(object):

    def __init__(self, host):
        log.info(_("plugin Blog Import initialization"))
        self.host = host
        self._importers = {}
        self._u = host.plugins['UPLOAD']
        self._p = host.plugins['XEP-0060']
        self._m = host.plugins['XEP-0277']
        self._s = self.host.plugins['TEXT-SYNTAXES']
        host.bridge.addMethod("blogImport", ".plugin", in_sign='ssa{ss}ss', out_sign='s', method=self._blogImport, async=True)
        host.bridge.addMethod("blogImportList", ".plugin", in_sign='', out_sign='a(ss)', method=self.listImporters)
        host.bridge.addMethod("blogImportDesc", ".plugin", in_sign='s', out_sign='(ss)', method=self.getDescription)

    def getProgress(self, progress_id, profile):
        client = self.host.getClient(profile)
        return client._blogImport_progress[progress_id]

    def listImporters(self):
        importers = self._importers.keys()
        importers.sort()
        # iterate over the sorted names so the list is returned in a stable order
        return [(name, self._importers[name].short_desc) for name in importers]

    def getDescription(self, name):
        """Return importer short and long descriptions

        @param name(unicode): blog importer name
        @return (tuple[unicode,unicode]): short and long description
        """
        try:
            importer = self._importers[name]
        except KeyError:
            raise exceptions.NotFound(u"Blog importer not found [{}]".format(name))
        else:
            return importer.short_desc, importer.long_desc

    def _blogImport(self, name, location, options, pubsub_service='', profile=C.PROF_KEY_DEFAULT):
        client = self.host.getClient(profile)
        for option in BOOL_OPTIONS:
            try:
                options[option] = C.bool(options[option])
            except KeyError:
                pass
        return self.blogImport(client, name, location, options)

    @defer.inlineCallbacks
    def blogImport(self, client, name, location, options=None, pubsub_service=None):
        """Import a blog

        @param name(unicode): name of the blog importer
        @param location(unicode): location of the blog data to import
            can be a URL, a file path, or anything which makes sense
            check importer description for more details
        @param options(dict, None): extra options. Below are the generic options,
            blog importer can have specific ones. All options have unicode values
            generic options:
                - OPT_HOST (unicode): original host
                - OPT_UPLOAD_IMAGES (bool): upload images to XMPP server if True
                    see OPT_UPLOAD_IGNORE_HOST.
                    Default: True
                - OPT_UPLOAD_IGNORE_HOST (unicode): don't upload images from this host
                - OPT_IGNORE_TLS (bool): ignore TLS error for image upload.
                    Default: False
        @param pubsub_service(jid.JID, None): jid of the PubSub service where blog must be imported
            None to use profile's server
        @return (unicode): progress id
        """
        if options is None:
            options = {}
        else:
            for opt_name, opt_default in ((OPT_UPLOAD_IMAGES, True),
                                          (OPT_IGNORE_TLS, False)):
                # we want a filled options dict, with all empty or False values removed
                try:
                    value = options[opt_name]
                except KeyError:
                    if opt_default:
                        options[opt_name] = opt_default
                else:
                    if not value:
                        del options[opt_name]
        try:
            importer = self._importers[name]
        except KeyError:
            raise exceptions.NotFound(u"Importer [{}] not found".format(name))
        posts_data, posts_count = yield importer.callback(client, location, options)
        url_redirect = {}
        progress_id = unicode(uuid.uuid4())
        try:
            progress_data = client._blogImport_progress
        except AttributeError:
            progress_data = client._blogImport_progress = {}
        progress_data[progress_id] = {u'position': '0'}
        if posts_count is not None:
            progress_data[progress_id]['size'] = unicode(posts_count)
        metadata = {'name': u'{}: {}'.format(name, location),
                    'direction': 'out',
                    'type': 'BLOG_IMPORT'
                    }
        self.host.registerProgressCb(progress_id, self.getProgress, metadata, profile=client.profile)
        self.host.bridge.progressStarted(progress_id, metadata, client.profile)
        self._recursiveImport(client, posts_data, progress_id, options, url_redirect)
        defer.returnValue(progress_id)

    @defer.inlineCallbacks
    def _recursiveImport(self, client, posts_data, progress_id, options, url_redirect, service=None, node=None, depth=0):
        """Do the upload recursively

        @param posts_data(list): list of data as specified in [register]
        @param options(dict): import options
        @param url_redirect(dict): link between former posts and new items
        @param service(jid.JID, None): PubSub service to use
        @param node(unicode, None): PubSub node to use
        @param depth(int): level of recursion
        """
        for idx, data in enumerate(posts_data):
            # data checks/filters
            mb_data = data['blog']
            try:
                item_id = mb_data['id']
            except KeyError:
                item_id = mb_data['id'] = unicode(uuid.uuid4())

            try:
                # we keep the link between the old URL and the new blog item
                # so the user can redirect their former blog URLs
                old_uri = data['url']
            except KeyError:
                pass
            else:
                new_uri = url_redirect[old_uri] = self._p.getNodeURI(
                    service if service is not None else client.jid.userhostJID(),
                    node or self._m.namespace,
                    item_id)
                log.info(u"url link from {old} to {new}".format(
                    old=old_uri, new=new_uri))

            yield self.blogFilters(client, mb_data, options)

            # comments data
            if len(data['comments']) != 1:
                raise NotImplementedError(u"can't manage multiple comment links")
            allow_comments = C.bool(mb_data.get('allow_comments', C.BOOL_FALSE))
            if allow_comments:
parents: 1825
diff changeset
203 comments_service, comments_node = self._m.getCommentsService(client), self._m.getCommentsNode(item_id)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
204 mb_data['comments_service'] = comments_service
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
205 mb_data['comments_node'] = comments_node
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
206 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
207 if data['comments'][0]:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
208 raise exceptions.DataError(u"allow_comments set to False, but comments are there")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
209
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
210 # post upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
211 depth or log.debug(u"uploading item [{id}]: {title}".format(id=mb_data['id'], title=mb_data.get('title','')))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
212 yield self._m.send(mb_data, service, node, profile=client.profile)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
213
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
214 # comments upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
215 depth or log.debug(u"uploading comments")
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
216 if allow_comments:
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
217 yield self._recursiveImport(client, data['comments'][0], progress_id, options, url_redirect, service=comments_service, node=comments_node, depth=depth+1)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
218 if depth == 0:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
219 client._blogImport_progress[progress_id]['position'] = unicode(idx+1)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
220
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
221 if depth == 0:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
222 self.host.bridge.progressFinished(progress_id,
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
223 {u'{}{}'.format(URL_REDIRECT_PREFIX, old): new for old, new in url_redirect.iteritems()},
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
224 client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
225 self.host.removeProgressCb(progress_id, client.profile)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
226 del client._blogImport_progress[progress_id]
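The redirect metadata handed to progressFinished above can be sketched in isolation. This is a minimal Python 3 illustration (the plugin itself is Python 2), not the plugin's code: the value of URL_REDIRECT_PREFIX and the helper name build_redirect_metadata are assumptions made for the example, and the URI is only an illustrative string.

```python
# Hypothetical value: the real constant is defined elsewhere in the plugin.
URL_REDIRECT_PREFIX = "url_redirect_"


def build_redirect_metadata(url_redirect, prefix=URL_REDIRECT_PREFIX):
    """Prefix every former URL so it can travel in a flat metadata dict.

    url_redirect maps a former blog URL to the URI of the imported item.
    """
    return {"{}{}".format(prefix, old): new for old, new in url_redirect.items()}


if __name__ == "__main__":
    redirects = {"/2016/01/first-post": "xmpp:pubsub.example.org?;node=blog;item=post-1"}
    for key, value in build_redirect_metadata(redirects).items():
        print(key, "->", value)
```

A frontend receiving this metadata can strip the prefix again to recover the old URL and generate its redirection rules.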
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
227
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
228 @defer.inlineCallbacks
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
229 def blogFilters(self, client, mb_data, options):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
230 """Apply filters according to options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
231
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
232 modify mb_data in place
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
233 @param mb_data(dict): microblog data of the item, as returned by importer callback
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
234 @param options(dict): dict as given in [blogImport]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
235 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
236 # FIXME: blog filters don't work on text content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
237 # TODO: text => XHTML conversion should handle links with <a/>
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
238 # filters can then be used by converting text to XHTML
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
239 if not options:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
240 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
241
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
242 # we want only XHTML content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
243 for prefix in ('content',): # a tuple is used, in case title needs to be added in the future
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
244 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
245 rich = mb_data['{}_rich'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
246 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
247 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
248 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
249 if '{}_xhtml'.format(prefix) in mb_data:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
250 raise exceptions.DataError(u"importer gave {prefix}_rich and {prefix}_xhtml at the same time, this is not allowed".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
251 # we convert rich syntax to XHTML here, so we can handle filters easily
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
252 converted = yield self._s.convert(rich, self._s.getCurrentSyntax(client.profile), safe=False)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
253 mb_data['{}_xhtml'.format(prefix)] = converted
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
254 del mb_data['{}_rich'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
255
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
256 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
257 mb_data['{}_text'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
258 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
259 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
260 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
261 if '{}_xhtml'.format(prefix) in mb_data:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
262 log.warning(u"{prefix}_text will be replaced by converted {prefix}_xhtml, so filters can be handled".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
263 del mb_data['{}_text'.format(prefix)]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
264 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
265 log.warning(u"importer gave a text {prefix}, blog filters don't work on text {prefix}".format(prefix=prefix))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
266 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
267
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
268 # at this point, we have only XHTML version of content
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
269 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
270 top_elt = xml_tools.ElementParser()(mb_data['content_xhtml'], namespace=C.NS_XHTML)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
271 except domish.ParserError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
272 # we clean the XML and try our luck again
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
273 cleaned = yield self._s.cleanXHTML(mb_data['content_xhtml'])
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
274 top_elt = xml_tools.ElementParser()(cleaned, namespace=C.NS_XHTML)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
275 opt_host = options.get(OPT_HOST)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
276 if opt_host:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
277 # we normalise the domain
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
278 parsed_host = urlparse.urlsplit(opt_host)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
279 opt_host = urlparse.urlunsplit((parsed_host.scheme or 'http', parsed_host.netloc or parsed_host.path, '', '', ''))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
280
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
281 tmp_dir = tempfile.mkdtemp()
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
282 try:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
283 for img_elt in xml_tools.findAll(top_elt, ['img']):
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
284 yield self.imgFilters(client, img_elt, options, opt_host, tmp_dir)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
285 finally:
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
286 os.rmdir(tmp_dir) # XXX: tmp_dir should be empty, or something went wrong
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
287
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
288 # we now replace the content with filtered one
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
289 mb_data['content_xhtml'] = top_elt.toXml()
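The host normalisation done in blogFilters can be tried on its own. The following is a minimal sketch under stated assumptions: it uses Python 3's urllib.parse where the plugin uses Python 2's urlparse, and the function name normalise_host is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_host(opt_host):
    """Reduce a user-supplied host option to "scheme://netloc".

    A bare domain such as "example.org" is parsed as a path, so the path
    is used as netloc when no netloc is found; the scheme defaults to http.
    """
    parsed = urlsplit(opt_host)
    return urlunsplit((parsed.scheme or "http", parsed.netloc or parsed.path, "", "", ""))


if __name__ == "__main__":
    print(normalise_host("example.org"))
    print(normalise_host("https://example.org/blog?page=2"))
```

Path, query and fragment are dropped on purpose, so that the result can serve as a base for joining host-less image paths.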
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
290
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
291 @defer.inlineCallbacks
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
292 def imgFilters(self, client, img_elt, options, opt_host, tmp_dir):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
293 """Filters handling images
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
294
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
295 URLs without host are fixed (if possible)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
296 according to options, images are uploaded to XMPP server
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
297 @param img_elt(domish.Element): <img/> element to handle
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
298 @param options(dict): filters options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
299 @param opt_host(unicode): normalised host given in options
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
300 @param tmp_dir(str): path to temp directory
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
301 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
302 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
303 url = img_elt['src']
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
304 if url[0] == u'/':
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
305 if not opt_host:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
306 log.warning(u"host was not specified, we can't deal with src without host ({url}) and have to ignore the following <img/>:\n{xml}"
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
307 .format(url=url, xml=img_elt.toXml()))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
308 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
309 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
310 url = urlparse.urljoin(opt_host, url)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
311 filename = url.rsplit('/',1)[-1].strip()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
312 if not filename:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
313 raise KeyError
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
314 except (KeyError, IndexError):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
315 log.warning(u"ignoring invalid img element: {}".format(img_elt.toXml()))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
316 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
317
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
318 # we change the url for the normalized one
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
319 img_elt['src'] = url
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
320
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
321 if options.get(OPT_UPLOAD_IMAGES, False):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
322 # upload is requested
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
323 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
324 ignore_host = options[OPT_UPLOAD_IGNORE_HOST]
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
325 except KeyError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
326 pass
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
327 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
328 # host is the ignored one, we skip
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
329 parsed_url = urlparse.urlsplit(url)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
330 if ignore_host in parsed_url.hostname:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
331 log.info(u"Don't upload image at {url} because of {opt} option".format(
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
332 url=url, opt=OPT_UPLOAD_IGNORE_HOST))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
333 return
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
334
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
335 # we download images and re-upload them via XMPP
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
336 tmp_file = os.path.join(tmp_dir, filename).encode('utf-8')
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
337 upload_options = {'ignore_tls_errors': options.get(OPT_IGNORE_TLS, False)}
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
338
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
339 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
340 yield web_client.downloadPage(url.encode('utf-8'), tmp_file)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
341 filename = filename.replace(u'%', u'_') # FIXME: tmp workaround for a bug in prosody http upload
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
342 dummy, download_d = yield self._u.upload(client, tmp_file, filename, options=upload_options)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
343 download_url = yield download_d
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
344 except Exception as e:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
345 log.warning(u"can't download or upload image at {url}: {reason}".format(url=url, reason=e))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
346 else:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
347 img_elt['src'] = download_url
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
348
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
349 try:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
350 os.unlink(tmp_file)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
351 except OSError:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
352 pass
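The first part of imgFilters (fixing a host-less src, then deriving a filename) can be sketched separately from the download and upload steps. This is a Python 3 illustration, not the plugin's code: resolve_img_src is a hypothetical helper, and returning None stands in for the "log a warning and ignore the element" branches.

```python
from urllib.parse import urljoin


def resolve_img_src(src, opt_host):
    """Return (absolute_url, filename), or None if the image must be ignored.

    src is the value of the <img/> "src" attribute; opt_host is the
    normalised host from the options, or None when it was not given.
    """
    if not src:
        return None  # missing or empty src: invalid element
    if src.startswith("/"):
        if not opt_host:
            return None  # host-less path and no host to complete it with
        src = urljoin(opt_host, src)
    filename = src.rsplit("/", 1)[-1].strip()
    if not filename:
        return None  # URL ends with "/": no usable filename
    return src, filename


if __name__ == "__main__":
    print(resolve_img_src("/img/a.png", "http://example.org"))
    print(resolve_img_src("/img/a.png", None))
```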
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
353
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
354 def register(self, name, callback, short_desc='', long_desc=''):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
355 """Register a blogImport method
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
356
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
357 @param name(unicode): unique importer name, should indicate the blogging software it handles and always be lowercase
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
358 @param callback(callable): method to call:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
359 the signature must be (client, location, options) (cf. [blogImport])
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
360 the importer must return a tuple with (posts_data, posts_count)
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
361
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
362 posts_data is an iterable of dict which must have the following keys:
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
363 'blog' (dict): microblog data of the blog post (cf. http://wiki.goffi.org/wiki/Bridge_API_-_Microblogging/en)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
364 the importer MUST NOT create node or call XEP-0277 plugin itself
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
365 'comments*' key MUST NOT be used in this microblog_data, see below for comments
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
366 It is recommended to use a unique id in the "id" key which is constant per blog item,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
367 so if the import fails, a new import will overwrite the failed items and avoid duplicates.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
368
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
369 'comments' (list[list[dict]],None): Dictionaries must have the same keys as main item (i.e. 'blog' and 'comments')
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
370 a list of lists is used because XEP-0277 can handle several comments nodes,
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
371 but in most cases, there will be only one item in the first list (something like [[{comment1_data},{comment2_data}, ...]])
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
372 blog['allow_comments'] must be True if there is any comment, and False (or not present) if comments are not allowed.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
373 If allow_comments is False and some comments are present, an exceptions.DataError will be raised
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
374 each dict MAY optionally have the following keys:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
375 'url' (unicode): former url of the post (only the path, without host part)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
376 if present, the association to the new path will be displayed to the user, so they can set up redirections if necessary
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
377
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
378 posts_count (int, None) indicates the total number of posts (without comments)
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
379 useful to display a progress indicator when the iterator is a generator
1831
68c0dc13d821 plugin blog import, XEP-0277: progress + redirect:
Goffi <goffi@goffi.org>
parents: 1825
diff changeset
380 use None if you can't guess the total number of blog posts
1825
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
381 @param short_desc(unicode): one line description of the importer
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
382 @param long_desc(unicode): long description of the importer, its options, etc.
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
383 """
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
384 name = name.lower()
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
385 if name in self._importers:
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
386 raise exceptions.ConflictError(u"A blog importer with the name {} already exists".format(name))
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
387 self._importers[name] = BlogImporter(callback, short_desc, long_desc)
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
388
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
389 def unregister(self, name):
4e51f21c687f plugin blog import: this plugin is the base handling blog importers:
Goffi <goffi@goffi.org>
parents:
diff changeset
390 del self._importers[name.lower()]
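
The item structure described in the register docstring can be illustrated with a small standalone sketch. It is not part of plugin_blog_import.py: DataError is a local stand-in for exceptions.DataError, check_item is a hypothetical helper, and the 'id' values inside 'blog' are placeholder fields chosen only for the example. It assumes that an item is a dict with a 'blog' dict and a 'comments' entry that is either None or a list of lists of items shaped like the main item.

```python
class DataError(Exception):
    """Local stand-in for exceptions.DataError (assumption, not the real class)."""


def check_item(item):
    """Check one imported item against the rules stated in the docstring.

    Raises DataError if comments are present while blog['allow_comments']
    is False or missing. Returns True otherwise.
    """
    blog = item['blog']
    # 'comments' may be None; treat that the same as "no comments nodes"
    comments = item.get('comments') or []
    has_comments = any(len(node) > 0 for node in comments)
    if has_comments and not blog.get('allow_comments', False):
        raise DataError(u"comments are present but allow_comments is not True")
    # each inner list is one comments node; comments share the main item's keys
    for node in comments:
        for comment in node:
            check_item(comment)
    return True


# Hypothetical item: one post with a single comments node holding two comments.
post = {
    'blog': {'id': u'post-1', 'allow_comments': True},
    'comments': [[
        {'blog': {'id': u'comment-1'}, 'comments': None},
        {'blog': {'id': u'comment-2'}, 'comments': None},
    ]],
    'url': u'/2016/01/first-post',  # former path only, without the host part
}

# Same shape, but allow_comments is missing although a comment is present.
bad_post = {
    'blog': {'id': u'post-2'},
    'comments': [[{'blog': {'id': u'comment-3'}, 'comments': None}]],
}
```

Calling check_item(post) returns True, while check_item(bad_post) raises DataError, which mirrors the rule that allow_comments must be True whenever comments are present.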