annotate src/plugins/plugin_blog_import_dokuwiki.py @ 2371:2268df8c99bf

plugin pubsub schema: values handling: - "schema" attribute can now be specified directly as a data_form.Form (reparsing it will be avoided in this case) - new "deserialise" parameter allows deserialisation from unicode strings in sendDataFormItem - values can now be an iterable or directly the value to use (it will be put in a list automatically in the later case) - "id" is now put directly in XMLUI's "id" field instea of "_id"
author Goffi <goffi@goffi.org>
date Fri, 06 Oct 2017 10:55:51 +0200
parents 33c8c4973743
children 8b37a62336c3
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
1934
2daf7b4c6756 use of /usr/bin/env instead of /usr/bin/python in shebang
Goffi <goffi@goffi.org>
parents: 1853
diff changeset
1 #!/usr/bin/env python2
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
2 # -*- coding: utf-8 -*-
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
3
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
4 # SàT plugin to import dokuwiki blogs
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
5 # Copyright (C) 2009-2016 Jérôme Poisson (goffi@goffi.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
6 # Copyright (C) 2013-2016 Adrien Cossa (souliane@mailoo.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
7
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
8 # This program is free software: you can redistribute it and/or modify
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
9 # it under the terms of the GNU Affero General Public License as published by
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
10 # the Free Software Foundation, either version 3 of the License, or
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
11 # (at your option) any later version.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
12
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
13 # This program is distributed in the hope that it will be useful,
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
14 # but WITHOUT ANY WARRANTY; without even the implied warranty of
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
16 # GNU Affero General Public License for more details.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
17
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
18 # You should have received a copy of the GNU Affero General Public License
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
19 # along with this program. If not, see <http://www.gnu.org/licenses/>.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
20
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
21 from sat.core.i18n import _, D_
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
22 from sat.core.constants import Const as C
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
23 from sat.core.log import getLogger
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
24 log = getLogger(__name__)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
25 from sat.core import exceptions
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
26 from sat.tools import xml_tools
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
27 from twisted.internet import threads
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
28 from collections import OrderedDict
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
29 import calendar
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
30 import urllib
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
31 import urlparse
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
32 import tempfile
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
33 import re
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
34 import time
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
35 import os.path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
36 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
37 from dokuwiki import DokuWiki, DokuWikiError # this is a new dependency
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
38 except ImportError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
39 raise exceptions.MissingModule(u'Missing module dokuwiki, please install it with "pip install dokuwiki"')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
40 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
41 from PIL import Image # this is already needed by plugin XEP-0054
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
42 except:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
43 raise exceptions.MissingModule(u"Missing module pillow, please download/install it from https://python-pillow.github.io")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
44
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
45 PLUGIN_INFO = {
2145
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
46 C.PI_NAME: "Dokuwiki import",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
47 C.PI_IMPORT_NAME: "IMPORT_DOKUWIKI",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
48 C.PI_TYPE: C.PLUG_TYPE_BLOG,
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
49 C.PI_DEPENDENCIES: ["BLOG_IMPORT"],
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
50 C.PI_MAIN: "DokuwikiImport",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
51 C.PI_HANDLER: "no",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
52 C.PI_DESCRIPTION: _("""Blog importer for Dokuwiki blog engine.""")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
53 }
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
54
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
55 SHORT_DESC = D_(u"import posts from Dokuwiki blog engine")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
56
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
57 LONG_DESC = D_(u"""This importer handle Dokuwiki blog engine.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
58
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
59 To use it, you need an admin access to a running Dokuwiki website
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
60 (local or on the Internet). The importer retrieves the data using
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
61 the XMLRPC Dokuwiki API.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
62
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
63 You can specify a namespace (that could be a namespace directory
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
64 or a single post) or leave it empty to use the root namespace "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
65 and import all the posts.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
66
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
67 You can specify a new media repository to modify the internal
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
68 media links and make them point to the URL of your choice, but
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
69 note that the upload is not done automatically: a temporary
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
70 directory will be created on your local drive and you will
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
71 need to upload it yourself to your repository via SSH or FTP.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
72
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
73 Following options are recognized:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
74
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
75 location: DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
76 user: DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
77 passwd: DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
78 namespace: DokuWiki namespace to import (default: root namespace "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
79 media_repo: URL to the new remote media repository (default: none)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
80 limit: maximal number of posts to import (default: 100)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
81
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
82 Example of usage (with jp frontend):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
83
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
84 jp import dokuwiki -p dave --pwd xxxxxx --connect
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
85 http://127.0.1.1 -o user souliane -o passwd qwertz
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
86 -o namespace public:2015:10
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
87 -o media_repo http://media.diekulturvermittlung.at
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
88
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
89 This retrieves the 100 last blog posts from http://127.0.1.1 that
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
90 are inside the namespace "public:2015:10" using the Dokuwiki user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
91 "souliane", and it imports them to sat profile dave's microblog node.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
92 Internal Dokuwiki media that were hosted on http://127.0.1.1 are now
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
93 pointing to http://media.diekulturvermittlung.at.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
94 """)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
95 DEFAULT_MEDIA_REPO = ""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
96 DEFAULT_NAMESPACE = "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
97 DEFAULT_LIMIT = 100 # you might get a DBUS timeout (no reply) if it lasts too long
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
98
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
99
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
100 class Importer(DokuWiki):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
101
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
102 def __init__(self, url, user, passwd, media_repo=DEFAULT_MEDIA_REPO, limit=DEFAULT_LIMIT):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
103 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
104
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
105 @param url (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
106 @param user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
107 @param passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
108 @param media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
109 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
110 DokuWiki.__init__(self, url, user, passwd)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
111 self.url = url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
112 self.media_repo = media_repo
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
113 self.temp_dir = tempfile.mkdtemp() if self.media_repo else None
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
114 self.limit = limit
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
115 self.posts_data = OrderedDict()
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
116
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
117 def getPostId(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
118 """Return a unique and constant post id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
119
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
120 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
121 @return (unicode): post unique item id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
122 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
123 return unicode(post['id'])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
124
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
125 def getPostUpdated(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
126 """Return the update date.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
127
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
128 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
129 @return (unicode): update date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
130 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
131 return unicode(post['mtime'])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
132
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
133 def getPostPublished(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
134 """Try to parse the date from the message ID, else use "mtime".
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
135
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
136 The date can be extracted if the message ID looks like one of:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
137 - namespace:YYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
138 - namespace:YYYYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
139 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
140 @return (unicode): publication date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
141 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
142 id_, default = unicode(post["id"]), unicode(post["mtime"])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
143 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
144 date = id_.split(":")[-1].split("_")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
145 except KeyError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
146 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
147 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
148 time_struct = time.strptime(date, "%y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
149 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
150 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
151 time_struct = time.strptime(date, "%Y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
152 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
153 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
154 return unicode(calendar.timegm(time_struct))
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
155
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
156 def processPost(self, post, profile_jid):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
157 """Process a single page.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
158
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
159 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
160 @param profile_jid
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
161 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
162 # get main information
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
163 id_ = self.getPostId(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
164 updated = self.getPostUpdated(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
165 published = self.getPostPublished(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
166
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
167 # manage links
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
168 backlinks = self.pages.backlinks(id_)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
169 for link in self.pages.links(id_):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
170 if link["type"] != "extern":
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
171 assert link["type"] == "local"
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
172 page = link["page"]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
173 backlinks.append(page[1:] if page.startswith(":") else page)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
174
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
175 self.pages.get(id_)
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
176 content_xhtml = self.processContent(self.pages.html(id_), backlinks, profile_jid)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
177
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
178 # XXX: title is already in content_xhtml and difficult to remove, so leave it
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
179 # title = content.split("\n")[0].strip(u"\ufeff= ")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
180
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
181 # build the extra data dictionary
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
182 mb_data = {"id": id_,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
183 "published": published,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
184 "updated": updated,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
185 "author": profile_jid.user,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
186 # "content": content, # when passed, it is displayed in Libervia instead of content_xhtml
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
187 "content_xhtml": content_xhtml,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
188 # "title": title,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
189 "allow_comments": "true",
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
190 }
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
191
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
192 # find out if the message access is public or restricted
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
193 namespace = id_.split(":")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
194 if namespace and namespace.lower() not in ("public", "/"):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
195 mb_data["group"] = namespace # roster group must exist
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
196
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
197 self.posts_data[id_] = {'blog': mb_data, 'comments':[[]]}
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
198
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
199 def process(self, client, namespace=DEFAULT_NAMESPACE):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
200 """Process a namespace or a single page.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
201
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
202 @param namespace (unicode): DokuWiki namespace (or page) to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
203 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
204 profile_jid = client.jid
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
205 log.info("Importing data from DokuWiki %s" % self.version)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
206 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
207 pages_list = self.pages.list(namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
208 except DokuWikiError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
209 log.warning('Could not list Dokuwiki pages: please turn the "display_errors" setting to "Off" in the php.ini of the webserver hosting DokuWiki.')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
210 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
211
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
212 if not pages_list: # namespace is actually a page?
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
213 names = namespace.split(":")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
214 real_namespace = ":".join(names[0:-1])
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
215 pages_list = self.pages.list(real_namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
216 pages_list = [page for page in pages_list if page["id"] == namespace]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
217 namespace = real_namespace
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
218
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
219 count = 0
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
220 for page in pages_list:
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
221 self.processPost(page, profile_jid)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
222 count += 1
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
223 if count >= self.limit :
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
224 break
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
225
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
226 return (self.posts_data.itervalues(), len(self.posts_data))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
227
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
228 def processContent(self, text, backlinks, profile_jid):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
229 """Do text substitutions and file copy.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
230
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
231 @param text (unicode): message content
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
232 @param backlinks (list[unicode]): list of backlinks
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
233 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
234 text = text.strip(u"\ufeff") # this is at the beginning of the file (BOM)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
235
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
236 for backlink in backlinks:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
237 src = '/doku.php?id=%s"' % backlink
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
238 tgt = '/blog/%s/%s" target="#"' % (profile_jid.user, backlink)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
239 text = text.replace(src, tgt)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
240
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
241 subs = {}
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
242
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
243 link_pattern = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
244 for tag in re.finditer(link_pattern, text):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
245 type_, attr, link = tag.group(1), tag.group(2), tag.group(3)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
246 assert (type_ == "img" and attr == "src") or (type_ == "a" and attr == "href")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
247 if re.match(r"^\w*://", link): # absolute URL to link directly
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
248 continue
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
249 if self.media_repo:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
250 self.moveMedia(link, subs)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
251 elif link not in subs:
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
252 subs[link] = urlparse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
253
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
254 for url, new_url in subs.iteritems():
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
255 text = text.replace(url, new_url)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
256 return text
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
257
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
258 def moveMedia(self, link, subs):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
259 """Move a media from the DokuWiki host to the new repository.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
260
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
261 This also updates the hyperlinks to internal media files.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
262 @param link (unicode): media link
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
263 @param subs (dict): substitutions data
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
264 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
265 url = urlparse.urljoin(self.url, link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
266 user_media = re.match(r"(/lib/exe/\w+.php\?)(.*)", link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
267 thumb_width = None
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
268
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
269 if user_media: # media that has been added by the user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
270 params = urlparse.parse_qs(urlparse.urlparse(url).query)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
271 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
272 media = params["media"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
273 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
274 log.warning("No media found in fetch URL: %s" % user_media.group(2))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
275 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
276 if re.match(r"^\w*://", media): # external URL to link directly
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
277 subs[link] = media
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
278 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
279 try: # create thumbnail
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
280 thumb_width = params["w"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
281 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
282 pass
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
283
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
284 filename = media.replace(":", "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
285 # XXX: avoid "precondition failed" error (only keep the media parameter)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
286 url = urlparse.urljoin(self.url, "/lib/exe/fetch.php?media=%s" % media)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
287
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
288 elif link.startswith("/lib/plugins/"):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
289 # other link added by a plugin or something else
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
290 filename = link[13:]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
291 else: # fake alert... there's no media (or we don't handle it yet)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
292 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
293
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
294 filepath = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
295 self.downloadMedia(url, filepath)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
296
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
297 if thumb_width:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
298 filename = os.path.join("thumbs", thumb_width, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
299 thumbnail = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
300 self.createThumbnail(filepath, thumbnail, thumb_width)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
301
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
302 new_url = os.path.join(self.media_repo, filename)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
303 subs[link] = new_url
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
304
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
305 def downloadMedia(self, source, dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
306 """Copy media to localhost.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
307
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
308 @param source (unicode): source url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
309 @param dest (unicode): target path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
310 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
311 dirname = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
312 if not os.path.exists(dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
313 if not os.path.exists(dirname):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
314 os.makedirs(dirname)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
315 urllib.urlretrieve(source, dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
316 log.debug("DokuWiki media file copied to %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
317
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
318 def createThumbnail(self, source, dest, width):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
319 """Create a thumbnail.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
320
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
321 @param source (unicode): source file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
322 @param dest (unicode): destination file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
323 @param width (unicode): thumbnail's width
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
324 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
325 thumb_dir = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
326 if not os.path.exists(thumb_dir):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
327 os.makedirs(thumb_dir)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
328 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
329 im = Image.open(source)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
330 im.thumbnail((width, int(width) * im.size[0] / im.size[1]))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
331 im.save(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
332 log.debug("DokuWiki media thumbnail created: %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
333 except IOError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
334 log.error("Cannot create DokuWiki media thumbnail %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
335
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
336
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
337
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
338 class DokuwikiImport(object):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
339
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
340 def __init__(self, host):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
341 log.info(_("plugin Dokuwiki Import initialization"))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
342 self.host = host
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
343 self._blog_import = host.plugins['BLOG_IMPORT']
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
344 self._blog_import.register('dokuwiki', self.DkImport, SHORT_DESC, LONG_DESC)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
345
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
346 def DkImport(self, client, location, options=None):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
347 """Import from DokuWiki to PubSub
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
348
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
349 @param location (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
350 @param options (dict, None): DokuWiki import parameters
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
351 - user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
352 - passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
353 - namespace (unicode): DokuWiki namespace to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
354 - media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
355 """
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
356 options[self._blog_import.OPT_HOST] = location
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
357 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
358 user = options["user"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
359 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
360 raise exceptions.DataError('parameter "user" is required')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
361 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
362 passwd = options["passwd"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
363 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
364 raise exceptions.DataError('parameter "passwd" is required')
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
365
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
366 opt_upload_images = options.get(self._blog_import.OPT_UPLOAD_IMAGES, None)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
367 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
368 media_repo = options["media_repo"]
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
369 if opt_upload_images:
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
370 options[self._blog_import.OPT_UPLOAD_IMAGES] = False # force using --no-images-upload
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
371 info_msg = _("DokuWiki media files will be *downloaded* to {temp_dir} - to finish the import you have to upload them *manually* to {media_repo}")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
372 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
373 media_repo = DEFAULT_MEDIA_REPO
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
374 if opt_upload_images:
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
375 info_msg = _("DokuWiki media files will be *uploaded* to the XMPP server. Hyperlinks to these media may not been updated though.")
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
376 else:
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
377 info_msg = _("DokuWiki media files will *stay* on {location} - some of them may be protected by DokuWiki ACL and will not be accessible.")
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
378
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
379 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
380 namespace = options["namespace"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
381 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
382 namespace = DEFAULT_NAMESPACE
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
383 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
384 limit = options["limit"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
385 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
386 limit = DEFAULT_LIMIT
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
387
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
388 dk_importer = Importer(location, user, passwd, media_repo, limit)
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
389 info_msg = info_msg.format(temp_dir=dk_importer.temp_dir, media_repo=media_repo, location=location)
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
390 self.host.actionNew({'xmlui': xml_tools.note(info_msg).toXml()}, profile=client.profile)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
391 d = threads.deferToThread(dk_importer.process, client, namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
392 return d