annotate src/plugins/plugin_blog_import_dokuwiki.py @ 2307:8fa7edd0da24

plugin Pubsub Hook: first draft: This new plugin allows attaching an external action to a Pubsub event (i.e. a notification). A hook can be persistent across restarts, or temporary (it will be deleted on profile disconnection). Only Python files are handled for now. In the future, it may make sense to move hooks into a generic plugin which could be used by ad-hoc commands, messages, pubsub, etc.
author Goffi <goffi@goffi.org>
date Wed, 05 Jul 2017 15:05:47 +0200
parents 33c8c4973743
children 8b37a62336c3
rev   line source
1934
2daf7b4c6756 use of /usr/bin/env instead of /usr/bin/python in shebang
Goffi <goffi@goffi.org>
parents: 1853
diff changeset
1 #!/usr/bin/env python2
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
2 # -*- coding: utf-8 -*-
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
3
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
4 # SàT plugin to import dokuwiki blogs
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
5 # Copyright (C) 2009-2016 Jérôme Poisson (goffi@goffi.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
6 # Copyright (C) 2013-2016 Adrien Cossa (souliane@mailoo.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
7
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
8 # This program is free software: you can redistribute it and/or modify
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
9 # it under the terms of the GNU Affero General Public License as published by
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
10 # the Free Software Foundation, either version 3 of the License, or
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
11 # (at your option) any later version.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
12
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
13 # This program is distributed in the hope that it will be useful,
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
14 # but WITHOUT ANY WARRANTY; without even the implied warranty of
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
16 # GNU Affero General Public License for more details.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
17
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
18 # You should have received a copy of the GNU Affero General Public License
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
19 # along with this program. If not, see <http://www.gnu.org/licenses/>.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
20
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
21 from sat.core.i18n import _, D_
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
22 from sat.core.constants import Const as C
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
23 from sat.core.log import getLogger
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
24 log = getLogger(__name__)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
25 from sat.core import exceptions
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
26 from sat.tools import xml_tools
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
27 from twisted.internet import threads
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
28 from collections import OrderedDict
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
29 import calendar
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
30 import urllib
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
31 import urlparse
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
32 import tempfile
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
33 import re
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
34 import time
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
35 import os.path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
36 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
37 from dokuwiki import DokuWiki, DokuWikiError # this is a new dependency
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
38 except ImportError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
39 raise exceptions.MissingModule(u'Missing module dokuwiki, please install it with "pip install dokuwiki"')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
40 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
41 from PIL import Image # this is already needed by plugin XEP-0054
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
42 except ImportError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
43 raise exceptions.MissingModule(u"Missing module pillow, please download/install it from https://python-pillow.github.io")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
44
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
45 PLUGIN_INFO = {
2145
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
46 C.PI_NAME: "Dokuwiki import",
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
47 C.PI_IMPORT_NAME: "IMPORT_DOKUWIKI",
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
48 C.PI_TYPE: C.PLUG_TYPE_BLOG,
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
49 C.PI_DEPENDENCIES: ["BLOG_IMPORT"],
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
50 C.PI_MAIN: "DokuwikiImport",
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
51 C.PI_HANDLER: "no",
33c8c4973743 core (plugins): added missing constants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
52 C.PI_DESCRIPTION: _("""Blog importer for Dokuwiki blog engine.""")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
53 }
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
54
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
55 SHORT_DESC = D_(u"import posts from Dokuwiki blog engine")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
56
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
57 LONG_DESC = D_(u"""This importer handles the Dokuwiki blog engine.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
58
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
59 To use it, you need admin access to a running Dokuwiki website
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
60 (local or on the Internet). The importer retrieves the data using
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
61 the XMLRPC Dokuwiki API.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
62
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
63 You can specify a namespace (that could be a namespace directory
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
64 or a single post) or leave it empty to use the root namespace "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
65 and import all the posts.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
66
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
67 You can specify a new media repository to modify the internal
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
68 media links and make them point to the URL of your choice, but
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
69 note that the upload is not done automatically: a temporary
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
70 directory will be created on your local drive and you will
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
71 need to upload it yourself to your repository via SSH or FTP.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
72
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
73 The following options are recognized:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
74
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
75 location: DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
76 user: DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
77 passwd: DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
78 namespace: DokuWiki namespace to import (default: root namespace "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
79 media_repo: URL to the new remote media repository (default: none)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
80 limit: maximal number of posts to import (default: 100)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
81
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
82 Example of usage (with jp frontend):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
83
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
84 jp import dokuwiki -p dave --pwd xxxxxx --connect
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
85 http://127.0.1.1 -o user souliane -o passwd qwertz
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
86 -o namespace public:2015:10
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
87 -o media_repo http://media.diekulturvermittlung.at
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
88
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
89 This retrieves the 100 last blog posts from http://127.0.1.1 that
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
90 are inside the namespace "public:2015:10" using the Dokuwiki user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
91 "souliane", and it imports them to sat profile dave's microblog node.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
92 Internal Dokuwiki media that were hosted on http://127.0.1.1 are now
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
93 pointing to http://media.diekulturvermittlung.at.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
94 """)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
95 DEFAULT_MEDIA_REPO = ""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
96 DEFAULT_NAMESPACE = "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
97 DEFAULT_LIMIT = 100 # you might get a DBUS timeout (no reply) if it lasts too long
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
98
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
99
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
100 class Importer(DokuWiki):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
101
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
102 def __init__(self, url, user, passwd, media_repo=DEFAULT_MEDIA_REPO, limit=DEFAULT_LIMIT):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
103 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
104
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
105 @param url (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
106 @param user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
107 @param passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
108 @param media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
109 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
110 DokuWiki.__init__(self, url, user, passwd)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
111 self.url = url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
112 self.media_repo = media_repo
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
113 self.temp_dir = tempfile.mkdtemp() if self.media_repo else None
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
114 self.limit = limit
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
115 self.posts_data = OrderedDict()
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
116
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
117 def getPostId(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
118 """Return a unique and constant post id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
119
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
120 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
121 @return (unicode): post unique item id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
122 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
123 return unicode(post['id'])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
124
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
125 def getPostUpdated(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
126 """Return the update date.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
127
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
128 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
129 @return (unicode): update date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
130 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
131 return unicode(post['mtime'])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
132
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
133 def getPostPublished(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
134 """Try to parse the date from the message ID, else use "mtime".
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
135
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
136 The date can be extracted if the message ID looks like one of:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
137 - namespace:YYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
138 - namespace:YYYYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
139 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
140 @return (unicode): publication date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
141 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
142 id_, default = unicode(post["id"]), unicode(post["mtime"])
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
143 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
144 date = id_.split(":")[-1].split("_")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
145 except KeyError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
146 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
147 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
148 time_struct = time.strptime(date, "%y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
149 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
150 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
151 time_struct = time.strptime(date, "%Y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
152 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
153 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
154 return unicode(calendar.timegm(time_struct))
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
155
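The date fallback described in the getPostPublished docstring above can be tried on its own. What follows is a minimal standalone sketch, not the plugin's code: the function name published_from_id and the sample page IDs are hypothetical, and it is written for Python 3 (str instead of unicode), whereas the annotated file targets Python 2. It assumes, as the method does, that the date is the part of the last ID segment before the first underscore.

```python
import calendar
import time


def published_from_id(page_id, mtime):
    """Return a Unix timestamp (as text) parsed from the page ID, else mtime.

    Hypothetical helper mirroring the logic of getPostPublished.
    """
    # Keep the last namespace segment, then the part before the first "_".
    date = page_id.split(":")[-1].split("_")[0]
    # Try the two-digit-year format first, then the four-digit-year one.
    for fmt in ("%y%m%d", "%Y%m%d"):
        try:
            time_struct = time.strptime(date, fmt)
        except ValueError:
            continue
        # timegm interprets the parsed date as UTC midnight.
        return str(calendar.timegm(time_struct))
    # No recognizable date in the ID: fall back to the modification time.
    return str(mtime)


short_form = published_from_id("public:2015:10:151020_some_title", 1)
long_form = published_from_id("public:2015:10:20151020_some_title", 1)
fallback = published_from_id("public:2015:10:some_title", 1445000000)
```

Both ID styles should resolve to the same timestamp, and an ID without a leading date should return the supplied mtime unchanged.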
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
156 def processPost(self, post, profile_jid):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
157 """Process a single page.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
158
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
159 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
160 @param profile_jid
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
161 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
162 # get main information
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
163 id_ = self.getPostId(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
164 updated = self.getPostUpdated(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
165 published = self.getPostPublished(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
166
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
167 # manage links
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
168 backlinks = self.pages.backlinks(id_)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
169 for link in self.pages.links(id_):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
170 if link["type"] != "extern":
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
171 assert link["type"] == "local"
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
172 page = link["page"]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
173 backlinks.append(page[1:] if page.startswith(":") else page)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
174
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
175 self.pages.get(id_)
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
176 content_xhtml = self.processContent(self.pages.html(id_), backlinks, profile_jid)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
177
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
178 # XXX: title is already in content_xhtml and difficult to remove, so leave it
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
179 # title = content.split("\n")[0].strip(u"\ufeff= ")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
180
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
181 # build the extra data dictionary
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
182 mb_data = {"id": id_,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
183 "published": published,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
184 "updated": updated,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
185 "author": profile_jid.user,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
186 # "content": content, # when passed, it is displayed in Libervia instead of content_xhtml
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
187 "content_xhtml": content_xhtml,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
188 # "title": title,
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
189 "allow_comments": "true",
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
190 }
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
191
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
192 # find out if the message access is public or restricted
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
193 namespace = id_.split(":")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
194 if namespace and namespace.lower() not in ("public", "/"):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
195 mb_data["group"] = namespace # roster group must exist
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
196
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
197 self.posts_data[id_] = {'blog': mb_data, 'comments': [[]]}
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
198
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
199 def process(self, client, namespace=DEFAULT_NAMESPACE):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
200 """Process a namespace or a single page.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
201
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
202 @param namespace (unicode): DokuWiki namespace (or page) to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
203 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
204 profile_jid = client.jid
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
205 log.info("Importing data from DokuWiki %s" % self.version)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
206 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
207 pages_list = self.pages.list(namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
208 except DokuWikiError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
209 log.warning('Could not list DokuWiki pages: please turn the "display_errors" setting to "Off" in the php.ini of the webserver hosting DokuWiki.')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
210 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
211
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
212 if not pages_list: # namespace is actually a page?
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
213 names = namespace.split(":")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
214 real_namespace = ":".join(names[0:-1])
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
215 pages_list = self.pages.list(real_namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
216 pages_list = [page for page in pages_list if page["id"] == namespace]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
217 namespace = real_namespace
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
218
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
219 count = 0
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
220 for page in pages_list:
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
221 self.processPost(page, profile_jid)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
222 count += 1
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
223 if count >= self.limit:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
224 break
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
225
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
226 return (self.posts_data.itervalues(), len(self.posts_data))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
227
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
228 def processContent(self, text, backlinks, profile_jid):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
229 """Do text substitutions and file copy.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
230
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
231 @param text (unicode): message content
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
232 @param backlinks (list[unicode]): list of backlinks
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
233 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
234 text = text.strip(u"\ufeff") # this is at the beginning of the file (BOM)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
235
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
236 for backlink in backlinks:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
237 src = '/doku.php?id=%s"' % backlink
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
238 tgt = '/blog/%s/%s" target="#"' % (profile_jid.user, backlink)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
239 text = text.replace(src, tgt)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
240
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
241 subs = {}
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
242
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
243 link_pattern = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
244 for tag in re.finditer(link_pattern, text):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
245 type_, attr, link = tag.group(1), tag.group(2), tag.group(3)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
246 assert (type_ == "img" and attr == "src") or (type_ == "a" and attr == "href")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
247 if re.match(r"^\w*://", link): # absolute URL to link directly
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
248 continue
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
249 if self.media_repo:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
250 self.moveMedia(link, subs)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
251 elif link not in subs:
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
252 subs[link] = urlparse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
253
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
254 for url, new_url in subs.iteritems():
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
255 text = text.replace(url, new_url)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
256 return text
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
257
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
258 def moveMedia(self, link, subs):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
259 """Move a media file from the DokuWiki host to the new repository.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
260
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
261 This also updates the hyperlinks to internal media files.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
262 @param link (unicode): media link
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
263 @param subs (dict): substitutions data
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
264 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
265 url = urlparse.urljoin(self.url, link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
266 user_media = re.match(r"(/lib/exe/\w+\.php\?)(.*)", link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
267 thumb_width = None
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
268
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
269 if user_media: # media that has been added by the user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
270 params = urlparse.parse_qs(urlparse.urlparse(url).query)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
271 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
272 media = params["media"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
273 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
274 log.warning("No media found in fetch URL: %s" % user_media.group(2))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
275 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
276 if re.match(r"^\w*://", media): # external URL to link directly
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
277 subs[link] = media
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
278 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
279 try: # create thumbnail
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
280 thumb_width = params["w"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
281 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
282 pass
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
283
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
284 filename = media.replace(":", "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
285 # XXX: avoid "precondition failed" error (only keep the media parameter)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
286 url = urlparse.urljoin(self.url, "/lib/exe/fetch.php?media=%s" % media)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
287
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
288 elif link.startswith("/lib/plugins/"):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
289 # other link added by a plugin or something else
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
290 filename = link[13:]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
291 else: # false alarm: there's no media (or we don't handle it yet)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
292 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
293
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
294 filepath = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
295 self.downloadMedia(url, filepath)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
296
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
297 if thumb_width:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
298 filename = os.path.join("thumbs", thumb_width, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
299 thumbnail = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
300 self.createThumbnail(filepath, thumbnail, thumb_width)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
301
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
302 new_url = os.path.join(self.media_repo, filename)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
303 subs[link] = new_url
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
304
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
305 def downloadMedia(self, source, dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
306 """Copy media to localhost.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
307
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
308 @param source (unicode): source url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
309 @param dest (unicode): target path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
310 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
311 dirname = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
312 if not os.path.exists(dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
313 if not os.path.exists(dirname):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
314 os.makedirs(dirname)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
315 urllib.urlretrieve(source, dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
316 log.debug("DokuWiki media file copied to %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
317
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
318 def createThumbnail(self, source, dest, width):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
319 """Create a thumbnail.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
320
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
321 @param source (unicode): source file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
322 @param dest (unicode): destination file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
323 @param width (unicode): thumbnail's width
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
324 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
325 thumb_dir = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
326 if not os.path.exists(thumb_dir):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
327 os.makedirs(thumb_dir)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
328 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
329 im = Image.open(source)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
330 im.thumbnail((int(width), int(width) * im.size[1] // im.size[0])) # im.size is (width, height)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
331 im.save(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
332 log.debug("DokuWiki media thumbnail created: %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
333 except IOError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
334 log.error("Cannot create DokuWiki media thumbnail %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
335
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
336
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
337
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
338 class DokuwikiImport(object):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
339
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
340 def __init__(self, host):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
341 log.info(_("plugin Dokuwiki Import initialization"))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
342 self.host = host
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
343 self._blog_import = host.plugins['BLOG_IMPORT']
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
344 self._blog_import.register('dokuwiki', self.DkImport, SHORT_DESC, LONG_DESC)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
345
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
346 def DkImport(self, client, location, options=None):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
347 """Import from DokuWiki to PubSub
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
348
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
349 @param location (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
350 @param options (dict, None): DokuWiki import parameters
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
351 - user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
352 - passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
353 - namespace (unicode): DokuWiki namespace to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
354 - media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
355 """
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
356 options[self._blog_import.OPT_HOST] = location
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
357 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
358 user = options["user"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
359 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
360 raise exceptions.DataError('parameter "user" is required')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
361 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
362 passwd = options["passwd"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
363 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
364 raise exceptions.DataError('parameter "passwd" is required')
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
365
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
366 opt_upload_images = options.get(self._blog_import.OPT_UPLOAD_IMAGES, None)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
367 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
368 media_repo = options["media_repo"]
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
369 if opt_upload_images:
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
370 options[self._blog_import.OPT_UPLOAD_IMAGES] = False # force using --no-images-upload
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
371 info_msg = _("DokuWiki media files will be *downloaded* to {temp_dir} - to finish the import you have to upload them *manually* to {media_repo}")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
372 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
373 media_repo = DEFAULT_MEDIA_REPO
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
374 if opt_upload_images:
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
375 info_msg = _("DokuWiki media files will be *uploaded* to the XMPP server. Hyperlinks to these media may not be updated though.")
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
376 else:
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
377 info_msg = _("DokuWiki media files will *stay* on {location} - some of them may be protected by DokuWiki ACL and will not be accessible.")
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
378
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
379 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
380 namespace = options["namespace"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
381 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
382 namespace = DEFAULT_NAMESPACE
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
383 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
384 limit = options["limit"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
385 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
386 limit = DEFAULT_LIMIT
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
387
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
388 dk_importer = Importer(location, user, passwd, media_repo, limit)
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
389 info_msg = info_msg.format(temp_dir=dk_importer.temp_dir, media_repo=media_repo, location=location)
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
390 self.host.actionNew({'xmlui': xml_tools.note(info_msg).toXml()}, profile=client.profile)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
391 d = threads.deferToThread(dk_importer.process, client, namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
392 return d
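
The link handling in processContent (lines 241-256) can be tried outside the plugin. The sketch below is a rough Python 3 adaptation, not the plugin's code: the function name `rewrite_relative_links` and the example URLs are hypothetical, and it covers only the branch taken when no media repository is set, where relative links are made absolute against the wiki URL.

```python
import re
from urllib.parse import urljoin

# Same pattern as in the listing: captures the tag type, the attribute and the link.
LINK_PATTERN = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""


def rewrite_relative_links(text, base_url):
    """Make relative src/href targets absolute against base_url (hypothetical helper)."""
    subs = {}
    for tag in re.finditer(LINK_PATTERN, text):
        link = tag.group(3)
        if re.match(r"^\w*://", link):  # absolute URL: keep as is
            continue
        if link not in subs:  # record each link once, so it is substituted once
            subs[link] = urljoin(base_url, link)
    for url, new_url in subs.items():
        text = text.replace(url, new_url)
    return text


html = '<a href="/doku.php?id=start">home</a> <img src="http://example.org/a.png">'
result = rewrite_relative_links(html, "http://wiki.example.org")
print(result)
```

Collecting the substitutions in a dict before applying them means a link that appears several times is replaced in a single pass. Because `str.replace` works on raw substrings, a short relative link that also occurs inside another URL would be rewritten there as well.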