annotate sat/plugins/plugin_blog_import_dokuwiki.py @ 3842:943901372eba

component AP gateway: `apGetObject` and `apGetList` now work with local object: When a local item is linked, it is found from database instead of doing an HTTP request. rel 370
author Goffi <goffi@goffi.org>
date Thu, 14 Jul 2022 12:55:30 +0200
parents be6d91572633
children 524856bd7b19
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
1 #!/usr/bin/env python3
3137
559a625a236b fixed shebangs
Goffi <goffi@goffi.org>
parents: 3136
diff changeset
2
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
3
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
4 # SàT plugin to import dokuwiki blogs
3479
be6d91572633 date update
Goffi <goffi@goffi.org>
parents: 3137
diff changeset
5 # Copyright (C) 2009-2021 Jérôme Poisson (goffi@goffi.org)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
6 # Copyright (C) 2013-2016 Adrien Cossa (souliane@mailoo.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
7
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
8 # This program is free software: you can redistribute it and/or modify
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
9 # it under the terms of the GNU Affero General Public License as published by
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
10 # the Free Software Foundation, either version 3 of the License, or
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
11 # (at your option) any later version.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
12
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
13 # This program is distributed in the hope that it will be useful,
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
14 # but WITHOUT ANY WARRANTY; without even the implied warranty of
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
16 # GNU Affero General Public License for more details.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
17
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
18 # You should have received a copy of the GNU Affero General Public License
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
19 # along with this program. If not, see <http://www.gnu.org/licenses/>.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
20
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
21 from sat.core.i18n import _, D_
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
22 from sat.core.constants import Const as C
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
23 from sat.core.log import getLogger
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
24
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
25 log = getLogger(__name__)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
26 from sat.core import exceptions
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
27 from sat.tools import xml_tools
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
28 from twisted.internet import threads
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
29 from collections import OrderedDict
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
30 import calendar
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
31 import urllib.request, urllib.parse, urllib.error
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
32 import urllib.parse
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
33 import tempfile
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
34 import re
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
35 import time
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
36 import os.path
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
37
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
38 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
39 from dokuwiki import DokuWiki, DokuWikiError # this is a new dependency
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
40 except ImportError:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
41 raise exceptions.MissingModule(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
42 'Missing module dokuwiki, please install it with "pip install dokuwiki"'
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
43 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
44 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
45 from PIL import Image # this is already needed by plugin XEP-0054
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
46 except:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
47 raise exceptions.MissingModule(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
48 "Missing module pillow, please download/install it from https://python-pillow.github.io"
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
49 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
50
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
51 PLUGIN_INFO = {
2145
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
52 C.PI_NAME: "Dokuwiki import",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
53 C.PI_IMPORT_NAME: "IMPORT_DOKUWIKI",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
54 C.PI_TYPE: C.PLUG_TYPE_BLOG,
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
55 C.PI_DEPENDENCIES: ["BLOG_IMPORT"],
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
56 C.PI_MAIN: "DokuwikiImport",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
57 C.PI_HANDLER: "no",
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
58 C.PI_DESCRIPTION: _("""Blog importer for Dokuwiki blog engine."""),
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
59 }
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
60
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
61 SHORT_DESC = D_("import posts from Dokuwiki blog engine")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
62
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
63 LONG_DESC = D_(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
64 """This importer handle Dokuwiki blog engine.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
65
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
66 To use it, you need an admin access to a running Dokuwiki website
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
67 (local or on the Internet). The importer retrieves the data using
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
68 the XMLRPC Dokuwiki API.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
69
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
70 You can specify a namespace (that could be a namespace directory
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
71 or a single post) or leave it empty to use the root namespace "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
72 and import all the posts.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
73
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
74 You can specify a new media repository to modify the internal
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
75 media links and make them point to the URL of your choice, but
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
76 note that the upload is not done automatically: a temporary
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
77 directory will be created on your local drive and you will
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
78 need to upload it yourself to your repository via SSH or FTP.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
79
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
80 Following options are recognized:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
81
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
82 location: DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
83 user: DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
84 passwd: DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
85 namespace: DokuWiki namespace to import (default: root namespace "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
86 media_repo: URL to the new remote media repository (default: none)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
87 limit: maximal number of posts to import (default: 100)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
88
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
89 Example of usage (with jp frontend):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
90
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
91 jp import dokuwiki -p dave --pwd xxxxxx --connect
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
92 http://127.0.1.1 -o user souliane -o passwd qwertz
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
93 -o namespace public:2015:10
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
94 -o media_repo http://media.diekulturvermittlung.at
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
95
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
96 This retrieves the 100 last blog posts from http://127.0.1.1 that
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
97 are inside the namespace "public:2015:10" using the Dokuwiki user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
98 "souliane", and it imports them to sat profile dave's microblog node.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
99 Internal Dokuwiki media that were hosted on http://127.0.1.1 are now
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
100 pointing to http://media.diekulturvermittlung.at.
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
101 """
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
102 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
103 DEFAULT_MEDIA_REPO = ""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
104 DEFAULT_NAMESPACE = "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
105 DEFAULT_LIMIT = 100 # you might get a DBUS timeout (no reply) if it lasts too long
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
106
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
107
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
108 class Importer(DokuWiki):
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
109 def __init__(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
110 self, url, user, passwd, media_repo=DEFAULT_MEDIA_REPO, limit=DEFAULT_LIMIT
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
111 ):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
112 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
113
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
114 @param url (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
115 @param user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
116 @param passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
117 @param media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
118 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
119 DokuWiki.__init__(self, url, user, passwd)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
120 self.url = url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
121 self.media_repo = media_repo
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
122 self.temp_dir = tempfile.mkdtemp() if self.media_repo else None
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
123 self.limit = limit
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
124 self.posts_data = OrderedDict()
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
125
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
126 def getPostId(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
127 """Return a unique and constant post id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
128
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
129 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
130 @return (unicode): post unique item id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
131 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
132 return str(post["id"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
133
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
134 def getPostUpdated(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
135 """Return the update date.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
136
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
137 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
138 @return (unicode): update date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
139 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
140 return str(post["mtime"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
141
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
142 def getPostPublished(self, post):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
143 """Try to parse the date from the message ID, else use "mtime".
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
144
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
145 The date can be extracted if the message ID looks like one of:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
146 - namespace:YYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
147 - namespace:YYYYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
148 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
149 @return (unicode): publication date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
150 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
151 id_, default = str(post["id"]), str(post["mtime"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
152 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
153 date = id_.split(":")[-1].split("_")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
154 except KeyError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
155 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
156 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
157 time_struct = time.strptime(date, "%y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
158 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
159 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
160 time_struct = time.strptime(date, "%Y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
161 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
162 return default
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
163 return str(calendar.timegm(time_struct))
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
164
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
165 def processPost(self, post, profile_jid):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
166 """Process a single page.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
167
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
168 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
169 @param profile_jid
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
170 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
171 # get main information
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
172 id_ = self.getPostId(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
173 updated = self.getPostUpdated(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
174 published = self.getPostPublished(post)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
175
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
176 # manage links
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
177 backlinks = self.pages.backlinks(id_)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
178 for link in self.pages.links(id_):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
179 if link["type"] != "extern":
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
180 assert link["type"] == "local"
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
181 page = link["page"]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
182 backlinks.append(page[1:] if page.startswith(":") else page)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
183
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
184 self.pages.get(id_)
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
185 content_xhtml = self.processContent(self.pages.html(id_), backlinks, profile_jid)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
186
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
187 # XXX: title is already in content_xhtml and difficult to remove, so leave it
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
188 # title = content.split("\n")[0].strip(u"\ufeff= ")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
189
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
190 # build the extra data dictionary
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
191 mb_data = {
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
192 "id": id_,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
193 "published": published,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
194 "updated": updated,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
195 "author": profile_jid.user,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
196 # "content": content, # when passed, it is displayed in Libervia instead of content_xhtml
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
197 "content_xhtml": content_xhtml,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
198 # "title": title,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
199 "allow_comments": "true",
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
200 }
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
201
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
202 # find out if the message access is public or restricted
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
203 namespace = id_.split(":")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
204 if namespace and namespace.lower() not in ("public", "/"):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
205 mb_data["group"] = namespace # roster group must exist
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
206
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
207 self.posts_data[id_] = {"blog": mb_data, "comments": [[]]}
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
208
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
209 def process(self, client, namespace=DEFAULT_NAMESPACE):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
210 """Process a namespace or a single page.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
211
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
212 @param namespace (unicode): DokuWiki namespace (or page) to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
213 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
214 profile_jid = client.jid
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
215 log.info("Importing data from DokuWiki %s" % self.version)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
216 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
217 pages_list = self.pages.list(namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
218 except DokuWikiError:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
219 log.warning(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
220 'Could not list Dokuwiki pages: please turn the "display_errors" setting to "Off" in the php.ini of the webserver hosting DokuWiki.'
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
221 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
222 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
223
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
224 if not pages_list: # namespace is actually a page?
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
225 names = namespace.split(":")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
226 real_namespace = ":".join(names[0:-1])
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
227 pages_list = self.pages.list(real_namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
228 pages_list = [page for page in pages_list if page["id"] == namespace]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
229 namespace = real_namespace
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
230
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
231 count = 0
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
232 for page in pages_list:
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
233 self.processPost(page, profile_jid)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
234 count += 1
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
235 if count >= self.limit:
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
236 break
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
237
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
238 return (iter(self.posts_data.values()), len(self.posts_data))
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
239
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
240 def processContent(self, text, backlinks, profile_jid):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
241 """Do text substitutions and file copy.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
242
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
243 @param text (unicode): message content
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
244 @param backlinks (list[unicode]): list of backlinks
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
245 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
246 text = text.strip("\ufeff") # this is at the beginning of the file (BOM)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
247
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
248 for backlink in backlinks:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
249 src = '/doku.php?id=%s"' % backlink
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
250 tgt = '/blog/%s/%s" target="#"' % (profile_jid.user, backlink)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
251 text = text.replace(src, tgt)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
252
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
253 subs = {}
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
254
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
255 link_pattern = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
256 for tag in re.finditer(link_pattern, text):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
257 type_, attr, link = tag.group(1), tag.group(2), tag.group(3)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
258 assert (type_ == "img" and attr == "src") or (type_ == "a" and attr == "href")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
259 if re.match(r"^\w*://", link): # absolute URL to link directly
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
260 continue
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
261 if self.media_repo:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
262 self.moveMedia(link, subs)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
263 elif link not in subs:
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
264 subs[link] = urllib.parse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
265
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
266 for url, new_url in subs.items():
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
267 text = text.replace(url, new_url)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
268 return text
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
269
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
270 def moveMedia(self, link, subs):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
271 """Move a media from the DokuWiki host to the new repository.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
272
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
273 This also updates the hyperlinks to internal media files.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
274 @param link (unicode): media link
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
275 @param subs (dict): substitutions data
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
276 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
277 url = urllib.parse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
278 user_media = re.match(r"(/lib/exe/\w+.php\?)(.*)", link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
279 thumb_width = None
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
280
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
281 if user_media: # media that has been added by the user
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
282 params = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
283 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
284 media = params["media"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
285 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
286 log.warning("No media found in fetch URL: %s" % user_media.group(2))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
287 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
288 if re.match(r"^\w*://", media): # external URL to link directly
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
289 subs[link] = media
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
290 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
291 try: # create thumbnail
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
292 thumb_width = params["w"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
293 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
294 pass
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
295
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
296 filename = media.replace(":", "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
297 # XXX: avoid "precondition failed" error (only keep the media parameter)
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
298 url = urllib.parse.urljoin(self.url, "/lib/exe/fetch.php?media=%s" % media)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
299
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
300 elif link.startswith("/lib/plugins/"):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
301 # other link added by a plugin or something else
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
302 filename = link[13:]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
303 else: # fake alert... there's no media (or we don't handle it yet)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
304 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
305
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
306 filepath = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
307 self.downloadMedia(url, filepath)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
308
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
309 if thumb_width:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
310 filename = os.path.join("thumbs", thumb_width, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
311 thumbnail = os.path.join(self.temp_dir, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
312 self.createThumbnail(filepath, thumbnail, thumb_width)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
313
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
314 new_url = os.path.join(self.media_repo, filename)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
315 subs[link] = new_url
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
316
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
317 def downloadMedia(self, source, dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
318 """Copy media to localhost.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
319
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
320 @param source (unicode): source url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
321 @param dest (unicode): target path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
322 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
323 dirname = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
324 if not os.path.exists(dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
325 if not os.path.exists(dirname):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
326 os.makedirs(dirname)
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
327 urllib.request.urlretrieve(source, dest)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
328 log.debug("DokuWiki media file copied to %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
329
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
330 def createThumbnail(self, source, dest, width):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
331 """Create a thumbnail.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
332
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
333 @param source (unicode): source file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
334 @param dest (unicode): destination file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
335 @param width (unicode): thumbnail's width
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
336 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
337 thumb_dir = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
338 if not os.path.exists(thumb_dir):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
339 os.makedirs(thumb_dir)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
340 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
341 im = Image.open(source)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
342 im.thumbnail((width, int(width) * im.size[0] / im.size[1]))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
343 im.save(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
344 log.debug("DokuWiki media thumbnail created: %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
345 except IOError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
346 log.error("Cannot create DokuWiki media thumbnail %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
347
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
348
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
349 class DokuwikiImport(object):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
350 def __init__(self, host):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
351 log.info(_("plugin Dokuwiki Import initialization"))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
352 self.host = host
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
353 self._blog_import = host.plugins["BLOG_IMPORT"]
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
354 self._blog_import.register("dokuwiki", self.DkImport, SHORT_DESC, LONG_DESC)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
355
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
356 def DkImport(self, client, location, options=None):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
357 """Import from DokuWiki to PubSub
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
358
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
359 @param location (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
360 @param options (dict, None): DokuWiki import parameters
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
361 - user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
362 - passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
363 - namespace (unicode): DokuWiki namespace to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
364 - media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
365 """
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
366 options[self._blog_import.OPT_HOST] = location
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
367 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
368 user = options["user"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
369 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
370 raise exceptions.DataError('parameter "user" is required')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
371 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
372 passwd = options["passwd"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
373 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
374 raise exceptions.DataError('parameter "passwd" is required')
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
375
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
376 opt_upload_images = options.get(self._blog_import.OPT_UPLOAD_IMAGES, None)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
377 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
378 media_repo = options["media_repo"]
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
379 if opt_upload_images:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
380 options[
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
381 self._blog_import.OPT_UPLOAD_IMAGES
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
382 ] = False # force using --no-images-upload
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
383 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
384 "DokuWiki media files will be *downloaded* to {temp_dir} - to finish the import you have to upload them *manually* to {media_repo}"
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
385 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
386 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
387 media_repo = DEFAULT_MEDIA_REPO
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
388 if opt_upload_images:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
389 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
390 "DokuWiki media files will be *uploaded* to the XMPP server. Hyperlinks to these media may not been updated though."
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
391 )
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
392 else:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
393 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
394 "DokuWiki media files will *stay* on {location} - some of them may be protected by DokuWiki ACL and will not be accessible."
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
395 )
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
396
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
397 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
398 namespace = options["namespace"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
399 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
400 namespace = DEFAULT_NAMESPACE
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
401 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
402 limit = options["limit"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
403 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
404 limit = DEFAULT_LIMIT
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
405
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
406 dk_importer = Importer(location, user, passwd, media_repo, limit)
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
407 info_msg = info_msg.format(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
408 temp_dir=dk_importer.temp_dir, media_repo=media_repo, location=location
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
409 )
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
410 self.host.actionNew(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
411 {"xmlui": xml_tools.note(info_msg).toXml()}, profile=client.profile
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
412 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
413 d = threads.deferToThread(dk_importer.process, client, namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
414 return d