annotate libervia/backend/plugins/plugin_blog_import_dokuwiki.py @ 4095:684ba556a617

core (memory/sqla_mapping): fix legacy pickled values: following packages refactoring, legacy pickled values could not be unpickled (due to use of old classes). This temporary workaround fixes it, but the right thing to do will be to move from pickle to JSON at some point.
author Goffi <goffi@goffi.org>
date Mon, 12 Jun 2023 14:57:27 +0200
parents 47401850dec6
children 0d7bb4df2343
rev   line source
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
1 #!/usr/bin/env python3
3137
559a625a236b fixed shebangs
Goffi <goffi@goffi.org>
parents: 3136
diff changeset
2
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
3
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
4 # SàT plugin to import dokuwiki blogs
3479
be6d91572633 date update
Goffi <goffi@goffi.org>
parents: 3137
diff changeset
5 # Copyright (C) 2009-2021 Jérôme Poisson (goffi@goffi.org)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
6 # Copyright (C) 2013-2016 Adrien Cossa (souliane@mailoo.org)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
7
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
8 # This program is free software: you can redistribute it and/or modify
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
9 # it under the terms of the GNU Affero General Public License as published by
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
10 # the Free Software Foundation, either version 3 of the License, or
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
11 # (at your option) any later version.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
12
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
13 # This program is distributed in the hope that it will be useful,
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
14 # but WITHOUT ANY WARRANTY; without even the implied warranty of
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
15 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
16 # GNU Affero General Public License for more details.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
17
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
18 # You should have received a copy of the GNU Affero General Public License
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
19 # along with this program. If not, see <http://www.gnu.org/licenses/>.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
20
4071
4b842c1fb686 refactoring: renamed `sat` package to `libervia.backend`
Goffi <goffi@goffi.org>
parents: 4037
diff changeset
21 from libervia.backend.core.i18n import _, D_
4b842c1fb686 refactoring: renamed `sat` package to `libervia.backend`
Goffi <goffi@goffi.org>
parents: 4037
diff changeset
22 from libervia.backend.core.constants import Const as C
4b842c1fb686 refactoring: renamed `sat` package to `libervia.backend`
Goffi <goffi@goffi.org>
parents: 4037
diff changeset
23 from libervia.backend.core.log import getLogger
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
24
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
25 log = getLogger(__name__)
4071
4b842c1fb686 refactoring: renamed `sat` package to `libervia.backend`
Goffi <goffi@goffi.org>
parents: 4037
diff changeset
26 from libervia.backend.core import exceptions
4b842c1fb686 refactoring: renamed `sat` package to `libervia.backend`
Goffi <goffi@goffi.org>
parents: 4037
diff changeset
27 from libervia.backend.tools import xml_tools
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
28 from twisted.internet import threads
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
29 from collections import OrderedDict
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
30 import calendar
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
31 import urllib.request, urllib.parse, urllib.error
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
32 import urllib.parse
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
33 import tempfile
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
34 import re
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
35 import time
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
36 import os.path
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
37
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
38 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
39 from dokuwiki import DokuWiki, DokuWikiError # this is a new dependency
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
40 except ImportError:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
41 raise exceptions.MissingModule(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
42 'Missing module dokuwiki, please install it with "pip install dokuwiki"'
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
43 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
44 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
45 from PIL import Image # this is already needed by plugin XEP-0054
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
46 except ImportError:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
47 raise exceptions.MissingModule(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
48 "Missing module pillow, please download/install it from https://python-pillow.github.io"
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
49 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
50
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
51 PLUGIN_INFO = {
2145
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
52 C.PI_NAME: "Dokuwiki import",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
53 C.PI_IMPORT_NAME: "IMPORT_DOKUWIKI",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
54 C.PI_TYPE: C.PLUG_TYPE_BLOG,
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
55 C.PI_DEPENDENCIES: ["BLOG_IMPORT"],
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
56 C.PI_MAIN: "DokuwikiImport",
33c8c4973743 core (plugins): added missing contants + use of new constants in PLUGIN_INFO
Goffi <goffi@goffi.org>
parents: 1934
diff changeset
57 C.PI_HANDLER: "no",
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
58 C.PI_DESCRIPTION: _("""Blog importer for Dokuwiki blog engine."""),
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
59 }
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
60
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
61 SHORT_DESC = D_("import posts from Dokuwiki blog engine")
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
62
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
63 LONG_DESC = D_(
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
64 """This importer handles the Dokuwiki blog engine.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
65
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
66 To use it, you need admin access to a running Dokuwiki website
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
67 (local or on the Internet). The importer retrieves the data using
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
68 the XMLRPC Dokuwiki API.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
69
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
70 You can specify a namespace (that could be a namespace directory
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
71 or a single post) or leave it empty to use the root namespace "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
72 and import all the posts.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
73
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
74 You can specify a new media repository to modify the internal
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
75 media links and make them point to the URL of your choice, but
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
76 note that the upload is not done automatically: a temporary
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
77 directory will be created on your local drive and you will
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
78 need to upload it yourself to your repository via SSH or FTP.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
79
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
80 The following options are recognized:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
81
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
82 location: DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
83 user: DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
84 passwd: DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
85 namespace: DokuWiki namespace to import (default: root namespace "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
86 media_repo: URL to the new remote media repository (default: none)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
87 limit: maximum number of posts to import (default: 100)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
88
4075
47401850dec6 refactoring: rename `libervia.frontends.jp` to `libervia.cli`
Goffi <goffi@goffi.org>
parents: 4071
diff changeset
89 Example of usage (with CLI frontend):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
90
4075
47401850dec6 refactoring: rename `libervia.frontends.jp` to `libervia.cli`
Goffi <goffi@goffi.org>
parents: 4071
diff changeset
91 li import dokuwiki -p dave --pwd xxxxxx --connect
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
92 http://127.0.1.1 -o user souliane -o passwd qwertz
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
93 -o namespace public:2015:10
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
94 -o media_repo http://media.diekulturvermittlung.at
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
95
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
96 This retrieves the last 100 blog posts from http://127.0.1.1 that
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
97 are inside the namespace "public:2015:10" using the Dokuwiki user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
98 "souliane", and it imports them to sat profile dave's microblog node.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
99 Internal Dokuwiki media that were hosted on http://127.0.1.1 are now
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
100 pointing to http://media.diekulturvermittlung.at.
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
101 """
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
102 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
103 DEFAULT_MEDIA_REPO = ""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
104 DEFAULT_NAMESPACE = "/"
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
105 DEFAULT_LIMIT = 100 # you might get a DBUS timeout (no reply) if it lasts too long
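The option list in LONG_DESC and the three defaults above suggest how user-supplied options might be merged with fallback values. The following is a minimal sketch of that idea, not the plugin's own code: `resolve_options` and `DEFAULTS` are hypothetical names, and it assumes option values may arrive as strings from the command line.

```python
# Hypothetical helper, for illustration only: the defaults mirror
# DEFAULT_NAMESPACE, DEFAULT_MEDIA_REPO and DEFAULT_LIMIT from the listing.
DEFAULTS = {"namespace": "/", "media_repo": "", "limit": 100}


def resolve_options(options):
    """Return the optional settings, falling back to the defaults."""
    resolved = dict(DEFAULTS)
    # Only the optional keys are handled here; location, user and passwd
    # are mandatory and have no default.
    resolved.update({k: v for k, v in options.items() if k in DEFAULTS})
    # Assumption: values passed with "-o limit 50" arrive as strings.
    resolved["limit"] = int(resolved["limit"])
    return resolved


print(resolve_options({"namespace": "public:2015:10", "limit": "50"}))
```

Keeping the defaults in one mapping makes it easy to see which options are optional; the real plugin may well resolve them differently.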
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
106
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
107
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
108 class Importer(DokuWiki):
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
109 def __init__(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
110 self, url, user, passwd, media_repo=DEFAULT_MEDIA_REPO, limit=DEFAULT_LIMIT
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
111 ):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
112 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
113
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
114 @param url (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
115 @param user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
116 @param passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
117 @param media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
118 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
119 DokuWiki.__init__(self, url, user, passwd)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
120 self.url = url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
121 self.media_repo = media_repo
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
122 self.temp_dir = tempfile.mkdtemp() if self.media_repo else None
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
123 self.limit = limit
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
124 self.posts_data = OrderedDict()
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
125
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
126 def get_post_id(self, post):
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
127 """Return a unique and constant post id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
128
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
129 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
130 @return (unicode): post unique item id
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
131 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
132 return str(post["id"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
133
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
134 def get_post_updated(self, post):
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
135 """Return the update date.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
136
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
137 @param post(dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
138 @return (unicode): update date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
139 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
140 return str(post["mtime"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
141
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
142 def get_post_published(self, post):
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
143 """Try to parse the date from the message ID, else use "mtime".
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
144
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
145 The date can be extracted if the message ID looks like one of:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
146 - namespace:YYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
147 - namespace:YYYYMMDD_short_title
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
148 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
149 @return (unicode): publication date
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
150 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
151 id_, default = str(post["id"]), str(post["mtime"])
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
152 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
153 date = id_.split(":")[-1].split("_")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
154 except KeyError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
155 return default
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
156 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
157 time_struct = time.strptime(date, "%y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
158 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
159 try:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
160 time_struct = time.strptime(date, "%Y%m%d")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
161 except ValueError:
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
162 return default
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
163 return str(calendar.timegm(time_struct))
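The date fallback in get_post_published can be tried outside the plugin. The sketch below restates the same logic as a standalone function (the name `published_from_id` is hypothetical): take the last component of the page ID, keep the part before the first underscore, try the two-digit-year format, then the four-digit-year format, and fall back to the modification time.

```python
import calendar
import time


def published_from_id(post_id, mtime):
    """Return a Unix timestamp string parsed from the page ID, else mtime."""
    # "public:2015:10:151030_title" -> "151030"
    candidate = post_id.split(":")[-1].split("_")[0]
    for fmt in ("%y%m%d", "%Y%m%d"):
        try:
            parsed = time.strptime(candidate, fmt)
        except ValueError:
            continue  # try the next format
        # timegm interprets the struct as UTC, so the result does not
        # depend on the local time zone.
        return str(calendar.timegm(parsed))
    return str(mtime)


print(published_from_id("public:2015:10:151030_title", 0))
print(published_from_id("public:start", 1446200000))
```

Note that `str.split` always returns at least one element, so extracting the candidate cannot fail; only the `strptime` calls need guarding.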
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
164
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
165 def process_post(self, post, profile_jid):
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
166 """Process a single page.
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
167
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
168 @param post (dict): parsed post data
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
169 @param profile_jid (jid.JID): JID of the profile importing the post

9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
170 """
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
171 # get main information
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
172 id_ = self.get_post_id(post)
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
173 updated = self.get_post_updated(post)
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
174 published = self.get_post_published(post)
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
175
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
176 # manage links
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
177 backlinks = self.pages.backlinks(id_)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
178 for link in self.pages.links(id_):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
179 if link["type"] != "extern":
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
180 assert link["type"] == "local"
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
181 page = link["page"]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
182 backlinks.append(page[1:] if page.startswith(":") else page)
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
183
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
184 self.pages.get(id_)
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
185 content_xhtml = self.process_content(self.pages.html(id_), backlinks, profile_jid)
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
186
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
187 # XXX: title is already in content_xhtml and difficult to remove, so leave it
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
188 # title = content.split("\n")[0].strip(u"\ufeff= ")
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
189
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
190 # build the extra data dictionary
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
191 mb_data = {
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
192 "id": id_,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
193 "published": published,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
194 "updated": updated,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
195 "author": profile_jid.user,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
196 # "content": content, # when passed, it is displayed in Libervia instead of content_xhtml
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
197 "content_xhtml": content_xhtml,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
198 # "title": title,
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
199 "allow_comments": "true",
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
200 }
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
201
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
202 # find out if the message access is public or restricted
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
203 namespace = id_.split(":")[0]
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
204 if namespace and namespace.lower() not in ("public", "/"):
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
205 mb_data["group"] = namespace # roster group must exist
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
206
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
207 self.posts_data[id_] = {"blog": mb_data, "comments": [[]]}
1842
9fd517248dc8 plugin blog_import_dokuwiki: refactor to make it look more similar to blog_import_dotclear
souliane <souliane@mailoo.org>
parents: 1841
diff changeset
208
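The access rule at the end of process_post can be tried in isolation. This is a minimal sketch, not the plugin's code: `access_group` is a hypothetical helper name, and it returns the group instead of writing to `mb_data`.

```python
def access_group(page_id):
    """Return the roster group restricting a page, or None if it is public.

    Hypothetical helper mirroring the namespace test in process_post.
    """
    # the first segment of a DokuWiki page id is taken as the namespace
    namespace = page_id.split(":")[0]
    if namespace and namespace.lower() not in ("public", "/"):
        return namespace  # the roster group must already exist
    return None


print(access_group("public:news"))
print(access_group("team:minutes"))
```

Note that an id without any ":" (for instance "start") is treated as its own namespace by this test, so such a page would be restricted to a group named after the page.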
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
209 def process(self, client, namespace=DEFAULT_NAMESPACE):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
210 """Process a namespace or a single page.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
211
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
212 @param namespace (unicode): DokuWiki namespace (or page) to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
213 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
214 profile_jid = client.jid
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
215 log.info("Importing data from DokuWiki %s" % self.version)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
216 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
217 pages_list = self.pages.list(namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
218 except DokuWikiError:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
219 log.warning(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
220 'Could not list DokuWiki pages: please turn the "display_errors" setting to "Off" in the php.ini of the webserver hosting DokuWiki.'
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
221 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
222 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
223
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
224 if not pages_list: # namespace is actually a page?
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
225 names = namespace.split(":")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
226 real_namespace = ":".join(names[0:-1])
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
227 pages_list = self.pages.list(real_namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
228 pages_list = [page for page in pages_list if page["id"] == namespace]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
229 namespace = real_namespace
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
230
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
231 count = 0
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
232 for page in pages_list:
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
233 self.process_post(page, profile_jid)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
234 count += 1
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
235 if count >= self.limit:
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
236 break
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
237
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
238 return (iter(self.posts_data.values()), len(self.posts_data))
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
239
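The "namespace is actually a page" fallback in process can be sketched without a DokuWiki server. Everything here is an assumption made for illustration: `resolve_pages` and `fake_list` are hypothetical names, and `fake_list` only imitates what `self.pages.list` is expected to return (a list of dicts with an "id" key).

```python
WIKI = ["blog:first", "blog:second", "notes:todo"]


def fake_list(namespace):
    """Stand-in for self.pages.list: pages located under a namespace."""
    prefix = namespace + ":" if namespace else ""
    return [{"id": page_id} for page_id in WIKI if page_id.startswith(prefix)]


def resolve_pages(list_pages, namespace):
    """Return (namespace, pages), treating an empty listing as a single page."""
    pages = list_pages(namespace)
    if not pages:  # namespace is actually a page?
        real_namespace = ":".join(namespace.split(":")[:-1])
        pages = [p for p in list_pages(real_namespace) if p["id"] == namespace]
        namespace = real_namespace
    return namespace, pages


print(resolve_pages(fake_list, "blog"))
print(resolve_pages(fake_list, "blog:first"))
```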
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
240 def process_content(self, text, backlinks, profile_jid):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
241 """Do text substitutions and file copy.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
242
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
243 @param text (unicode): message content
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
244 @param backlinks (list[unicode]): list of backlinks
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
245 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
246 text = text.strip("\ufeff") # this is at the beginning of the file (BOM)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
247
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
248 for backlink in backlinks:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
249 src = '/doku.php?id=%s"' % backlink
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
250 tgt = '/blog/%s/%s" target="#"' % (profile_jid.user, backlink)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
251 text = text.replace(src, tgt)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
252
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
253 subs = {}
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
254
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
255 link_pattern = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
256 for tag in re.finditer(link_pattern, text):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
257 type_, attr, link = tag.group(1), tag.group(2), tag.group(3)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
258 assert (type_ == "img" and attr == "src") or (type_ == "a" and attr == "href")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
259 if re.match(r"^\w*://", link): # absolute URL to link directly
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
260 continue
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
261 if self.media_repo:
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
262 self.move_media(link, subs)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
263 elif link not in subs:
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
264 subs[link] = urllib.parse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
265
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
266 for url, new_url in subs.items():
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
267 text = text.replace(url, new_url)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
268 return text
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
269
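The link rewriting done by process_content when no media repository is set can be sketched as a standalone function. This is a simplified illustration under assumptions: `absolutize_links` is a hypothetical name, the base URL is an example value, and the backlink and media handling are left out.

```python
import re
import urllib.parse

# same pattern as in process_content: captures the tag, the attribute and the link
LINK_PATTERN = r"""<(img|a)[^>]* (src|href)="([^"]+)"[^>]*>"""


def absolutize_links(html, base_url):
    """Turn relative src/href values into absolute URLs on base_url."""
    subs = {}
    for tag in re.finditer(LINK_PATTERN, html):
        link = tag.group(3)
        if re.match(r"^\w*://", link):  # already absolute, keep as is
            continue
        if link not in subs:  # register each link once only
            subs[link] = urllib.parse.urljoin(base_url, link)
    for url, new_url in subs.items():
        html = html.replace(url, new_url)
    return html


sample = '<a href="/doku.php?id=start">home</a> <img src="https://example.org/a.png">'
print(absolutize_links(sample, "https://wiki.example.org/"))
```

Collecting the substitutions in a dict before applying them keeps a link that appears twice from being rewritten twice. The final `str.replace` is still textual, so a relative link that also occurs as a substring elsewhere in the page would be replaced there as well.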
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
270 def move_media(self, link, subs):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
271 """Move a media file from the DokuWiki host to the new repository.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
272
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
273 This also updates the hyperlinks to internal media files.
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
274 @param link (unicode): media link
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
275 @param subs (dict): substitutions data
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
276 """
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
277 url = urllib.parse.urljoin(self.url, link)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
278 user_media = re.match(r"(/lib/exe/\w+\.php\?)(.*)", link)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
279 thumb_width = None
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
280
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
281 if user_media: # media that has been added by the user
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
282 params = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
283 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
284 media = params["media"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
285 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
286 log.warning("No media found in fetch URL: %s" % user_media.group(2))
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
287 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
288 if re.match(r"^\w*://", media): # external URL to link directly
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
289 subs[link] = media
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
290 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
291 try: # create thumbnail
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
292 thumb_width = params["w"][0]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
293 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
294 pass
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
295
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
296 filename = media.replace(":", "/")
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
297 # XXX: avoid "precondition failed" error (only keep the media parameter)
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
298 url = urllib.parse.urljoin(self.url, "/lib/exe/fetch.php?media=%s" % media)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
299
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
300 elif link.startswith("/lib/plugins/"):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
301 # other link added by a plugin or something else
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
302 filename = link[13:]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
303 else: # false alarm: there's no media (or we don't handle it yet)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
304 return
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
305
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
306 filepath = os.path.join(self.temp_dir, filename)
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
307 self.download_media(url, filepath)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
308
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
309 if thumb_width:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
310 filename = os.path.join("thumbs", thumb_width, filename)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
311 thumbnail = os.path.join(self.temp_dir, filename)
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
312 self.create_thumbnail(filepath, thumbnail, thumb_width)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
313
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
314 new_url = os.path.join(self.media_repo, filename)
1843
a51355982f11 plugin blog_import_dokuwiki: fixes wrong URL when a substitution occurs twice
souliane <souliane@mailoo.org>
parents: 1842
diff changeset
315 subs[link] = new_url
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
316
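How move_media reads a fetch.php link can be shown with the standard library alone. A minimal sketch, assuming an example base URL and an example link; `parse_fetch_link` is a hypothetical helper and does no downloading.

```python
import urllib.parse


def parse_fetch_link(link, base_url):
    """Extract the media id, requested width and local filename from a link."""
    url = urllib.parse.urljoin(base_url, link)
    params = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the first one
    media = params.get("media", [None])[0]
    width = params.get("w", [None])[0]  # still a string at this point
    # DokuWiki namespaces use ":" where the filesystem uses "/"
    filename = media.replace(":", "/") if media else None
    return media, width, filename


print(parse_fetch_link("/lib/exe/fetch.php?w=200&media=wiki:logo.png",
                       "https://wiki.example.org/"))
```

The width comes back as text, so anything doing arithmetic with it has to convert it first.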
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
317 def download_media(self, source, dest):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
318 """Copy media to localhost.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
319
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
320 @param source (unicode): source url
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
321 @param dest (unicode): target path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
322 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
323 dirname = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
324 if not os.path.exists(dest):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
325 if not os.path.exists(dirname):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
326 os.makedirs(dirname)
3028
ab2696e34d29 Python 3 port:
Goffi <goffi@goffi.org>
parents: 2771
diff changeset
327 urllib.request.urlretrieve(source, dest)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
328 log.debug("DokuWiki media file copied to %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
329
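The directory check in download_media can also be written with `os.makedirs(..., exist_ok=True)`, which tolerates a directory that already exists. A small sketch of that alternative; `ensure_parent_dir` is a hypothetical helper and the download itself is omitted.

```python
import os
import tempfile


def ensure_parent_dir(dest):
    """Create the directory that will hold dest, if it is missing."""
    dirname = os.path.dirname(dest)
    if dirname:
        # exist_ok avoids an error when the directory is already there
        os.makedirs(dirname, exist_ok=True)
    return dirname


target = os.path.join(tempfile.mkdtemp(), "wiki", "images", "logo.png")
print(ensure_parent_dir(target))
```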
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
330 def create_thumbnail(self, source, dest, width):
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
331 """Create a thumbnail.
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
332
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
333 @param source (unicode): source file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
334 @param dest (unicode): destination file path
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
335 @param width (unicode): thumbnail's width
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
336 """
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
337 thumb_dir = os.path.dirname(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
338 if not os.path.exists(thumb_dir):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
339 os.makedirs(thumb_dir)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
340 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
341 im = Image.open(source)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
342 im.thumbnail((int(width), int(width) * im.size[1] // im.size[0]))  # width is a string; height follows the height/width ratio
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
343 im.save(dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
344 log.debug("DokuWiki media thumbnail created: %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
345 except IOError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
346 log.error("Cannot create DokuWiki media thumbnail %s" % dest)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
347
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
348
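A thumbnail box needs integer dimensions, while the width taken from a query string is text. The arithmetic can be checked without any imaging library. This is a sketch under assumptions: `thumbnail_box` is a hypothetical helper, and `image_size` is a (width, height) tuple in the order Pillow uses for `Image.size`.

```python
def thumbnail_box(width, image_size):
    """Return a (width, height) box that keeps the image's aspect ratio."""
    width = int(width)  # the width may arrive as a string
    orig_w, orig_h = image_size
    # height scales by the same factor as the width; never go below 1 pixel
    height = max(1, width * orig_h // orig_w)
    return (width, height)


print(thumbnail_box("200", (800, 600)))
print(thumbnail_box("100", (300, 600)))
```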
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
349 class DokuwikiImport(object):
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
350 def __init__(self, host):
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
351 log.info(_("plugin Dokuwiki import initialization"))
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
352 self.host = host
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
353 self._blog_import = host.plugins["BLOG_IMPORT"]
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
354 self._blog_import.register("dokuwiki", self.dk_import, SHORT_DESC, LONG_DESC)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
355
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
356 def dk_import(self, client, location, options=None):
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
357 """Import from DokuWiki to PubSub
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
358
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
359 @param location (unicode): DokuWiki site URL
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
360 @param options (dict, None): DokuWiki import parameters
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
361 - user (unicode): DokuWiki admin user
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
362 - passwd (unicode): DokuWiki admin password
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
363 - namespace (unicode): DokuWiki namespace to import
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
364 - media_repo (unicode): New remote media repository
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
365 """
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
366 options[self._blog_import.OPT_HOST] = location
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
367 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
368 user = options["user"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
369 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
370 raise exceptions.DataError('parameter "user" is required')
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
371 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
372 passwd = options["passwd"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
373 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
374 raise exceptions.DataError('parameter "passwd" is required')
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
375
1853
1a9c12644552 plugin blog import dokuwiki: fixed bad use of MissingModule and unmodified docstring
Goffi <goffi@goffi.org>
parents: 1844
diff changeset
376 opt_upload_images = options.get(self._blog_import.OPT_UPLOAD_IMAGES, None)
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
377 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
378 media_repo = options["media_repo"]
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
379 if opt_upload_images:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
380 options[
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
381 self._blog_import.OPT_UPLOAD_IMAGES
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
382 ] = False # force using --no-images-upload
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
383 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
384 "DokuWiki media files will be *downloaded* to {temp_dir} - to finish the import you have to upload them *manually* to {media_repo}"
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
385 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
386 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
387 media_repo = DEFAULT_MEDIA_REPO
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
388 if opt_upload_images:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
389 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
390 "DokuWiki media files will be *uploaded* to the XMPP server. Hyperlinks to these media may not be updated though."
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
391 )
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
392 else:
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
393 info_msg = _(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
394 "DokuWiki media files will *stay* on {location} - some of them may be protected by DokuWiki ACL and will not be accessible."
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
395 )
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
396
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
397 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
398 namespace = options["namespace"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
399 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
400 namespace = DEFAULT_NAMESPACE
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
401 try:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
402 limit = options["limit"]
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
403 except KeyError:
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
404 limit = DEFAULT_LIMIT
1844
489b968b3723 plugin blog_import_dokuwiki: also uses the generic image uploader from blog_import (when media_repo is empty and OPT_UPLOAD_IMAGES is True)
souliane <souliane@mailoo.org>
parents: 1843
diff changeset
405
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
406 dk_importer = Importer(location, user, passwd, media_repo, limit)
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
407 info_msg = info_msg.format(
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
408 temp_dir=dk_importer.temp_dir, media_repo=media_repo, location=location
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
409 )
4037
524856bd7b19 massive refactoring to switch from camelCase to snake_case:
Goffi <goffi@goffi.org>
parents: 3479
diff changeset
410 self.host.action_new(
2624
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
411 {"xmlui": xml_tools.note(info_msg).toXml()}, profile=client.profile
56f94936df1e code style reformatting using black
Goffi <goffi@goffi.org>
parents: 2562
diff changeset
412 )
1841
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
413 d = threads.deferToThread(dk_importer.process, client, namespace)
7717975b3ec3 plugin blog_import_dokuwiki: first draft
souliane <souliane@mailoo.org>
parents:
diff changeset
414 return d
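The option handling in dk_import separates required keys ("user", "passwd") from optional ones with defaults. A compact sketch of that pattern using `dict.get`. The names and values here are assumptions: `read_options` is hypothetical, the default values are placeholders for DEFAULT_NAMESPACE and DEFAULT_LIMIT (defined elsewhere in the plugin), and `ValueError` stands in for `exceptions.DataError`.

```python
def read_options(options, default_namespace="", default_limit=0):
    """Split import options into required and optional values."""
    options = options or {}  # tolerate options=None
    for key in ("user", "passwd"):
        if key not in options:
            # the plugin raises exceptions.DataError at this point
            raise ValueError('parameter "%s" is required' % key)
    return {
        "user": options["user"],
        "passwd": options["passwd"],
        # optional keys fall back to the (placeholder) defaults
        "namespace": options.get("namespace", default_namespace),
        "limit": options.get("limit", default_limit),
    }


print(read_options({"user": "admin", "passwd": "secret", "limit": 5}))
```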