annotate mod_export_skeletons/README.md @ 5173:460f78654864

mod_muc_rtbl: also filter messages

This was a bit tricky because we don't want to run the JIDs through SHA256 on each message. Took a while to come up with this simple plan of just caching the SHA256 of the JIDs on the occupants.

This will leave some dirt in the occupants after unloading the module, but that should be ok; once they cycle the room, the hashes will be gone.

This is direly needed, otherwise, there is a tight race between the moderation activities and the actors joining the room.
author Jonas Schäfer <jonas@wielicki.name>
date Tue, 21 Feb 2023 21:37:27 +0100
parents 17fbe82d4bfe
children
rev   line source
4815
9c2af2146ee2 mod_export_skeletons: Command to aid in analysis of archive contents
Kim Alvefur <zash@zash.se>
parents:
diff changeset
1 ---
2 summary: Export message archives in sanitized minimal form for analysis
3 ---
4
5 Exports message archives in a format stripped of private information
6 and message content.
7
8 # Usage
9
10 prosodyctl mod_export_skeletons [options] user@host*
11
12 Multiple user JIDs can be given.
13
4816
e7d1d68f0279 mod_export_skeletons: Document archive name override option
Kim Alvefur <zash@zash.se>
parents: 4815
diff changeset
14 ## Options
15
4817
e8e0cb97c480 mod_export_skeletons: Fix override docs
Kim Alvefur <zash@zash.se>
parents: 4816
diff changeset
16 `--store=archive`
17 : Overrides the store name, e.g. for compatibility with `archive2` or
4816
e7d1d68f0279 mod_export_skeletons: Document archive name override option
Kim Alvefur <zash@zash.se>
parents: 4815
diff changeset
18 querying MUC archives with `muc_log`
19
4815
9c2af2146ee2 mod_export_skeletons: Command to aid in analysis of archive contents
Kim Alvefur <zash@zash.se>
parents:
diff changeset
20 `--start=timestamp`
21 : Start of the time span to export, in [XEP-0082] format
22
23 `--end=timestamp`
24 : End of the time span to export, in [XEP-0082] format
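
The `--start` and `--end` values are XEP-0082 DateTime strings such as `2023-01-01T00:00:00Z`. The following is a minimal Python sketch, not part of the module, that formats such timestamps and assembles an argument list in the `[options] user@host*` order shown under Usage. The helper names `xep0082` and `export_command`, the example JIDs and the dates are all made up for illustration; the command is only built, not executed.

```python
from datetime import datetime, timezone


def xep0082(dt):
    """Format a timezone-aware datetime as an XEP-0082 UTC DateTime string."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def export_command(jids, start=None, end=None, store=None):
    """Build the argument list for the export command (hypothetical helper)."""
    argv = ["prosodyctl", "mod_export_skeletons"]
    if store is not None:
        argv.append(f"--store={store}")
    if start is not None:
        argv.append(f"--start={xep0082(start)}")
    if end is not None:
        argv.append(f"--end={xep0082(end)}")
    argv.extend(jids)  # one or more user JIDs go last
    return argv


# Example values only; replace with real JIDs and dates.
example = export_command(
    ["alice@example.com", "bob@example.com"],
    start=datetime(2023, 1, 1, tzinfo=timezone.utc),
    end=datetime(2023, 2, 1, tzinfo=timezone.utc),
    store="archive2",
)
print(" ".join(example))
```

Passing the list to something like `subprocess.run` would run the export, assuming `prosodyctl` is on the path and the module is installed.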
25
26 # Output
27
28 All content is stripped, leaving only the basic XML structure, with
29 child tags sorted.
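
To illustrate the idea of stripping content and sorting child tags, here is a rough Python sketch using the standard library's `xml.etree.ElementTree`. It is not the module's code (the module itself is written for Prosody). The function name `skeleton`, the sample stanza, and the choice to sort by namespace-qualified tag name are assumptions made for this example, and the sketch drops all attributes rather than handling any of them specially.

```python
import xml.etree.ElementTree as ET


def skeleton(element):
    """Return a copy of element without text or attributes, children sorted."""
    # element.tag carries the namespace as "{namespace}name", so it is kept.
    bare = ET.Element(element.tag)
    children = [skeleton(child) for child in element]
    # Assumed sort key: the qualified tag name.
    children.sort(key=lambda child: child.tag)
    bare.extend(children)
    return bare


# Made-up input stanza with content that should not survive.
original = ET.fromstring(
    "<message xmlns='jabber:client' id='abc123'>"
    "<x xmlns='jabber:x:oob'><url>https://example.com/file</url></x>"
    "<body>hello</body>"
    "</message>"
)
stripped = skeleton(original)
print([child.tag for child in stripped])
```

After this transformation, two stanzas with the same shape but different content compare equal, which is what makes the output useful for counting structural variants.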
30
31 Top-level attributes are given special treatment since they carry
32 protocol semantics. Notably, the `@to` and `@from` JIDs are replaced by
33 symbolic labels to convey what form (bare, full or host) they had. The
4818
d66162e850cd mod_export_skeletons: Generate ids based on log2 of the original length
Kim Alvefur <zash@zash.se>
parents: 4817
diff changeset
34 `@id` attribute is replaced with a placeholder string whose length is based on
35 log2 of the original length.
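
The two replacements described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the module's implementation: the function names `classify_jid` and `mask_id` are hypothetical, the JID parsing does no validation, and the exact rounding applied to the log2 value (here: round down, with a minimum of one character) is a guess.

```python
import math


def classify_jid(jid):
    """Label a JID as 'full', 'bare' or 'host' by which parts are present."""
    rest, sep, resource = jid.partition("/")
    if sep and resource:
        return "full"  # has a resource part
    if "@" in rest:
        return "bare"  # local part and domain, no resource
    return "host"  # domain only


def mask_id(stanza_id):
    """Replace an id with 'x' repeated about log2(len(id)) times."""
    if not stanza_id:
        return ""
    # Assumption: round down, and never return an empty placeholder.
    return "x" * max(1, math.floor(math.log2(len(stanza_id))))


print(classify_jid("user@example.com/phone"))
print(classify_jid("user@example.com"))
print(classify_jid("example.com"))
print(mask_id("a" * 36))
```

Using a logarithm of the length keeps a coarse hint about the kind of id generator in use while hiding the id itself and its exact length.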
4815
9c2af2146ee2 mod_export_skeletons: Command to aid in analysis of archive contents
Kim Alvefur <zash@zash.se>
parents:
diff changeset
36
37 ## Example
38
4819
b1882a40c246 mod_export_skeletons: Update examples too
Kim Alvefur <zash@zash.se>
parents: 4818
diff changeset
39 ``` xml
40 <message from='full' id='xxxxx' type='chat' to='bare'><body/><x xmlns='jabber:x:oob'><url/></x></message>
41 <message from='bare' id='xxxxx' type='error' to='full'><error><remote-server-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/><text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/></error></message>
42 <message from='full' id='xxxxx' type='chat' to='bare'><body/><x xmlns='jabber:x:oob'><url/></x></message>
43 <message from='full' id='xxxxxx' type='normal' to='bare'><x xmlns='jabber:x:conference'/></message>
4815
9c2af2146ee2 mod_export_skeletons: Command to aid in analysis of archive contents
Kim Alvefur <zash@zash.se>
parents:
diff changeset
44 ```