wikifirc

filter irc.wikimedia.org on specific pages and users
git clone https://a3nm.net/git/wikifirc/

commit ade64d5fd27605ef57bffd5a96184f88ca3213bb
parent 31c6f42dd6c4651324ae9391825304fac230e79a
Author: Antoine Amarilli <a3nm@a3nm.net>
Date:   Tue, 15 Nov 2016 11:21:48 +0100

save modified pages automatically

Diffstat:
 README   | 23 +++++++++--------------
 wikifirc | 41 +++++++++++++++++++++++++++--------------
 2 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/README b/README
@@ -33,11 +33,11 @@ followed.
 == 2. How to use ==

 Run as wikifirc ADMIN [DUMP]. DUMP is a file that will be read at startup to
-populate the list of followed users and pages. ADMIN should be the nick of the
-program administrator or irc.wikimedia.org. Events from irc.wikimedia.org should
-be provided in the irctk format on standard input, the data of matching lines
-will be sent on standard output (to be piped in irctk) and newly followed pages
-on standard error.
+initially populate the list of followed users and pages, and written afterwards
+to update it. ADMIN should be the nick of the program administrator or
+irc.wikimedia.org. Events from irc.wikimedia.org should be provided in the irctk
+format on standard input, the data of matching lines will be sent on standard
+output (to be piped in irctk) and newly followed pages on standard error.

 The administrator can send the following messages to the program via an IRC
 query:
@@ -50,23 +50,18 @@ Note that the program does not acknowledge commands. The administrator is only
 authentified by their nick, so keep potential security risks in mind.

 Here is an example of how you can use the program, using irctk
-<http://a3nm.net/git/irctk/>, sponge <http://joeyh.name/code/moreutils/> and
-the provided ircfilter.py:
+<http://a3nm.net/git/irctk/> and the provided ircfilter.py:

 irctk -q uniquebotnick@irc.wikimedia.org \#fr.wikipedia \#en.wikipedia |
   ./ircfilter.py |
-  ./wikifirc adminnick pagelist 2> >(sponge pagelist) |
+  ./wikifirc adminnick pagelist |
   ./irctk -q botnick@yourserver \#yourchannel

 This call reads recent changes on the English and French Wikipedia, pipes it to
-ircfilter.py to avoid encoding issues, then pipes it to wikifirc (followed pages
-are read from pagelist at startup and written to pagelist on exit) and pipes it
-to your own IRC server. The bot can be controlled by /msg'ing uniquebotnick as
+ircfilter.py to avoid encoding issues, then pipes it to wikifirc and pipes it to
+your own IRC server. The bot can be controlled by /msg'ing uniquebotnick as
 adminnick on irc.wikimedia.org.

-Note that if wikifirc crashes for some reason, then the Python backtrace will
-garble pagelist. It's no big deal, but it's cleaner to remove it.
-
 If some of your users have no color code support, you can add the following
 before the last irctk call:

diff --git a/wikifirc b/wikifirc
@@ -64,12 +64,12 @@ class Change:
             shorten(self.diff),
             ))

-def register(pages, page):
+def register(pages, page, fout):
     """add a page to a set of pages and output it if not already present"""
     if page in pages:
         return
     pages.add(page)
-    print(page, file=sys.stderr)
+    print(page, file=fout)


 if __name__ == "__main__":
@@ -81,19 +81,30 @@ if __name__ == "__main__":
         print ("Usage: %s ADMIN [DUMP]" % sys.argv[0])
         sys.exit(1)

+    dump = None
+
     try:
         dump = sys.argv[2]
-        f = open(dump, 'r')
-        while True:
-            line = f.readline()
-            if not line:
-                break
-            register(pages, line.rstrip())
-        f.close()
     except IndexError:
         pass
-    except FileNotFoundError:
-        pass
+
+    # load pages
+    if dump:
+        try:
+            f = open(dump, 'r')
+            while True:
+                line = f.readline()
+                if not line:
+                    break
+                register(pages, line.rstrip())
+            f.close()
+        except FileNotFoundError:
+            pass
+
+    # now, prepare to save pages
+    fout = None
+    if dump:
+        fout = open(dump, 'w')

     while True:
         data = sys.stdin.readline()
@@ -112,10 +123,10 @@ if __name__ == "__main__":
                 if command == "user":
                     # register it as a user
                     for namespace in user_namespaces:
-                        register(pages, namespace + value)
+                        register(pages, namespace + value, fout)
                 elif command == "page":
                     # register it as a page
-                    register(pages, value)
+                    register(pages, value, fout)
                 else:
                     # bad command, fail noisily
                     raise ValueError
@@ -125,7 +136,9 @@ if __name__ == "__main__":
         line = Change(project, data)
         # a user is followed if its user page is followed
         if user_namespaces[0] + line.username in pages:
-            register(pages, line.page)
+            register(pages, line.page, fout)
         if line.page in pages:
             print(line)

+    fout.close()
+
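
The load-then-save pattern this commit introduces can be sketched in isolation as follows. This is a minimal illustration and not part of the commit: load_pages and open_dump are hypothetical helper names, and making fout optional in register is an assumption made here so that the functions also work when no DUMP file is given. Note that in the diff as shown, the load loop still calls register with two arguments although the new signature takes three, and fout.close() is reached even when fout is None; the sketch sidesteps both points.

```python
def register(pages, page, fout=None):
    """Add page to the set and write it to fout if it was not already present.

    fout is optional here (an assumption, not the commit's signature) so the
    function can also be used when no dump file was given.
    Returns True if the page was newly added.
    """
    if page in pages:
        return False
    pages.add(page)
    if fout is not None:
        print(page, file=fout)
        fout.flush()  # keep the dump readable if the process dies later
    return True


def load_pages(lines):
    """Build the initial set of followed pages from an iterable of lines."""
    pages = set()
    for line in lines:
        page = line.rstrip("\n")
        if page:
            pages.add(page)
    return pages


def open_dump(path, pages):
    """Reopen the dump for writing and write back the pages already known.

    Mode 'w' truncates the file, so previously loaded pages would be lost
    unless they are rewritten before new ones are appended.
    """
    fout = open(path, "w")
    for page in sorted(pages):
        print(page, file=fout)
    fout.flush()
    return fout
```

With these helpers, the main loop would call register(pages, page, fout) exactly as in the diff, and close fout only when it is not None.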