From: b
Date: Tue, 12 Aug 2014 07:51:40 +0000 (+0000)
Subject: v.1 i.0
X-Git-Tag: v1.0
X-Git-Url: http://bicyclesonthemoon.info/git-projects/?a=commitdiff_plain;h=6759a55759d97ad4840736320441b0751c183aef;p=ott%2Fmirror
v.1 i.0
First publicly available version.
git-svn-id: svn://botcastle1b/ottmirror@1 23ac2ed3-cec8-4626-8109-7118d8ca9799
---
6759a55759d97ad4840736320441b0751c183aef
diff --git a/botmlogo2.png b/botmlogo2.png
new file mode 100644
index 0000000..520f68a
Binary files /dev/null and b/botmlogo2.png differ
diff --git a/index.htm b/index.htm
new file mode 100644
index 0000000..f66b5f9
--- /dev/null
+++ b/index.htm
@@ -0,0 +1,258 @@
+
+
+You can have your own OTT mirror now!
+
+If you want to run your own copy of the ЯOЯЯIM TTO, you can, because I made it
+available. Follow these instructions to download and set up your own ЯOЯЯIM TTO.
+
+Go to where you downloaded the source. Open re.awk.
+
+You have to set up the server so that some URLs link to some CGI programs.
+This is how I did it in apache2:
+
+bot2 (bothasar_t) is the thread archiving bot.
+
+bot3 (bothasar_p) is the post bot.
+
+The bots won't run by themselves. They need something that will start them at
+regular time intervals. I will share my crontab configuration.
+
+If you managed to do everything described here, you should have your own ЯOЯЯIM
+TTO. Now wait until it catches up with the whole thread. (Unless I forgot about
+something important here. Let me know if I did.)
+Dependencies
+The mirror depends on some things:
+
+
+
+
+The mirror was written and tested on Debian, and then Cubian. Other versions of
+GNU/Linux should be okay.
+I use apache2. Any other server that supports CGI should be okay.
+gawk should be compatible, so you can use it too.
+The mirror is written in C++, but doesn't use anything C++-specific. I may
+convert it to C in the future.
+There is a bug in cgilib, at least in the version from the Debian repository.
+Change line 146 of cgi.h from "extern }" to "}".
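The cgi.h change above can also be applied with a small script. This is only a sketch: `patch_line` is a hypothetical helper, not part of the mirror, and the location of cgi.h on your system is something you have to find yourself. It refuses to change anything if the expected text is not on the given line, so it should be safe against a cgilib version where the line number differs.

```python
def patch_line(text, lineno, old, new):
    """Return text with `old` replaced by `new` on the 1-based line `lineno` only."""
    lines = text.splitlines(keepends=True)
    if not 1 <= lineno <= len(lines):
        raise ValueError("line number out of range")
    if old not in lines[lineno - 1]:
        # The header does not look like the buggy version, so leave it alone.
        raise ValueError("expected text not found on that line")
    lines[lineno - 1] = lines[lineno - 1].replace(old, new, 1)
    return "".join(lines)


# Hypothetical usage (the path is an assumption, adjust it to your system):
# with open("/usr/include/cgi.h") as f:
#     fixed = patch_line(f.read(), 146, "extern }", "}")
```

Write the result back only after checking it by eye, since the file belongs to a system package.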
+The bots aren't daemons and need something to start them at regular time
+intervals. I use cron.
+
+Downloading
+
+"something.example.com/ott"
+Preparing the environment
+At the top of re.awk there are some paths defined. You'll have to change some of
+these. They have to be double-escaped. The directories should not end with a
+"/".
+
+
+Set wgetpath to point to wget. Set mawkpath to point to mawk.
+Create a directory for the programs. Set propath to point there.
+Create a directory for temporary files. Set tmppath to point there.
+Create a directory for the bots' memory. Set mempath to point there. Create a
+file named "name" there. Put your URL-encoded xkcd fora login in the first line
+and your URL-encoded password in the second line. Only you should have read
+access to the file.
+Create the following subdirectories:
+
+mlist
+mpost
+mpost/fail
+mpost/ok
+mpost/pm
+
+Only you should have read access to mpost. Passwords can be found there.
+Create a directory for the logs. Set logpath to point there.
+Set mirrpath to point at the directory where the mirror is hosted. That's where
+you put the contents of ott.zip.
+useragent2 is used by bothasar_t, the mirror bot. useragent3 is used by
+bothasar_p, the post bot.
+Set own_image_regexp to match the URL where the images are archived. The URL
+should be something like "something.example.com/ott/image/".
+Set own_url_encoded to your URL-encoded mirror URL.
+
+makefile.
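The mempath steps (the "name" file, the subdirectories and the read permissions) could be scripted roughly like this. It is a sketch under assumptions: `prepare_mempath` is a hypothetical helper, and I am assuming that percent-encoding every reserved character is what the bots expect as "URL-encoded". Check the result against what you would type by hand.

```python
import os
from urllib.parse import quote

# Subdirectories listed in the instructions above.
SUBDIRS = ["mlist", "mpost", "mpost/fail", "mpost/ok", "mpost/pm"]


def prepare_mempath(mempath, login, password):
    """Create the memory directory, the "name" file and the subdirectories."""
    os.makedirs(mempath, exist_ok=True)
    name_file = os.path.join(mempath, "name")
    # Create the file readable and writable by the owner only.
    fd = os.open(name_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(quote(login, safe="") + "\n")     # first line: login
        f.write(quote(password, safe="") + "\n")  # second line: password
    os.chmod(name_file, 0o600)  # enforce the mode even if the file already existed
    for sub in SUBDIRS:
        os.makedirs(os.path.join(mempath, sub), exist_ok=True)
    # Only the owner may read mpost, because passwords can be found there.
    os.chmod(os.path.join(mempath, "mpost"), 0o700)
    return name_file
```

The mode bits only do what you want if the bots and the CGI programs run as the same user that owns these files.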
+Copy the programs to where propath points to, unless it's the same directory.
+Go to where mirrpath points to. You may want to edit the following files so that
+you can set your own title image, links, etc.:
+
+np/top - the newpage top.
+np/end - the newpage bottom.
+top - the indexpage top.
+end - the indexpage bottom.
+post.htm - the post page.
+info.htm - the information page.
+update.htm - the update page.
+
+Some lines look like ###this. Don't change them. The programs will place their
+content there.
+
+Setting up the server
+
+
+"/ott/log" should link to "logpath".
+"/ott/view" should link to "propath/view".
+"/ott/mview" should link to "propath/mview".
+"/ott/update" should link to "propath/update".
+"/ott/index" should link to "propath/index".
+"/ott/post" should link to "propath/post".
+"/ott" should link to "propath/index".
+
+
+Alias /ott/log /eizm/log/ottmirror
+
+<Directory "/eizm/log/ottmirror">
+    Options Indexes
+    Order allow,deny
+    Allow from all
+</Directory>
+
+ScriptAlias /ott/view /eizm/pro/ottmirror/view
+ScriptAlias /ott/mview /eizm/pro/ottmirror/mview
+ScriptAlias /ott/update /eizm/pro/ottmirror/update
+ScriptAlias /ott/index /eizm/pro/ottmirror/index
+ScriptAlias /ott/post /eizm/pro/ottmirror/post
+ScriptAliasMatch ^/ott/?$ /eizm/pro/ottmirror/index
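If your paths differ from mine, the alias lines can be generated instead of edited by hand. This is just a sketch: `apache_directives` is a hypothetical helper, it only produces the Alias and ScriptAlias lines, and you still have to add the Directory block for the log directory yourself.

```python
def apache_directives(propath, logpath, prefix="/ott"):
    """Build the Alias/ScriptAlias lines for the given program and log paths."""
    programs = ["view", "mview", "update", "index", "post"]
    lines = [f"Alias {prefix}/log {logpath}"]
    lines += [f"ScriptAlias {prefix}/{p} {propath}/{p}" for p in programs]
    # The bare prefix (with or without a trailing slash) shows the index.
    lines.append(f"ScriptAliasMatch ^{prefix}/?$ {propath}/index")
    return lines


if __name__ == "__main__":
    print("\n".join(apache_directives("/eizm/pro/ottmirror", "/eizm/log/ottmirror")))
```

With my paths this reproduces the alias lines shown above.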
+Commandline parameters for bot2
+
+
+
+
+
+-i - the bot ID
+ There can be multiple copies of the bot. Each should have its own ID. They
+ have their own tempfiles, their own logs, and their own memory of what was
+ the last newpage. The bot will refuse to start when there is another bot
+ with the same ID still running. The default value is 0. Don't use the value
+ 4 because it's already used in update.1.cpp.
+-o - offset from the previous page
+ It determines where to start relative to the last page saved on the previous
+ run (of the same ID). 0 means downloading the next page, 1 the same page,
+ 2 the previous page, 3 the page before that, and so on. If this is the
+ first run, it will start from the first page. The default value is 1.
+-m - maximum number of pages in one run
+ The bot will not download more pages than this. The default value is 1.
+-p - distance from the last page
+ The bot will always stay at least this many pages away from the latest page
+ on the thread. The default value is 0.
+-s - start position override
+ The bot will start from this page regardless of where it stopped last time.
+-w - wait between pages
+ Time in seconds to wait between downloading pages. The default value is 3.
+-v - wait after download
+ Time in seconds to wait after downloading an avatar, attachment or image.
+ The default value is 15.
+
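How -o, -m, -p and -s work together may be easier to see as arithmetic. The sketch below is my reading of the descriptions above, not the bot's actual code: `plan_pages` and its argument names are hypothetical, clamping the start to page 1 is an assumption, and it does not model what happens after the last page is reached.

```python
def plan_pages(last_saved, latest, offset=1, max_pages=1, distance=0, start=None):
    """Return the page numbers one run would download.

    last_saved -- last page saved by the previous run, or None on the first run
    latest     -- number of the latest page on the thread
    """
    if start is not None:        # -s overrides everything else
        first = start
    elif last_saved is None:     # first run: start from the first page
        first = 1
    else:                        # -o: 0 = next page, 1 = same page, 2 = previous page
        first = last_saved + 1 - offset
    first = max(first, 1)        # assumption: never go before page 1
    last_allowed = latest - distance                 # -p: stay away from the latest page
    last = min(first + max_pages - 1, last_allowed)  # -m: at most this many pages
    return list(range(first, last + 1))
```

For example, with the last saved page 100, the latest page 101 and the options -o5 -m10 -p2, this gives pages 96 to 99.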
+-d - download avatars
+ If this option is set, the bot will download avatars.
+-a - download attachments
+ If this option is set, the bot will download attachments.
+-b - download images
+ If this option is set, the bot will download images that aren't avatars or
+ attachments.
+-t - stdout
+ If this option is set, the bot will write to standard output instead of the
+ log file.
+-n - new log
+ If this option is set, the bot will replace the old log. Otherwise it will add
+ to it.
+-r - stay in present
+ If this option is set, after reaching the last page the bot will stay there
+ and will also download the index. Otherwise it will continue from the first
+ page.
+
+Commandline parameters for bot3
+
+
+
+
+
+-w - wait between posts
+ Time in seconds to wait between sending posts, or between a failed post and
+ sending a PM. The default value is 15.
+
+
+-t - stdout
+ If this option is set, the bot will write to standard output instead of the
+ log file.
+-n - new log
+ If this option is set, the bot will replace the old log. Otherwise it will add
+ to it.
+
+Scheduling the bots
+My crontab configuration:
+
+20,50 * * * * /eizm/pro/ottmirror/bot2 -i2 -r -o1 -m5 -w9 -v5 -d -a
+
+This bot runs every 30 minutes and updates the pages with new posts.
+
+26 23 * * * /eizm/pro/ottmirror/bot2 -i3 -r -o5 -m10 -p2 -w9 -v5 -d -a -b
+
+This bot runs every day and reloads the pages with new posts and the 4 previous
+pages, except the two latest pages, because there could be a delurker or someone
+could make edits which we don't want to miss.
+
+23 * * * * /eizm/pro/ottmirror/bot2 -i1 -o0 -m3 -p2 -w9 -v5 -d -a -b
+
+This bot runs every hour and goes slowly through the whole thread except the two
+latest pages, because there can be temporal edits (like the 1300 and 1800
+repositories).
+
+While the mirror is catching up with the thread for the first time, I'd recommend
+running only one copy of the bot, at a faster rate than this.
+
+7,27,47 * * * * /eizm/pro/ottmirror/bot3 -w15
+
+This bot runs every 20 minutes and looks for new posts to be sent to the thread.
+
+0 0 * * 1 /bin/mv /eizm/log/ottmirror/bot2.log.1 /eizm/log/ottmirror/bot2.log.1.lastweek
+0 0 * * 1 /bin/mv /eizm/log/ottmirror/bot2.log.2 /eizm/log/ottmirror/bot2.log.2.lastweek
+0 0 * * 1 /bin/mv /eizm/log/ottmirror/bot2.log.3 /eizm/log/ottmirror/bot2.log.3.lastweek
+0 0 * * 1 /bin/mv /eizm/log/ottmirror/bot3.log /eizm/log/ottmirror/bot3.log.lastweek
+
+This moves the log files once a week. Otherwise they would grow to infinity.
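If you would rather have one cron entry than four, the same weekly move can be done by a small script. This is a sketch only: `rotate_logs` is a hypothetical helper, and the default file names are simply the ones from my crontab (bot2 with the IDs 1, 2 and 3, plus bot3).

```python
import os


def rotate_logs(log_dir, names=("bot2.log.1", "bot2.log.2", "bot2.log.3", "bot3.log")):
    """Move each existing log to <name>.lastweek, replacing last week's copy."""
    moved = []
    for name in names:
        src = os.path.join(log_dir, name)
        if os.path.exists(src):
            # Like mv, this overwrites an existing .lastweek file.
            os.replace(src, src + ".lastweek")
            moved.append(name)
    return moved
```

Logs that don't exist yet are skipped, so it does not matter whether all the bot IDs have run.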
+Bugs
+In line 180 of bot2.1.awk change "if(arr3[2]==404)" to
+"if(arr3[2]>=400&&arr3[2]<500)". This will be fixed in the next update.
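The difference between the two conditions can be shown in a few lines. I am assuming here that arr3[2] holds the HTTP status code of the response, which the text above does not say, and `is_client_error` is a hypothetical name.

```python
def is_client_error(status):
    """True for any 4xx status, mirroring arr3[2]>=400&&arr3[2]<500."""
    return 400 <= int(status) < 500


# The old condition only matched 404. The new one also matches e.g. 403 and 410.
for code in ("404", "403", "410", "200", "500"):
    print(code, is_client_error(code))
```

So a page that answers with 403 or 410 is now handled the same way as one that answers with 404.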
+Congratulations
+
+
+
+
+
Users browsing this forum: NaN registered users and NaN guests
+You cannot read about your forum permissions