This is just using the recommendations from PGTune for a web application
being hosted on a server with the prod server's specs. I'm sure they're
not the best values, but they should be better than the defaults.
This removes RabbitMQ as well as everything else attached to it:
Erlang; the Prometheus collector; the pg-amqp-bridge and all PostgreSQL
functions and triggers; and the amqpy Python package and the Tildes code
that used it.
Note that this commit does not actually uninstall or delete any of these
packages or services, so if you have a running instance that you want to
keep (instead of re-provisioning from scratch), you will need to
manually remove them if you want them completely gone.
RabbitMQ was used to support asynchronous/background processing tasks,
such as determining word count for text topics and scraping the
destinations or relevant APIs for link topics. This commit replaces
RabbitMQ's role (as the message broker) with Redis streams.
This included building a new "PostgreSQL to Redis bridge" that takes
over the previous role of pg-amqp-bridge: listening for NOTIFY messages
on a particular PostgreSQL channel and translating them to messages in
appropriate Redis streams.
One particular change of note is that the names of message "sources"
were adjusted a little and standardized. For example, the routing key
for a message caused by a new comment was previously "comment.created",
but is now "comments.insert". Similarly, "comment.edited" became
"comments.update.markdown". The new naming scheme uses the table name,
proper name for the SQL operation, and column name instead of the
previous unpredictable terms.
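As a rough illustration of the naming scheme, here is a minimal sketch of how such a source name could be assembled. The function name message_source and its signature are hypothetical and not taken from the Tildes code; only the two example outputs come from the description above.

```python
from typing import Optional


def message_source(table: str, operation: str, column: Optional[str] = None) -> str:
    """Build a source name from table, SQL operation, and optional column.

    Hypothetical helper for illustration only.
    """
    parts = [table, operation.lower()]
    if column:
        # Only column-specific messages (e.g. an edit to "markdown") get a third part
        parts.append(column)
    return ".".join(parts)


print(message_source("comments", "INSERT"))
print(message_source("comments", "UPDATE", "markdown"))
```

The two calls produce the "comments.insert" and "comments.update.markdown" names mentioned above.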
gunicorn 20.0.0 included a change so that it will no longer read server
configuration out of Paster files. Because of this, the settings for it
in development.ini and production.ini were no longer being used. This
resulted in the auto-reloading no longer working in dev, and the number
of workers being reduced back down to 1 in production. The socket/PID
may have been impacted as well.
This commit moves the configuration into command-line args used to
launch gunicorn, and uses a pillar variable to handle the args that
differ between dev and prod.
This makes it so that posts (both topics and comments) can no longer be
voted on after they're over 30 days old. An hourly cronjob makes this
"official" by updating a flag on the post indicating that voting is
closed. The daily clean_private_data script then deletes all individual
vote records for posts with closed voting, and the triggers on the
voting tables have been updated to not decrement the vote totals when
these deletions happen.
The net result of this is that Tildes only stores users' votes for a
maximum of 30 days, removing a lot of sensitive/private data that builds
up over the long term.
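The age check itself is simple. A minimal sketch, assuming posts carry a timezone-aware creation time; the names VOTING_PERIOD and is_voting_closed are made up for this example and the real check may be written differently (for instance, in SQL inside the cronjob):

```python
from datetime import datetime, timedelta, timezone

# Assumed constant for illustration; the 30-day period comes from the text above
VOTING_PERIOD = timedelta(days=30)


def is_voting_closed(created_time: datetime, now: datetime) -> bool:
    """Return True once a post is more than 30 days old."""
    return now - created_time > VOTING_PERIOD


created = datetime(2019, 1, 1, tzinfo=timezone.utc)
print(is_voting_closed(created, datetime(2019, 1, 20, tzinfo=timezone.utc)))
print(is_voting_closed(created, datetime(2019, 2, 15, tzinfo=timezone.utc)))
```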
This didn't get updated when boussole was split out to its own
virtualenv, and it was still tied to the application's pip installs
succeeding.
Previously, the virtualenvs were owned by root and the pip installs were
done as root as well. This worked fine, but it meant that I couldn't use
pip-tools' pip-sync function without sudo. This makes it simpler by
giving ownership to the app user (tildes in prod, vagrant in dev).
I'm going to start using pip-tools to manage dependencies:
https://github.com/jazzband/pip-tools
This makes updating the dependencies and virtualenv easier in a few
ways, and makes it simple to keep dev dependencies split out (so I can
stop installing them in production).
Now, to do a check and update all packages to their newest versions, the
main command is:
pip-compile --no-header --upgrade requirements.in
and again with requirements-dev.in to update that one as well. This will
update all the package versions in requirements.txt and
requirements-dev.txt. The virtualenv can then be updated to match those
versions by running:
pip-sync requirements.txt
(or requirements-dev.txt for a dev environment). This currently needs to
be run with sudo, but I'm going to try to fix that shortly.
This installs PL/Python (specifically plpython3u), enables it in the
database, and creates a function id36_to_id that calls the Python
function with the same name inside the tildes.lib.id module. This will
enable doing queries similar to this, when I have a topic's ID36 from
the site:
SELECT * FROM topics WHERE topic_id = id36_to_id('asdf');
The fact that this was possible to set up without having to port the
id36_to_id logic to a different language is blowing my mind a little.
There are some really interesting possibilities from being able to
import all of the Python code into the database itself.
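For context, an ID36 is just the numeric ID written in base 36. The following is a minimal sketch of that conversion, not the actual code in tildes.lib.id, which may validate its input differently:

```python
def id36_to_id(id36: str) -> int:
    """Convert a base-36 string such as "asdf" to its integer ID.

    Illustrative sketch only; relies on Python's built-in base conversion.
    """
    return int(id36, 36)


print(id36_to_id("asdf"))  # 503331
```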
Changing these pillar values is the only actual change to Tildes
code/config needed, but if you're upgrading an existing version from 10
to 12 you will need to do some manual steps. The below should cover it -
lines starting with a * are descriptions of things you need to do, while
the rest are actual commands to run:
sudo apt-get install postgresql-12
sudo systemctl stop postgresql@10-main.service
sudo systemctl stop postgresql@12-main.service
cd /var/lib/postgresql
sudo -u postgres /usr/lib/postgresql/12/bin/pg_upgrade -b /usr/lib/postgresql/10/bin/ -B /usr/lib/postgresql/12/bin/ -d /var/lib/postgresql/10/main/ -D /var/lib/postgresql/12/main/ -o '-c config_file=/etc/postgresql/10/main/postgresql.conf' -O '-c config_file=/etc/postgresql/12/main/postgresql.conf'
* Change pillar value to 12, and run salt
sudo systemctl stop postgresql@10-main.service
* Edit /etc/postgresql/12/main/postgresql.conf and change port to 5432
sudo systemctl restart postgresql@12-main.service
sudo -u postgres ./analyze_new_cluster.sh
* After verifying the new version seems to be working, clean up the old version:
sudo apt-get remove postgresql-10
sudo rm -rf /usr/lib/postgresql/10/
sudo rm -rf /var/lib/postgresql/10/
sudo rm -rf /etc/postgresql/10/
This adds the backend for scheduled topics, which can be set up to post
at a certain time and then (optionally) repeat on a schedule.
Currently, these topics must be text topics, and can have their title,
markdown, and tags set up. They can be configured to be posted by a
particular user, but if no user is chosen they will be posted by a
(newly added) generic user named "Tildes" that is intended to be used
for "owning" automatic actions like this.
Apparently add_header inside a location block doesn't... you know,
actually work. This should be reasonable, but I'd still rather only
allow the Stripe JS on the single page where it's necessary.
The previous method of doing this could cause redis to try to start up
(via restart) earlier than it should. By using require_in and watch_in,
redis should now only start up in the first place once this service has
been started, and redis will also restart if this service ever needs to
run again in the future.
Vagrant on Windows has issues with creating symlinks inside shared
folders - it requires a permission that isn't granted to a user by
default. This can be fixed by changing security policies, but for our
purposes we don't need the symlinks anyway, and can run the tools
manually like this, instead of using the .bin/ symlinks.
Previously, tild.es URLs would proxy_pass through to the views inside the
Pyramid app, but this caused strange behavior in some cases. For
example, anything that caused a 404 response would end up in a broken
page that still appeared to be on the tild.es domain, but would be an
HTML-only page coming from the app, since the CSS and JS would not be
available.
This method is still a bit weird in some ways (now you'll end up on a
404 page at https://tildes.net/shortener/... instead), but I think it's
an improvement overall.
This changes the "activity" topic-sorting method to look for
"interesting" activity instead of everything, and adds a new "All
activity" method that retains the previous behavior.
Currently, "interesting activity" excludes any comments that have active
Noise, Offtopic, or Malice labels, or any of their children. These
checks are also done based on labeling activity, so for example if
someone posts a new comment it will bump the thread initially, but if
that comment is then labeled as Noise, the thread will "un-bump" and go
back to its previous position in the Activity sort.
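To make the exclusion rule concrete, here is a minimal in-memory sketch. The Comment class, its field names, and last_interesting_activity are all hypothetical; the real logic presumably lives in the database and the Tildes models, and may differ in its details.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set

# Labels that make a comment (and its children) "uninteresting", per the text above
UNINTERESTING_LABELS = {"noise", "offtopic", "malice"}


@dataclass
class Comment:
    comment_id: int
    parent_id: Optional[int]
    created_time: datetime
    active_labels: Set[str] = field(default_factory=set)


def last_interesting_activity(topic_created: datetime, comments: List[Comment]) -> datetime:
    """Return the newest comment time, skipping labeled comments and their descendants."""
    by_id = {c.comment_id: c for c in comments}

    def is_excluded(comment: Comment) -> bool:
        # Walk up the reply chain: a labeled ancestor excludes the whole subtree
        current: Optional[Comment] = comment
        while current is not None:
            if current.active_labels & UNINTERESTING_LABELS:
                return True
            current = by_id.get(current.parent_id) if current.parent_id is not None else None
        return False

    times = [c.created_time for c in comments if not is_excluded(c)]
    # With no interesting comments, fall back to the topic's own creation time
    return max(times, default=topic_created)
```

Because the result is recomputed from the current labels, adding a Noise label to the newest comment moves the result back to the previous interesting comment, which is the "un-bump" behavior described above.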
There were also some other minor changes made to appearance to support
adding another sorting option, such as shortening the displayed names on
the "tabs", like showing "Votes" instead of "Most votes". This probably
needs some further work, but is okay for now.
This won't affect requests for static files or anything except ones that
get proxied to the app.
The current configuration is based on IP, and allows a rate of 4/sec,
with an additional burst of 5 above the limit permitted, and burst
requests allowed to go through immediately (nodelay). For more info:
https://www.nginx.com/blog/rate-limiting-nginx/
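To show what those numbers mean in practice, here is a simplified Python model of the rate/burst/nodelay behavior for a single client IP. This is only a sketch of the leaky-bucket idea, not nginx's actual implementation, and the RateLimiter class is made up for the example.

```python
from typing import Optional


class RateLimiter:
    """Simplified leaky-bucket model of a rate limit with a nodelay burst."""

    def __init__(self, rate_per_sec: float = 4.0, burst: int = 5) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.excess = 0.0  # how far this client is currently "over" the rate
        self.last_time: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        if self.last_time is None:
            # The first request from a client is always accepted
            self.last_time = now
            return True

        # The excess drains at the configured rate, and each request adds 1
        drained = self.rate * (now - self.last_time)
        excess = max(0.0, self.excess - drained) + 1.0
        if excess > self.burst:
            # Over the burst allowance: reject without updating any state
            return False

        self.excess = excess
        self.last_time = now
        return True


limiter = RateLimiter()
print([limiter.allow(0.0) for _ in range(7)])
```

In this model, a client that sends 7 requests at the same instant gets the first 6 accepted immediately (1 plus the burst of 5) and the 7th rejected. A slot then frees up every 0.25 seconds, matching the 4/sec rate.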
Sometimes the database initialization fails, generally due to some
earlier step in the setup having an issue. Even if a re-provision
resolves that issue, the database init wouldn't be re-run since it was
set up to only happen after the database was created.
This changes it so that it will try to select from the users table, and
if that fails it will re-run the initialization.
This allows groups to have wiki pages. The rendered form of the page is
stored in the database, but the markdown source is kept on the
filesystem, using git to maintain the history (by doing a commit on
every edit).
A lot of aspects of this are still quite rough, but it should be a
decent start.
This sets up a cronjob that will run every hour to select the most
common tags used in a group (up to 100), and store them in a new column
in the groups table. This will be used to populate the list of tags to
use for autocompletion.
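The selection itself amounts to counting tags and keeping the top 100. A minimal Python sketch of that idea follows; the function name and input shape are assumptions, and the real cronjob may well do this with a query in the database instead.

```python
from collections import Counter
from typing import Iterable, List


def most_common_tags(topic_tags: Iterable[List[str]], limit: int = 100) -> List[str]:
    """Return up to `limit` tags, most frequently used first.

    `topic_tags` is assumed to be one list of tags per topic in the group.
    """
    counts = Counter(tag for tags in topic_tags for tag in tags)
    return [tag for tag, _count in counts.most_common(limit)]


print(most_common_tags([["a", "b"], ["a"], ["c", "a", "b"]], limit=2))
```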
This redirect being first in the file meant that if someone tried to
access a dev version through any method except using "localhost" (such
as via the IP address), no server block would be matched, which causes
nginx to use the first one. That resulted in a 301 redirect to
tildes.net, which definitely shouldn't happen for a dev version.
This change both moves the redirect to the bottom and only adds it in
the "prod" environment, since it's not needed in the dev environment at
all.
Starting with psycopg2 version 2.8, the package on PyPI no longer
contains a binary version and must be compiled from source. These two
packages are required for this to be possible.
It would have been simpler to just switch to the psycopg2-binary
package, but that isn't a very good solution overall since many
other packages treat "psycopg2" as the dependency that they want
installed, not "psycopg2-binary". Overall, this situation is pretty
messy and I'm not sure what will end up being the final state, but this
should work for now.
More info about the source-only change:
http://initd.org/psycopg/articles/2018/02/08/psycopg-274-released/
Previously, this was set as "same-origin", which will only send a
referrer to Tildes itself. This changes it so that it will continue
sending the full referrer to Tildes, but will send only the domain to
external sites if they use HTTPS (and no referrer to HTTP ones).
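The behavior described here matches the standard "strict-origin-when-cross-origin" policy, though that name is an assumption since the policy value isn't stated above. A small sketch of what the browser ends up sending under that kind of policy (referrer_to_send is a hypothetical helper, and it ignores details like default ports and URL fragments):

```python
from typing import Optional
from urllib.parse import urlsplit


def referrer_to_send(current_url: str, destination_url: str) -> Optional[str]:
    """Return the referrer value sent when following a link, or None if omitted."""
    src = urlsplit(current_url)
    dst = urlsplit(destination_url)

    if (src.scheme, src.netloc) == (dst.scheme, dst.netloc):
        # Same origin: the full URL is sent
        return current_url
    if src.scheme == "https" and dst.scheme != "https":
        # Downgrade from HTTPS to HTTP: nothing is sent
        return None
    # Other sites only see the origin (scheme and domain), not the path
    return f"{src.scheme}://{src.netloc}/"


page = "https://tildes.net/~tech/abc/some_topic"
print(referrer_to_send(page, "https://tildes.net/~tech"))
print(referrer_to_send(page, "https://example.com/article"))
print(referrer_to_send(page, "http://example.com/article"))
```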
This can be useful because there are often situations where an article
author sees traffic coming from a site and will come to check it out and
be able to participate in the discussion.
The site-icons spritesheet has already become unwieldy - it's almost
1MB, is mostly rarely-needed icons, and needs to be fully replaced and
re-downloaded whenever a new icon is added. With HTTP/2 now being widely
supported, spritesheets seem to be mostly obsolete, and I probably never
should have done it that way in the first place.
This commit changes over to simply using individual icon images, and
rebuilds the CSS file whenever new icons are downloaded. This new CSS
file will probably be somewhat large, but should gzip extremely well.
This probably still needs some work to support cache-busting on the CSS
file.
I've been reading a little about PostgreSQL transaction ID wraparound
today, and how it's knocked multiple companies out of commission for
days to get it resolved. It should have almost no chance of happening on
Tildes for years, but this will let me set up some monitoring for it
now, while I'm thinking about it.
For more info:
https://blog.sentry.io/2015/07/23/transaction-id-wraparound-in-postgres.html
A lot of the code in common between this and the EmbedlyScraper should
probably be generalized out to a base class soon, but let's make sure
this works first.
The monitoring server needs Redis, but not the separate server that's
used for the breached-passwords bloom filter in dev/prod. This splits
that server out to its own state, so that it doesn't need to be set up
on the monitoring server.
Some of these states were built entirely around a single-server approach
(Prometheus + monitoring being on the same server as the site), and the
files have needed modifications to work with a separate monitoring
server.
This updates the states so that it should all happen as expected in all
types of environments.