This is (extreme) overkill at this point, but I'm thinking ahead to some
of the transformations that I want to be able to support, and this
refactor should make it much more feasible to have a good process.
The main changes here are using a class-based system where each
transformation can also define a method for checking whether it should
apply to any given url (probably often based on domain), as well as a
new process that will repeatedly apply all transformations until the url
passes through all of them unchanged.
This script can be scheduled as a cronjob, and will dump the database,
compress and GPG-encrypt it, and upload to an FTP. Afterwards, it will
also delete any backups older than the specified retention periods, both
locally as well as on the FTP (with individual retention periods).
Since the check to see if a password has been present in a data breach
is using a Bloom filter, there's a small chance of false positives (I
believe it's 0.1% currently). This is confusing when it happens, so this
just clarifies that it's possible but they'll have to pick a new
password anyway.
Previously, dates were always displayed in the relative, "ago" style,
but these become pretty unwieldy for longer time spans, especially since
the ago library I'm using jumps directly from days to years, so it will
show ones like "200 days ago" that are hard to place.
This adds an "adaptive" date display method and uses it almost
everywhere instead. These will use the "ago" style for shorter periods,
and then switch to showing an absolute date for dates longer ago than
that. The threshold for the switch is currently set to 7 days.
This should be more correct overall, and in the (rare) cases where the
destination changes (due to a topic being moved to a different group or
something similar), the site is able to handle it with a 302 redirect
after the initial one from the shortener.
I did a bad job of testing how this would work, and didn't account
properly for people being considered not-logged-in while they're on
tild.es. This should fix the issues, but it re-adds the minor "data
leak" of people without an invite being able to determine some data,
such as existing users and groups. Since the site is going to be
publicly visible in the near future, I don't think this is a significant
concern.
Links aren't displayed/used anywhere yet, but this should be the basic
setup needed for a simple link-shortener on the tild.es domain.
Currently, it will support two uses:
* https://tild.es/asdf - redirect to topic with id "asdf"
* https://tild.es/~asdf - redirect to group "~asdf"
The update involved some refactoring on their end that eliminated
_LabelWrapper, so this required a minor change (but I don't know why I
was doing it that way in the first place).
When a user enables one of the user settings that causes external links
to open in new tabs, we should be adding rel="noopener" to the links as
well, for security reasons:
https://mathiasbynens.github.io/rel-noopener/
Yet another issue with Bleach 3.0's linkification: when used via the
filter as part of sanitization (which is necessary right now due to
*another* issue where it escapes valid HTML tags), it doesn't properly
linkify urls that contain an ampersand.
As a (temporary?) workaround, this stops using Bleach's linkification
entirely and switches to cmark-gfm's "autolink" extension. These aren't
perfectly equivalent, and the switch results in two other issues that I
consider more minor than links including ampersands not working:
- autolink will initially create links for ftp:// urls and email
addresses. The final sanitization will remove these links due to the
protocol whitelist, but it will leave behind a bare <a> tag. So the
text will *appear* linked but not actually link to anything. If I
decide to stick with autolink, it should be pretty straightforward to
fix this by stripping all bare <a> tags from the final HTML.
- autolink doesn't create links for bare domains. For example, writing
"example.com" won't result in a link, it's necessary to write
"www.example.com" or include a protocol like "http://example.com".
This doesn't really change anything functionally, but it gives users
specific links that they can send out that will pre-fill the invite code
on the registration page, instead of requiring them to copy-paste it or
type it in themselves.
No functional difference, but probably a little better to use default
values where possible instead of specifying particular ways of setting
it back to default (especially for things like topic_type).
This shouldn't make much difference in practice since deleted_time
should be updated automatically, but it's probably safest to make sure
that the posts are still deleted before cleaning them up, in case the
deleted_time ends up staying set somehow.
Some new fields have been added to comments and topics recently
(excerpt, original_url), but they weren't added to the cleanup script
yet. There was also no reason to keep information about whether the
posts had been edited or not.
As of version 3.0, the redis-py package no longer has a distinction
between its Redis and StrictRedis classes, and both behave the same
(StrictRedis is just an alias for Redis).
This would have continued working as-is, but we might as well switch it
back to the normal name now that StrictRedis doesn't have any benefit.
Pyramid 1.10 adds config.route_prefix_context(), which can be used like
this to organize routes under a prefix a little more easily. In the next
release it should also be possible to move the route for the prefix
itself inside the context manager with an empty pattern and an
inherit_slash=True argument.
See https://github.com/Pylons/pyramid/pull/3420
There are a few significant updates in here (Pyramid 1.10, redis 3.0,
etc.) but they don't have any major effects, so this should all be safe
and I can do some additional updates in follow-up commits.
Until now, if people want their account deleted, I've just been banning
it. This will let me do it properly, but some additional cleanup should
be added in the future once I think through what's safe to get rid of
from deleted accounts.
cmark-gfm's behavior seems to have changed when I upgraded it, and it's
now producing " entities instead of normal double-quote characters.
This was breaking syntax-highlighting, so this simply replaces all the
entities back to normal double-quotes. This is maybe a little risky to
do without doing it as part of proper HTML parsing, but I think it
should overall be pretty safe.
Saying "posts" when the user might have bookmarks of the other type
(topics/comments) wasn't quite right, so this just makes it more
specific and fixes up the related post_type code a little.
Pagination isn't working correctly for bookmarks yet, and it seems like
it'll take some fiddly work to be able to make PaginatedQuery be able to
handle using the bookmark's created_time instead of the post's.
This just simplifies the method by using an inner join on the bookmarks
instead of an EXISTS subquery, and totally removes pagination for now.
Bleach supports adding html5lib filters to the cleaning process, which
is how its linkify() process works, as well as being how I implemented
the custom Tildes linkification for usernames and group names.
This commit switches to adding those filters into the clean() call,
which makes it so everything is now done in a single pass instead of
three. In addition, this also fixes the issues we were having with
bleach "fixing" things that it thought were invalid HTML (when they were
just people writing things inside angle brackets).
Continuing on with the theme-system overhaul, this adds some more
definable colors, some more "fallback" logic, and moves Solarized colors
into a particular theme file instead of using them throughout the CSS.
Solarized colors are still used in the base theme, but all are hardcoded
and labeled as such, so that they can be spotted/replaced more easily.
This replaces the previous CSS theme system with a new one based around
using SCSS map-merge to define different "color sets" that can overwrite
the base theme to generate new themes.
The original theme system was heavily based around Solarized with only
light and dark variants, and other themes weren't able to replace many
of the colors (link colors, buttons, alerts, etc.). This new system will
make it possible to have themes that are completely unrelated to the
Solarized variants.
Of course right after I did that last update, they released two more new
versions of cmark-gfm, so here we go again.
The .18 update defaults to "safe" mode, but I want to disable that and
leave sanitization up to Bleach, so this required changing the options.