I think this is a good idea for a few reasons, including accessibility
(people who have difficulty distinguishing the link color will still be
able to recognize links).
This is a bit flimsy, but when I started looking at applying the
existing transformations to old posts, I found the Paradox forums as an
example of links that became broken after they were processed (because
"fixing" their links ends up breaking them).
This will give a way to exempt any other domains or urls that end up
being a problem, though over the long term it would probably be better
to make this database-based instead of code-based.
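
A minimal sketch of what a code-based exemption check could look like.
The EXEMPT_DOMAINS set, the is_exempt() name, and the placeholder domain
are all assumptions for illustration, not the actual implementation.

```python
from urllib.parse import urlparse

# Hypothetical exemption list; "forums.example.com" is only a placeholder.
EXEMPT_DOMAINS = {"forums.example.com"}


def is_exempt(url: str) -> bool:
    """Return True if the url's domain should skip all transformations."""
    hostname = urlparse(url).hostname or ""
    return hostname in EXEMPT_DOMAINS
```

Moving this to the database would mostly mean replacing the set lookup
with a query, so the check itself could stay the same.
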
webargs 5.0.0 changes the behavior of its @use_args / @use_kwargs
decorators so that optional/missing arguments are no longer filled in:
https://github.com/marshmallow-code/webargs/issues/342
This breaks how I'm currently doing some views with missing arguments
(such as views.topic.get_group_topics), so to be able to upgrade to 5.0
we will need to either update the views or the schemas.
When a topic tag was using the hierarchy (for something like
"science.biology"), it would set an invalid class on the tag label,
because it would include the period(s), which can't be used in CSS
classes. This fixes it so that periods will be replaced with dashes.
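
Roughly, the fix amounts to something like the following. The function
name and the "label-topic-tag-" prefix are assumptions for illustration;
only the period-to-dash replacement comes from the change itself.

```python
def tag_css_class(tag: str) -> str:
    """Build a CSS class for a tag label, replacing periods with dashes."""
    # The class prefix is hypothetical; periods aren't usable in the name.
    return "label-topic-tag-" + tag.replace(".", "-")
```
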
This is (extreme) overkill at this point, but I'm thinking ahead to some
of the transformations that I want to be able to support, and this
refactor should make it much more feasible to have a good process.
The main changes here are using a class-based system where each
transformation can also define a method for checking whether it should
apply to any given url (probably often based on domain), as well as a
new process that will repeatedly apply all transformations until the url
passes through all of them unchanged.
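
A sketch of the general shape described above, not the actual code. The
class names, the two example transformations, and the max_passes safety
limit are all assumptions made up for illustration.

```python
from abc import ABC, abstractmethod
from typing import Sequence, Type
from urllib.parse import urlparse


class UrlTransformer(ABC):
    """Base class: each transformation decides whether it applies."""

    @classmethod
    def is_applicable(cls, url: str) -> bool:
        return True  # apply to every url unless a subclass overrides this

    @classmethod
    @abstractmethod
    def apply(cls, url: str) -> str:
        """Return the transformed url."""


class DropFragment(UrlTransformer):
    """Hypothetical transformation: remove the #fragment from any url."""

    @classmethod
    def apply(cls, url: str) -> str:
        return urlparse(url)._replace(fragment="").geturl()


class ForceHttps(UrlTransformer):
    """Hypothetical domain-specific transformation for example.com."""

    @classmethod
    def is_applicable(cls, url: str) -> bool:
        return urlparse(url).hostname == "example.com"

    @classmethod
    def apply(cls, url: str) -> str:
        return urlparse(url)._replace(scheme="https").geturl()


def apply_transformations(
    url: str,
    transformers: Sequence[Type[UrlTransformer]],
    max_passes: int = 10,
) -> str:
    """Repeat all transformations until the url passes through unchanged."""
    for _ in range(max_passes):
        before = url
        for transformer in transformers:
            if transformer.is_applicable(url):
                url = transformer.apply(url)
        if url == before:
            return url
    raise RuntimeError("url did not stabilize; transformations may conflict")
```

The pass limit is there because two transformations that undo each other
would otherwise loop forever.
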
This script can be scheduled as a cronjob, and will dump the database,
compress and GPG-encrypt it, and upload it to an FTP server. Afterwards,
it will also delete any backups older than the specified retention
periods, both locally and on the FTP server (each location has its own
retention period).
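
The retention step could be sketched like this. It only covers picking
which backups have expired, and the function name and the
name-to-timestamp mapping are assumptions; the real script may track
backups differently.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_backups(
    backups: Dict[str, datetime], retention: timedelta, now: datetime
) -> List[str]:
    """Return the names of backups created before the retention cutoff."""
    cutoff = now - retention
    return sorted(name for name, created in backups.items() if created < cutoff)
```

Calling it once with the local retention period and once with the FTP
one gives the two separate deletion lists.
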
Since the check to see if a password has been present in a data breach
is using a Bloom filter, there's a small chance of false positives (I
believe it's 0.1% currently). This is confusing when it happens, so this
just clarifies to the user that a false positive is possible, but that
they'll have to pick a new password anyway.
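
For context on why false positives are unavoidable, here's a toy Bloom
filter. It is not the filter the site actually uses; the class, its
sizes, and the SHA-256-based hashing are assumptions for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting the hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True only if every bit is set, which can happen by coincidence
        # for an item that was never added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

An added item always tests as present, but an item that was never added
can also test as present if other items happened to set all of its bits.
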
Previously, dates were always displayed in the relative, "ago" style,
but these become pretty unwieldy for longer time spans, especially since
the ago library I'm using jumps directly from days to years, so it will
show values like "200 days ago" that are hard to place.
This adds an "adaptive" date display method and uses it almost
everywhere instead. It will use the "ago" style for shorter periods, and
then switch to showing an absolute date for anything further back than
that. The threshold for the switch is currently set to 7 days.
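
A minimal sketch of the adaptive behavior. The function name and the
exact output formats are assumptions, and this doesn't use the ago
library; only the 7-day threshold comes from the change itself.

```python
from datetime import datetime, timedelta


def adaptive_date(
    when: datetime, now: datetime, threshold: timedelta = timedelta(days=7)
) -> str:
    """Use "ago" style up to the threshold, an absolute date after it."""
    age = now - when
    if age > threshold:
        return f"{when:%Y-%m-%d}"  # hypothetical absolute format

    seconds = int(age.total_seconds())
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            plural = "" if count == 1 else "s"
            return f"{count} {unit}{plural} ago"
    return "just now"
```
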
This should be more correct overall, and in the (rare) cases where the
destination changes (due to a topic being moved to a different group or
something similar), the site is able to handle it with a 302 redirect
after the initial one from the shortener.
I did a bad job of testing how this would work, and didn't account
properly for people being considered not-logged-in while they're on
tild.es. This should fix the issues, but it re-adds the minor "data
leak" of people without an invite being able to determine some data,
such as existing users and groups. Since the site is going to be
publicly visible in the near future, I don't think this is a significant
concern.
Links aren't displayed/used anywhere yet, but this should be the basic
setup needed for a simple link-shortener on the tild.es domain.
Currently, it will support two uses:
* https://tild.es/asdf - redirect to topic with id "asdf"
* https://tild.es/~asdf - redirect to group "~asdf"
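
The two cases can be told apart with something like the sketch below.
The function name and the (kind, identifier) return value are
assumptions; the real setup works through the site's routing rather
than a standalone function.

```python
from typing import Tuple


def parse_short_path(path: str) -> Tuple[str, str]:
    """Classify a shortener path as a group or a topic link."""
    target = path.strip("/")
    if not target:
        raise ValueError("empty shortener path")
    if target.startswith("~"):
        return ("group", target)  # e.g. "~asdf"
    return ("topic", target)  # treated as a topic id
```
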
The update involved some refactoring on their end that eliminated
_LabelWrapper, so this required a minor change (but I don't know why I
was doing it that way in the first place).
Previously I was using pyenv to build Python, but that's mostly
unnecessary and has some other side effects (like needing to install a
lot of packages as dependencies).
This switches to using the deadsnakes PPA instead, which also has the
effect of upgrading to the most recent version of Python 3.6 (currently,
3.6.7 instead of 3.6.5).
When a user enables one of the user settings that causes external links
to open in new tabs, we should be adding rel="noopener" to the links as
well, for security reasons:
https://mathiasbynens.github.io/rel-noopener/
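
As an illustration of the intended output only: the helper below is
hypothetical, and the real change happens wherever the link attributes
are actually set.

```python
from html import escape


def external_link_html(url: str, text: str, open_in_new_tab: bool) -> str:
    """Render a link, adding rel="noopener" whenever target="_blank" is set."""
    attrs = f'href="{escape(url, quote=True)}"'
    if open_in_new_tab:
        # Without noopener, the new tab can reach back via window.opener.
        attrs += ' target="_blank" rel="noopener"'
    return f"<a {attrs}>{escape(text)}</a>"
```
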
Yet another issue with Bleach 3.0's linkification: when used via the
filter as part of sanitization (which is necessary right now due to
*another* issue where it escapes valid HTML tags), it doesn't properly
linkify urls that contain an ampersand.
As a (temporary?) workaround, this stops using Bleach's linkification
entirely and switches to cmark-gfm's "autolink" extension. These aren't
perfectly equivalent, and the switch results in two other issues that I
consider more minor than links including ampersands not working:
- autolink will initially create links for ftp:// urls and email
  addresses. The final sanitization will remove these links due to the
  protocol whitelist, but it will leave behind a bare <a> tag. So the
  text will *appear* linked but not actually link to anything. If I
  decide to stick with autolink, it should be pretty straightforward to
  fix this by stripping all bare <a> tags from the final HTML.
- autolink doesn't create links for bare domains. For example, writing
  "example.com" won't result in a link; it's necessary to write
  "www.example.com" or include a protocol like "http://example.com".
This doesn't really change anything functionally, but it gives users
specific links that they can send out that will pre-fill the invite code
on the registration page, instead of requiring them to copy-paste it or
type it in themselves.
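
A sketch of both halves of this: building the link, and reading the code
back out to pre-fill the form. The "code" parameter name, the function
names, and the example url are assumptions, not the site's actual ones.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def invite_link(registration_url: str, invite_code: str) -> str:
    """Build a registration link that carries the invite code."""
    return f"{registration_url}?{urlencode({'code': invite_code})}"


def invite_code_from_link(url: str) -> str:
    """Extract the invite code to pre-fill, or "" if there isn't one."""
    values = parse_qs(urlparse(url).query).get("code", [])
    return values[0] if values else ""
```
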
No functional difference, but probably a little better to use default
values where possible instead of specifying particular ways of setting
it back to default (especially for things like topic_type).
This shouldn't make much difference in practice since deleted_time
should be updated automatically, but it's probably safest to make sure
that the posts are still deleted before cleaning them up, in case the
deleted_time ends up staying set somehow.
Some new fields have been added to comments and topics recently
(excerpt, original_url), but they weren't added to the cleanup script
yet. There was also no reason to keep information about whether the
posts had been edited or not.
As of version 3.0, the redis-py package no longer has a distinction
between its Redis and StrictRedis classes, and both behave the same
(StrictRedis is just an alias for Redis).
This would have continued working as-is, but we might as well switch it
back to the normal name now that StrictRedis doesn't have any benefit.
Pyramid 1.10 adds config.route_prefix_context(), which can be used like
this to organize routes under a prefix a little more easily. In the next
release it should also be possible to move the route for the prefix
itself inside the context manager with an empty pattern and an
inherit_slash=True argument.
See https://github.com/Pylons/pyramid/pull/3420
There are a few significant updates in here (Pyramid 1.10, redis 3.0,
etc.) but they don't have any major effects, so this should all be safe
and I can do some additional updates in follow-up commits.
Until now, if people want their account deleted, I've just been banning
it. This will let me do it properly, but some additional cleanup should
be added in the future once I think through what's safe to get rid of
from deleted accounts.
cmark-gfm's behavior seems to have changed when I upgraded it, and it's
now producing &quot; entities instead of normal double-quote characters.
This was breaking syntax-highlighting, so this simply replaces all of
those entities with normal double-quotes. This is maybe a little risky to
do without doing it as part of proper HTML parsing, but I think it
should overall be pretty safe.
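
The replacement is essentially the following (the function name is made
up for illustration). The comment notes where the risk comes from.

```python
def restore_double_quotes(html: str) -> str:
    """Turn &quot; entities back into plain double-quote characters."""
    # Risk: this is a blind string replace, so an entity inside an
    # attribute value would also be converted and could end the
    # attribute early.
    return html.replace("&quot;", '"')
```
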