Commit Graph

61 Commits (9ea7cc70a3611395f875d54b9f4fe24c3ca5b55f)

Author SHA1 Message Date
Eugen Rochko 4dc87ffc06 Add support for structured data and more OpenGraph tags to link cards (#16938)
Save preview cards under their canonical URL

Increase max redirects to follow from 2 to 3
2021-11-05 23:23:05 +01:00
Claire 45903ae80a Fix some RedisLocks auto-releasing too fast (#16276)
* Fix Delete and Create-related locks expiring too fast

Fixes #16238

By default, RedisLock expires after 10 seconds, which may not be enough to
process statuses, especially when those have attached media files.

This commit extends those 10 seconds to 15 minutes, which should be plenty
enough to handle any status, while being short enough to not waste many
sidekiq job retries in the exceedingly rare case in which a sidekiq process
would crash when processing a `Create` or `Delete`.

* Fix other RedisLock autorelease durations

Fixes #15645

- things that only perform a few simple database queries (e.g. finding and
  saving a record) have been left unchanged, so they'll still use the default
  10s duration
- things that perform significantly more complex database queries have been
  changed to a 5-minute timeout
- things that perform multiple HTTP queries have been changed to a 15-minute
  timeout
2021-05-19 23:52:08 +02:00
Claire 53d99e7426 Fix URL scanning in note length validator and preview card fetching (#15827)
* Add tests

* Fix URL scanning in note length validator and preview card fetching
2021-03-04 00:12:26 +01:00
Claire a33f8f787a Update twitter-text from 1.14 to 3.1.0 and fix toot character counting (#15382)
* Update twitter-text from 1.14 to 3.1.0

* Disable emoji parsing

* Properly depend on twitter-text for url detection

* Fix some URLs being wrongly detected client-side

* Add test for server-side validation of non-autolinkable URLs

* Fix server-side status length counting
2021-03-02 12:02:56 +01:00
Takeshi Umeda 3830dbc6fc Fix first return value of FetchLinkCardService.html method (#15630) 2021-01-25 09:22:41 +01:00
luigi 944b059f50 Optimize map { ... }.compact calls (#15513)
* Optimize map { ... }.compact

using Enumerable#filter_map, supported since Ruby 2.7

* Add polyfill for Enumerable#filter_map
2021-01-10 00:32:01 +01:00
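
The `map { ... }.compact` to `filter_map` change described in 944b059f50 can be illustrated with a minimal sketch. The polyfill body and the sample data below are assumptions made for illustration, not the code from the commit. Note one behavioural difference: `compact` drops only `nil`, while `filter_map` drops both `nil` and `false`.

```ruby
# Polyfill sketch: define Enumerable#filter_map only when the running Ruby
# (older than 2.7) does not provide it. Illustrative, not the project's code.
unless Enumerable.method_defined?(:filter_map)
  module Enumerable
    def filter_map
      return to_enum(:filter_map) unless block_given?

      each_with_object([]) do |item, result|
        value = yield(item)
        result << value if value # keep truthy results only
      end
    end
  end
end

# Sample data (hypothetical)
urls = ['https://example.com', '', 'https://example.org']

# Before: two passes and an intermediate array that contains nils
before = urls.map { |u| u.empty? ? nil : u.upcase }.compact

# After: a single pass that skips falsy results
after = urls.filter_map { |u| u.upcase unless u.empty? }
```

Both expressions produce the same array here because the block never returns `false`.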
dependabot[bot] 61b768572e Bump rubocop from 0.86.0 to 0.88.0 (#14412)
* Bump rubocop from 0.86.0 to 0.88.0

Bumps [rubocop](https://github.com/rubocop-hq/rubocop) from 0.86.0 to 0.88.0.
- [Release notes](https://github.com/rubocop-hq/rubocop/releases)
- [Changelog](https://github.com/rubocop-hq/rubocop/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rubocop-hq/rubocop/compare/v0.86.0...v0.88.0)

Signed-off-by: dependabot[bot] <support@github.com>

* Fix for latest RuboCop

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yamagishi Kazutoshi <ykzts@desire.sh>
2020-09-01 03:04:00 +02:00
Eugen Rochko a0c1bbf583 Change User-Agent of link preview fetching service to include "Bot" (#14248)
This forces Twitter to render OpenGraph tags in the response
2020-07-07 10:55:18 +02:00
ThibG fc94f3bd12 Fix link crawler not specifying accepted content-type (#12646)
The link crawler expects HTML documents, so set the `Accept`
header accordingly.

Fixes #12618
2019-12-18 16:56:06 +01:00
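
As a rough illustration of the `Accept` header fix in fc94f3bd12, the sketch below builds a request with Ruby's standard `Net::HTTP` rather than the HTTP client the project uses. The helper name `build_crawl_request` and the exact header value `text/html` are assumptions; the commit only states that the crawler expects HTML documents.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a GET request that tells the server the client
# wants an HTML document. Hypothetical helper, for illustration only.
def build_crawl_request(url)
  Net::HTTP::Get.new(URI.parse(url), { 'Accept' => 'text/html' })
end

request = build_crawl_request('https://example.com/article')
puts request['Accept']
```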
ThibG 32ef133fa6 Fix FetchLinkCardServices crashing on a tags without a target (#12159)
* Add test for links without targets

* Fix FetchLinkCardServices crashing on a tags without a target
2019-11-21 16:04:52 +01:00
Eugen Rochko 357a2e5564 Add cache for OEmbed endpoints to avoid extra HTTP requests (#12403)
* add youtube oembed endpoint

* add check for oembed endpoint

* change unless for a more readable if

* clear blank lines

* endpoint via https

* Fix string literal in condition

* use cache for endpoints

* use cache for endpoints

* clean up and adding check

* clean up and remove redundant return

* add html check

* add false to return

* use double quotes

* use double quotes

* Clean up
2019-11-17 18:40:33 +01:00
nightpool d4ddfc486a microformat mentions can have an implicit property (#12189)
See the first example here: http://microformats.org/wiki/microformats2#hyperlinked_person
2019-10-24 22:46:15 +02:00
Eugen Rochko 67713f8b08 Remove HEAD request from fetching link previews (#12028)
It is not really necessary and we need to reduce requests
2019-10-01 04:54:10 +02:00
Eugen Rochko 386dc65671 Fix preview card image not being re-fetched even if link is re-posted (#11981)
Fix #11956
2019-09-28 01:33:16 +02:00
Eugen Rochko 6baf5099a6 Refactor fetching of remote resources (#11251) 2019-07-10 18:59:28 +02:00
Eugen Rochko fbbcbd940d Remove Atom feeds and old URLs in the form of `GET /:username/updates/:id` (#11247) 2019-07-07 16:16:51 +02:00
Daniel Aleksandersen e3dda38c8b Treat meta[property] as a space-separated list (#10604)
The @property attribute in HTML is a space-separated list of values.
This change normalizes whitespace and finds the desired value in
the list instead of requiring an exact single-value match.

More details:
https://www.ctrl.blog/entry/rdfa-socialmedia-metadata.html
2019-04-21 04:48:19 +02:00
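
The list-based matching described in e3dda38c8b can be sketched as follows. This is a simplified stand-in that works on plain hashes instead of parsed HTML nodes; the helper names `property_includes?` and `meta_content` are hypothetical and do not come from the commit.

```ruby
# True when the whitespace-separated `property` attribute contains `wanted`
# as one of its values. String#split without arguments splits on runs of
# whitespace and ignores leading and trailing whitespace.
def property_includes?(property_attr, wanted)
  return false if property_attr.nil?

  property_attr.split.include?(wanted)
end

# Return the content of the first tag whose property list contains `wanted`,
# or nil when no tag matches. Tags are modelled as hashes for this sketch.
def meta_content(tags, wanted)
  tag = tags.find { |t| property_includes?(t[:property], wanted) }
  tag && tag[:content]
end

tags = [
  { property: "dc:title   og:title\n", content: 'Example page' },
  { property: 'og:description', content: 'An example description' }
]

title = meta_content(tags, 'og:title')
```

An exact single-value comparison would miss the first tag, because its attribute carries two values and irregular whitespace.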
ThibG f76665a276 Ignore low-confidence CharlockHolmes guesses when parsing link cards (#9510)
* Add failing test for windows-1251 link cards

* Ignore low-confidence CharlockHolmes guesses

Fixes #9466

* Fix no method error when charlock holmes cannot detect charset
2018-12-17 19:19:45 +01:00
ThibG 49889c0b8b Check that twitter:player is valid before using it (#9254)
Fixes #9251
2018-11-10 20:42:04 +01:00
ThibG cfe92b50bb Fix Pleroma mentions being fetched as preview cards (#9158) 2018-10-30 15:02:24 +01:00
Eugen Rochko cf2ab9c394 Include preview cards in status entity in REST API (#9120)
* Include preview cards in status entity in REST API

* Display preview card in-stream

* Improve in-stream display of preview cards
2018-10-28 06:35:03 +01:00
ThibG 8d76db2714 Do not fetch preview card for mentioned users (#6934) 2018-10-25 18:13:19 +02:00
Eugen Rochko d3105031f8 Redesign forms, verify link ownership with rel="me" (#8703)
* Verify link ownership with rel="me"

* Add explanation about verification to UI

* Perform link verifications

* Add click-to-copy widget for verification HTML

* Redesign edit profile page

* Redesign forms

* Improve responsive design of settings pages

* Restore landing page sign-up form

* Fix typo

* Support <link> tags, add spec

* Fix links not being verified on first discovery and passive updates
2018-09-18 16:45:58 +02:00
ThibG 441238b938 Handle relative URLs when fetching OEmbed/OpenGraph cards (#8669) 2018-09-10 18:26:28 +02:00
Yamagishi Kazutoshi 1a145c6af1 Skip processing when HEAD method returns 501 (#7730) 2018-06-04 13:42:53 +02:00
Akihiko Odaki 5dadb6896b Raise Mastodon::RaceConditionError if Redis lock failed (#7511)
An explicit error allows user agents to know the error and Sidekiq to
retry.
2018-05-16 12:29:45 +02:00
Yamagishi Kazutoshi 6092325a48 Rescue Mastodon::LengthValidationError in FetchLinkCardService (#7424) 2018-05-09 08:39:08 +02:00
Eugen Rochko ca1c696dbd Slightly reduce RAM usage (#7301)
* No need to re-require sidekiq plugins, they are required via Gemfile

* Add derailed_benchmarks tool, no need to require TTY gems in Gemfile

* Replace ruby-oembed with FetchOEmbedService

Reduce startup by 45382 allocated objects

* Remove preloaded JSON-LD in favour of caching HTTP responses

Reduce boot RAM by about 6 MiB

* Fix tests

* Fix test suite by stubbing out JSON-LD contexts
2018-05-02 18:58:48 +02:00
Akihiko Odaki acece7a2e6 Validate HTTP response length while receiving (#6891)
The to_s method of HTTP::Response blocks until it has received the whole
content, no matter how big it is. This means it may waste time receiving
unacceptably large files. It may also consume memory and disk in the
process. This solves the inefficiency by checking the response length while
receiving.
2018-03-26 14:02:10 +02:00
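
The idea behind acece7a2e6, checking the length while the body is still being received, can be sketched with any chunked reader. The example below uses `StringIO` in place of a network stream; `read_with_limit`, `BodyTooLargeError`, and the chunk size are assumptions for illustration and are not the names used in the project.

```ruby
require 'stringio'

# Hypothetical error class for this sketch.
class BodyTooLargeError < StandardError; end

# Read `io` chunk by chunk and abort as soon as the accumulated body
# exceeds `limit` bytes, instead of buffering everything first.
def read_with_limit(io, limit, chunk_size: 16 * 1024)
  body = String.new # binary buffer
  while (chunk = io.read(chunk_size))
    body << chunk
    raise BodyTooLargeError, "body exceeds #{limit} bytes" if body.bytesize > limit
  end
  body
end

small = read_with_limit(StringIO.new('hello'), 10)

begin
  read_with_limit(StringIO.new('a' * 100), 10, chunk_size: 4)
  rejected = false
rescue BodyTooLargeError
  rejected = true
end
```

With a real socket, raising early lets the caller close the connection without downloading the rest of an oversized file.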
Akihiko Odaki 11c19f6cc9 Close http connection in perform method of Request class (#6889)
HTTP connections must be explicitly closed in many cases, and letting the
perform method close connections makes its callers less redundant and
prevents them from forgetting to close connections.
2018-03-24 12:49:54 +01:00
Eugen Rochko 0357e93a91 Fix #5173: Click card to embed external content (#6471) 2018-02-15 07:04:28 +01:00
abcang b209de40f4 Fix saving of oEmbed image (#6409) 2018-02-02 11:57:59 +01:00
Eugen Rochko 8c04f9417f Fix redundant HTTP request in FetchLinkCardService (#6002) 2017-12-13 12:15:28 +01:00
Akihiko Odaki d67575edea Store preview image for embedded photo in preview cards (#5986)
The preview image would be useful to embed in timeline.
2017-12-12 15:54:38 +01:00
Renato "Lond" Cerqueira 15bc3398f7 Return false if object does not respond to url (#5988)
Avoid error when the service returns a mostly valid oembed, but has no
url in it, causing a MethodError: undefined method `url'
for #<OEmbed::Response::Photo:0x000056505def9620>
2017-12-12 15:12:09 +01:00
Yamagishi Kazutoshi f54ca825c5 Ignore HEAD method if does not support (#5949) 2017-12-09 16:53:40 +01:00
Akihiko Odaki ce3989fc6a Add embed_url to preview cards (#5775) 2017-12-07 03:37:43 +01:00
abcang 4e409e629d Fixed duplicating URL of photo type of oEmbed (#5763) 2017-11-20 20:45:54 +01:00
unarist 8b2ee20dfa Don't capture scheme-less URLs in the status (#5435)
Specifically, this makes the status length calculation match the JS side.

BTW, since this pattern is used in more places than preview card fetching,
I think we should extract it (with twitter-regex?) and write tests.
2017-10-17 18:32:25 +02:00
unarist dead33f113 Fix some failure cases on FetchLinkCardService (#5347)
* If OEmbed response doesn't have a required property `type`, ignore it.
  e.g. `NoMethodError: undefined method 'type' for ...`
* If we failed to detect encoding, fallback to default behavior of Nokogiri.
  e.g. `KeyError: key not found: :encoding`
2017-10-12 12:01:32 +02:00
unarist 4acb1c73dd Improve error handling on LinkCrawlWorker (#5250)
* Improve error handling on LinkCrawlWorker

* Ignore TimeoutError and InvalidURIError too
* Record errors to debug log
* Enable dead job queue on LinkCrawlWorker

Since most acceptable errors were already ignored, only issues on our side should go to the dead job queue.

* Ignore all http gem errors
2017-10-06 20:39:08 +02:00
ふぁぼ原 c71727ca55 Enable to recognize most kinds of characters as URL paths (#4941) 2017-09-14 18:03:20 +02:00
Eugen Rochko 41d6ada41f Support OpenGraph video embeds (#4897)
* Support OpenGraph video embeds

It's not really OpenGraph, it's twitter:player property, but it's
not OEmbed so that fits. For example, this allows Twitch clips to
be displayed as embeds.

Also, fixes glitch-soc/mastodon#135

* Fix invalid OpenGraph cards being saved through attaching and
revisit URLs after 14 days
2017-09-14 04:11:36 +02:00
Eugen Rochko e9e271878e Make PreviewCard records reuseable between statuses (#4642)
* Make PreviewCard records reuseable between statuses

**Warning!** Migration truncates preview_cards table

* Allow a wider thumbnail for link preview, display it in horizontal layout (#4648)

* Delete preview cards files before truncating

* Rename old table instead of truncating it

* Add mastodon:maintenance:remove_deprecated_preview_cards

* Ignore deprecated_preview_cards in schema definition

* Fix null behaviour
2017-09-01 16:20:16 +02:00
Eugen Rochko c5fa4aba91 HTTP signatures (#4146)
* Add Request class with HTTP signature generator

Spec: https://tools.ietf.org/html/draft-cavage-http-signatures-06

* Add HTTP signature verification concern

* Add test for SignatureVerification concern

* Add basic test for Request class

* Make PuSH subscribe/unsubscribe requests use new Request class

Accidentally fix lease_seconds not being set and sent properly, and
change the new minimum subscription duration to 1 day

* Make all PuSH workers use new Request class

* Make Salmon sender use new Request class

* Make FetchLinkService use new Request class

* Make FetchAtomService use the new Request class

* Make Remotable use the new Request class

* Make ResolveRemoteAccountService use the new Request class

* Add more tests

* Allow +-30 seconds window for signed request to remain valid

* Disable time window validation for signed requests, restore 7 days
as PuSH subscription duration (which was previous default due to a bug)
2017-07-14 20:41:49 +02:00
nullkal 07024f56df Use charlock_holmes instead of nkf at FetchLinkCardService (#4080)
* Specs for language detection

* Use CharlockHolmes instead of NKF

* Correct mistakes

* Correct style

* Set hint_enc instead of falling back and strip_tags

* Improve specs

* Add dependencies
2017-07-08 22:44:31 +02:00
abcang 8041c97d52 Fix Nokogiri::HTML at FetchLinkCardService (#4072) 2017-07-05 14:54:21 +02:00
abcang 43d97dea48 Rescue exceptions caused by FetchLinkCardService (#4045) 2017-07-03 11:03:34 +02:00
ThibG 3af5774a71 Fix an error when TagManager.local_url? is called with a bad URI (#3701)
TagManager.local_url? was sometimes called with a URI with a nil host,
leading to a crash in TagManager.local_url?. This fix moves the
already-existing uri.host.blank? check in front to avoid this case.
2017-06-11 22:53:12 +02:00
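
The guard ordering described in 3af5774a71 can be sketched in plain Ruby. This is not the `TagManager` implementation: the `LOCAL_DOMAIN` constant, the comparison logic, and the rescue of `URI::InvalidURIError` are assumptions, and `nil?`/`empty?` stand in for the Rails `blank?` helper.

```ruby
require 'uri'

# Hypothetical local domain for this sketch.
LOCAL_DOMAIN = 'example.social'

# Return true only when `url` points at the local domain. The host check
# comes first, so URIs without a host never reach the comparison.
def local_url?(url)
  uri = URI.parse(url)
  host = uri.host
  return false if host.nil? || host.empty?

  host.casecmp?(LOCAL_DOMAIN)
rescue URI::InvalidURIError
  false
end

local    = local_url?('https://example.social/@alice')
hostless = local_url?('mailto:alice@example.social')
```

Here `hostless` is `false` because a `mailto:` URI parses with a nil host, which is the kind of input that would otherwise crash the comparison.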
Yamagishi Kazutoshi bd1f7d0b9c Fetch remote image using http.rb (#3114) 2017-05-18 15:43:10 +02:00