Commit Graph

15 Commits (7c6fef81a4db9bf2d51aa95ee54498eaf0f0f0fa)

Author SHA1 Message Date
Eugen Rochko 45b759a2ca Fix hashtags being split by ZWNJ character (#11821)
Fix #11761
2019-09-13 16:01:26 +02:00
Eugen Rochko 848cb2a697 Add more accurate hashtag search (#11579)
* Add more accurate hashtag search

Using ElasticSearch to index hashtags with edge n-grams and score
them by usage within the last 7 days since last activity. Only
hashtags that have been reviewed and are listable can appear in
searches, unless they match the query exactly

* Fix search analyzer dropping non-ascii characters
2019-08-18 03:45:51 +02:00
Eugen Rochko b92e18080a Change hashtag search to only return results that have trended in the past (#11448)
* Change hashtag search to only return results that have trended in the past

A way to eliminate typos and other one-off "junk" results

* Fix excluding exact matches that don't have a score

* Fix tests
2019-07-30 20:29:50 +02:00
Eugen Rochko c111bd01a4 Fix tag normalization and migration not removing duplicate tags (#11441)
Fix #11428
2019-07-29 20:40:21 +02:00
ThibG 6d5f00fdfe Disallow numeric-only hashtags (#11363)
* Add spec covering numeric-only hashtags

* Fix hashtag regex
2019-07-19 23:22:35 +02:00
Eugen Rochko 1d560713b6 Fix only one middle dot being recognized in hashtags (#11345)
Fix #10934
2019-07-18 03:02:56 +02:00
ysksn 2e5aec4479 Add a test for Tag#to_param (#5705) 2017-11-15 16:04:41 +01:00
masarakki d0a037ae79 add validation to tag name (#4194) 2017-07-14 11:02:49 +02:00
unarist ae26d7b557 Make tag search case insensitive again (#4184) 2017-07-13 19:31:33 +02:00
Eugen Rochko 8fb81e1372 Add important test for full-width hashtags (#3911) 2017-06-23 17:01:53 +02:00
unarist 117d333a84 Fix tag search order and not to use tsvector (#3611)
* Sort results by the name
* Switch search method to simple `LIKE` matching instead of tsvector/tsquery

Previously we used scores from ts_rank_cd() to sort results, but it didn't work
because the function returns same score for all results. It's not for calculate
similarity of single words. Sometimes this bug even push out exact matching tag
from results.

Additionally, PostgreSQL supports prefix searching with standard btree index.
Using it offers simpler code, but also less index size and some speed.
2017-06-06 16:07:06 +02:00
Matt Jankowski 641e809eaf Search cleanup (#1333)
* Clean up SQL output in Tag and Account search methods

* Add basic coverage for Tag.search_for

* Add coverage for Account.search_for

* Add coverage for Account.advanced_search_for
2017-04-09 14:45:01 +02:00
Eugen Rochko 9adf6e8736 Fix wrongful matching of last period in extended usernames
Fix anchor tags in some wikipedia URLs being matches as a hashtag
2017-03-05 18:08:19 +01:00
Eugen Rochko 1b61e404b4 Localizations for most server-side strings 2016-11-16 00:55:33 +01:00
Eugen Rochko 082e57fc13 Adding hashtag model 2016-11-04 19:12:59 +01:00