Synonyms need to be indexed in ascending order,
and the new normalization step for synonyms can change this order,
which breaks the indexation process,
because "Harry Potter" > "HP" but "harry potter" < "hp".
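The ordering flip can be reproduced with plain byte-wise string comparison. The sketch below is only an illustration: `normalize` is a hypothetical stand-in that lowercases, and re-sorting after normalization is one possible remedy, not necessarily what this commit does.

```rust
// Hypothetical stand-in for the real normalization step (lowercasing only).
fn normalize(s: &str) -> String {
    s.to_lowercase()
}

// Normalize every synonym key, then sort again so the keys are ascending
// after normalization. Duplicates created by normalization are removed.
fn normalized_sorted(keys: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = keys.iter().map(|k| normalize(k)).collect();
    out.sort();
    out.dedup();
    out
}

fn main() {
    // Byte-wise comparison: 'a' (0x61) > 'P' (0x50), but 'a' < 'p' (0x70).
    assert!("Harry Potter" > "HP");
    assert!(normalize("Harry Potter") < normalize("HP"));

    // Ascending before normalization, no longer ascending after it.
    let keys = ["HP", "Harry Potter"];
    let sorted = normalized_sorted(&keys);
    println!("{:?}", sorted);
}
```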
1172: Fix atomic snapshot creation r=MarinPostma a=raszi
Compress gzip files to a temporary file first and then do an atomic rename.
In our setup, an indexer takes snapshots for the instances serving the requests. Because the snapshotting mechanism currently replaces the file in place, the indexer cannot share the snapshot with a live instance.
With this small patch, we first create a new temporary file in the same directory as the snapshot dir and then do an atomic rename, so the snapshot path always contains a valid snapshot.
After applying this change, simply restarting the serving instances is enough to pick up the new snapshot from shared storage, without worrying that they will die because of an incomplete snapshot.
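A minimal sketch of the write-then-rename pattern, using only the standard library. The function name `write_atomically` and the `.tmp` suffix are assumptions made for illustration, and the gzip compression step is left out; this is not the code from the patch.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

// Write `bytes` to a temporary file next to `dest`, then rename it into place.
// A rename within the same filesystem replaces `dest` in one step, so readers
// see either the old file or the complete new one.
fn write_atomically(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // Hypothetical naming scheme for the temporary file.
    let mut tmp_name = dest.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?; // make sure the data is on disk before exposing it
    drop(file);

    fs::rename(&tmp, dest)
}

fn main() -> io::Result<()> {
    let dest = std::env::temp_dir().join("example.snapshot");
    write_atomically(&dest, b"snapshot contents")?;
    assert_eq!(fs::read(&dest)?, b"snapshot contents");
    fs::remove_file(&dest)
}
```

Keeping the temporary file in the same directory as the destination matters: a rename across filesystems is not atomic and may fail outright.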
Co-authored-by: KARASZI István <ikaraszi@gmail.com>
1176: fix race condition in document addition r=Kerollmops a=MarinPostma
As described in #1160, there was a race condition when updating settings and adding documents simultaneously. This was due to the schema being updated and the document addition being processed in two different transactions. This PR moves the schema update logic for the primary key into the same transaction as the document addition, while keeping the input checks for the validity of the primary key in the HTTP route, in order not to break the error reporting for the document addition route.
close #1160.
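The idea can be sketched with an in-memory model in which a mutex guard plays the role of the write transaction. `Index` and `add_documents` are hypothetical names for illustration only, not the engine's real types or API.

```rust
use std::sync::Mutex;

// Hypothetical in-memory stand-in for the index.
#[derive(Default)]
struct Index {
    primary_key: Option<String>,
    documents: Vec<String>,
}

// The primary key update and the document addition happen under one guard,
// so no other writer can change the schema between the two steps.
fn add_documents(index: &Mutex<Index>, primary_key: &str, docs: &[&str]) -> Result<(), String> {
    let mut txn = index.lock().unwrap();

    match txn.primary_key.clone() {
        Some(existing) if existing != primary_key => {
            return Err(format!("primary key is already `{}`", existing));
        }
        Some(_) => {}
        None => txn.primary_key = Some(primary_key.to_string()),
    }

    txn.documents.extend(docs.iter().map(|d| d.to_string()));
    Ok(())
}

fn main() {
    let index = Mutex::new(Index::default());
    add_documents(&index, "id", &["doc-1", "doc-2"]).unwrap();
    // A conflicting primary key is rejected and no document is added.
    assert!(add_documents(&index, "uuid", &["doc-3"]).is_err());
    println!("{} documents", index.lock().unwrap().documents.len());
}
```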
Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: marin <postma.marin@protonmail.com>
Add the `update_id` to the updates. The rationale is the
following:
- It allows for better traceability of the update events, thus improved
debugging and logging.
- The engine is now aware of what it has already processed, and can return
it if asked. It may not make sense now, but in the future, the update
store may not work the same way, and this information about the state
of the engine will be desirable (distributed environment).
1184: normalize synonyms during indexation r=MarinPostma a=LegendreM
fix #1135 #964
Normalizes the synonyms before indexing them, so they are no longer case sensitive. The normalization also involves deunicoding in some cases, such as accents, so `été` and `ete` are considered equivalent in a search for synonyms.
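A rough sketch of what such a normalization does. `strip_accent` is a hypothetical and deliberately tiny stand-in for a real deunicoding step: it covers only a few precomposed Latin letters and is not the routine used by the engine.

```rust
// Hypothetical, minimal replacement for a real deunicoding step.
fn strip_accent(c: char) -> char {
    match c {
        'à' | 'â' | 'ä' => 'a',
        'é' | 'è' | 'ê' | 'ë' => 'e',
        'î' | 'ï' => 'i',
        'ô' | 'ö' => 'o',
        'ù' | 'û' | 'ü' => 'u',
        'ç' => 'c',
        other => other,
    }
}

// Lowercase first, then remove the accents handled above.
fn normalize(word: &str) -> String {
    word.chars()
        .flat_map(|c| c.to_lowercase())
        .map(strip_accent)
        .collect()
}

fn main() {
    // Both spellings end up under the same normalized key.
    assert_eq!(normalize("Été"), normalize("ete"));
    println!("{}", normalize("Été"));
}
```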
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
1174: Limit query words number r=MarinPostma a=MarinPostma
This PR adds a limit to the number of words taken into account in a search query. Query strings that are too long lead to huge performance hits and resource consumption, which occasionally crashes the machine. The limit is hard-coded to 10, and tests have been added to make sure that it is taken into account.
close #941
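The effect of the limit can be sketched as follows. Splitting on whitespace is an assumption made to keep the example short (the real tokenizer is more involved), and `MAX_QUERY_WORDS` and `limited_query_words` are hypothetical names.

```rust
// The limit stated in the PR description; the constant name is hypothetical.
const MAX_QUERY_WORDS: usize = 10;

// Keep only the first MAX_QUERY_WORDS words; the rest of the query is ignored.
fn limited_query_words(query: &str) -> Vec<&str> {
    query.split_whitespace().take(MAX_QUERY_WORDS).collect()
}

fn main() {
    let query = "one two three four five six seven eight nine ten eleven twelve";
    let words = limited_query_words(query);
    assert_eq!(words.len(), MAX_QUERY_WORDS);
    println!("{:?}", words);
}
```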
Co-authored-by: mpostma <postma.marin@protonmail.com>
1207: fix homebrew name r=MarinPostma a=fharper
`brew` is the command; the package manager's name is Homebrew.
Co-authored-by: Frédéric Harper <hi@fred.dev>