1726 Commits

Author SHA1 Message Date
Clément Renault
883a8109c8
Show both database and documents database sizes 2020-08-10 14:37:18 +02:00
Clément Renault
a4e0f3f724
Remove the useless TransitiveArc from the serve binary 2020-08-10 14:06:27 +02:00
Clément Renault
edc06a97d6
Remove the useless stats binary 2020-08-10 13:55:02 +02:00
Clément Renault
ae77fe5a69
Introduce an option to specify the maximum database size 2020-08-10 13:53:53 +02:00
Clément Renault
394844062f
Move the documents MTBL database inside the Index 2020-08-10 13:47:19 +02:00
Clément Renault
ecd2b2f217
Make the final merge done in parallel 2020-08-07 15:44:04 +02:00
Clément Renault
91282c8b6a
Move the documents into another file 2020-08-07 13:11:31 +02:00
Clément Renault
fae694a102
Put the documents into an MTBL database 2020-08-07 12:14:40 +02:00
Clément Renault
d5a356902a
Update oxidized-mtbl 2020-08-07 12:14:03 +02:00
Clément Renault
405a71d3a4
Accept csv from stdin 2020-08-06 13:38:21 +02:00
Clément Renault
d3b1096510
Compute the word attribute postings lists on each thread 2020-08-06 11:50:27 +02:00
Clément Renault
8d734941af
Clean up some lines 2020-08-06 10:20:26 +02:00
Clément Renault
a4e3c7c37c
Force the Papa parse delimiter 2020-08-05 14:11:46 +02:00
Clément Renault
6508d497ce
Replace the regex highlighting by a simple algorithm 2020-08-05 13:52:27 +02:00
Clément Renault
4873abe145
Introduce option flags to toggle the indexing engine 2020-08-05 12:10:41 +02:00
Clément Renault
bd4b18541c
Introduce a new indexer which uses an MTBL sorter 2020-08-04 15:44:37 +02:00
Clément Renault
3f21760d56
Update README.md 2020-08-04 15:40:37 +02:00
Clément Renault
bc3a0ac6a3
Display the milli logo and update the description 2020-08-04 15:40:02 +02:00
Kerollmops
d7d8f38fb7
Update bulma to spread the logo more 2020-07-16 23:45:02 +02:00
Kerollmops
ee305c9284
Replace the title by the milli logo 2020-07-15 23:55:28 +02:00
Kerollmops
9ade00e27b
Highlight all the matching words 2020-07-14 11:53:21 +02:00
Kerollmops
085c376655
Use the regex crate to highlight "hello" 2020-07-14 11:28:40 +02:00
Kerollmops
dd385ad05b
Customize the mark tag css 2020-07-14 11:03:21 +02:00
Kerollmops
aa92311d4e
Add a dark theme to the dashboard 2020-07-13 23:51:41 +02:00
Kerollmops
3d144e62c4
Search for best proximities in multiple attributes 2020-07-13 19:06:56 +02:00
Kerollmops
576dd011a1
Compute the candidates but not by attribute 2020-07-13 18:16:05 +02:00
Kerollmops
6b14b20369
Introduce a method to retrieve the number of attributes of the documents 2020-07-13 17:50:16 +02:00
Kerollmops
54afec58a3
Add a fade in/out animation while the server processes 2020-07-12 11:34:48 +02:00
Kerollmops
92c2b1dd2d
Refine the help message of the binaries 2020-07-12 11:06:45 +02:00
Kerollmops
f757df5dfd
Introduce the stderr logger to the project 2020-07-12 11:04:35 +02:00
Kerollmops
12358476da
Use the log crate instead of stderr 2020-07-12 10:55:09 +02:00
Kerollmops
2c62eeea3c
Rename the project milli 2020-07-12 00:16:41 +02:00
Kerollmops
d31da26a51
Avoid cloning RoaringBitmaps when unnecessary 2020-07-11 23:51:32 +02:00
Kerollmops
b8a1fc0126
Clean up the CSS style custom bulma rules 2020-07-11 14:51:59 +02:00
Kerollmops
f6eae91c7d
Pretty print the new dashboard numbers 2020-07-11 14:17:37 +02:00
Kerollmops
d44428fa90
Display more information on the dashboard 2020-07-11 11:51:56 +02:00
Kerollmops
11c7fef80a
Implement a memory dumper
It moves the in-memory HashMaps used when indexing to a disk-based MTBL file
2020-07-07 16:48:49 +02:00
Kerollmops
b12bfcb03b
Reduce the depth of the word position document ids
This helps reduce the number of allocations.
2020-07-07 12:30:05 +02:00
Kerollmops
7178b6c2c4
First basic version using MTBL again 2020-07-07 11:32:33 +02:00
Kerollmops
45d0d7c3d4
Clean up the README 2020-07-06 17:38:22 +02:00
Kerollmops
adb1038b26
Add a jobs parameter to set the number of threads the indexer uses 2020-07-06 12:17:17 +02:00
Kerollmops
2a3b03138b
Use heed 0.8.1 with the RwIter append method 2020-07-05 19:50:28 +02:00
Kerollmops
ec1023e790
Intersect document ids by inverse popularity of the words
This reduces our worst request ("the best of the do") from 56s to 3s.
2020-07-05 19:33:51 +02:00
Kerollmops
cd7e64b2b3
Allow users to set the arc cache size when indexing 2020-07-04 18:12:41 +02:00
Kerollmops
ac8353a64f
Merge pre-computed word attribute document ids 2020-07-04 17:02:27 +02:00
Kerollmops
fea7cac206
Display the time it took to compute the word attribute document ids 2020-07-04 15:18:38 +02:00
Kerollmops
46ced5c828
Introduce the RwIter append heed API 2020-07-04 12:34:10 +02:00
Kerollmops
7e7440c431
Finalize the LMDB indexing design 2020-07-01 22:45:43 +02:00
Kerollmops
2ae3f40971
Make the indexer ignore certain words
This prepares for making the indexing fully parallel: each thread's indexer
is only aware of certain words, which avoids postings list conflicts between
threads for any given word.
2020-07-01 17:49:46 +02:00
Kerollmops
a3ac2623d5
Introduce multiple functions to clean up the code 2020-07-01 17:24:55 +02:00