28 commits

Author SHA1 Message Date
abp
c33591eaca instantiate and import spam classifier lazily
Co-authored-by: das <das@tutao.de>
2025-11-18 17:10:44 +01:00
map
5293be6a4a
Implement spam training data sync and add TutanotaModelV98
We sync the spam training data encrypted through our server to make
sure that all clients for a specific user behave the same when
classifying mails. Additionally, this enables the spam classification
in the webApp. We compress the training data vectors
(see clientSpamTrainingDatum) before uploading to our server using
SparseVectorCompressor.ts. When a user has the ClientSpamClassification
enabled, the spam training data sync will happen for every mail
received.

ClientSpamTrainingDatum are not stored in the CacheStorage.
No entityEvents are emitted for this type.
However, we retrieve creations and updates for ClientSpamTrainingData
through the modifiedClientSpamTrainingDataIndex.

We calculate a threshold per classifier based on the dataset's ham-to-spam
ratio. We also subsample our training data to cap the ham-to-spam ratio
within a certain limit.

Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
Co-authored-by: das <das@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: sug <sug@tutao.de>
Co-authored-by: nif <nif@tutao.de>
Co-authored-by: map <mpfau@users.noreply.github.com>
2025-11-18 13:56:19 +01:00
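
The subsampling mentioned in 5293be6a4a can be illustrated with a minimal sketch. The names `Datum`, `subsampleHam`, and `maxHamPerSpam` are hypothetical and do not come from the commit, which states neither the actual limit nor how samples are selected. The sketch assumes spam is the minority class and keeps it in full.

```typescript
// Hypothetical shape of a training datum; the real type is not shown in the commit.
interface Datum {
    id: string
    isSpam: boolean
}

/**
 * Cap the ham-to-spam ratio by keeping at most `maxHamPerSpam` ham samples
 * per spam sample. Assumption: all spam samples are kept.
 */
function subsampleHam(data: readonly Datum[], maxHamPerSpam: number): Datum[] {
    const spam = data.filter((d) => d.isSpam)
    const ham = data.filter((d) => !d.isSpam)
    // With no spam at all, fall back to keeping `maxHamPerSpam` ham samples.
    const maxHam = Math.max(1, spam.length) * maxHamPerSpam
    // Keeps the first `maxHam` ham samples so the result is deterministic;
    // a real implementation might sample randomly instead.
    return [...spam, ...ham.slice(0, maxHam)]
}
```

Taking a prefix rather than a random sample keeps the example easy to verify; it is not meant to reflect how the client chooses which ham mails to drop.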
sug
f11e59672e
improve inbox rule handling and run spam prediction after inbox rules
Instead of applying inbox rules based on the unread mail state in the
inbox folder, we introduce the new ProcessingState enum on
the mail type. If a mail has been processed by the leader client, which
is checking for matching inbox rules, the ProcessingState is
updated. If there is a matching rule the flag is updated through the
MoveMailService, if there is no matching rule, the flag is updated
using the ClientClassifierResultService. Both requests are
throttled / debounced. After processing inbox rules, spam prediction
is conducted for mails that have not yet been moved by an inbox rule.
The ProcessingState for not matching ham mails is also updated using
the ClientClassifierResultService.

This new inbox rule handling solves the following two problems:
 - when clicking on a notification, it could still happen
   that the inbox rules were not applied
 - when the inbox folder had a lot of unread mails, the loading time
   increased massively, since inbox rules were re-applied on every load

Co-authored-by: amm <amm@tutao.de>
Co-authored-by: Nick <nif@tutao.de>
Co-authored-by: das <das@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
2025-10-22 09:40:45 +02:00
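
The ordering described in f11e59672e (inbox rules first, spam prediction only for mails that no rule matched) can be sketched as follows. The enum members, `IncomingMail`, `Rule`, `SpamPredictor`, and `processMail` are hypothetical: the commit names the ProcessingState enum but not its values. The leader-client check and the throttled service calls are left out.

```typescript
// Hypothetical members; the commit only names the enum itself.
enum ProcessingState {
    NotProcessed,
    InboxRuleApplied,
    ClassifiedAsSpam,
    ClassifiedAsHam,
}

interface IncomingMail {
    subject: string
    state: ProcessingState
}

type Rule = (mail: IncomingMail) => boolean
type SpamPredictor = (mail: IncomingMail) => boolean

function processMail(mail: IncomingMail, rules: readonly Rule[], isSpam: SpamPredictor): ProcessingState {
    // Mails that were already processed are skipped, so rules are not
    // re-applied every time the folder is loaded.
    if (mail.state !== ProcessingState.NotProcessed) {
        return mail.state
    }
    let next: ProcessingState
    if (rules.some((rule) => rule(mail))) {
        // A matching inbox rule wins; no spam prediction is run.
        next = ProcessingState.InboxRuleApplied
    } else {
        // Spam prediction only runs for mails that no rule matched.
        next = isSpam(mail) ? ProcessingState.ClassifiedAsSpam : ProcessingState.ClassifiedAsHam
    }
    mail.state = next
    return next
}
```

Keying the decision on an explicit state instead of the unread flag is what makes the second call for the same mail a no-op.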
das
fd22294a18
[antispam] Add client-side local spam filtering
Implement a local machine learning model for client-side spam filtering.
The local model is implemented using tensorflow "LayersModel" to train
separate models in all available mailboxes, resulting in one model
per ownerGroup (i.e. mailbox).

Initially, the training data is aggregated from the last 30 days of
received mails, and the data is stored in a separate offline database
table named spam_classification_training_data. The trained model is
stored in the table spam_classification_model. The initial training
starts after indexing, with periodic training happening
every 30 minutes and on each subsequent login.

The model will predict on incoming mails once we have received the
entity event for said mail, moving it to either inbox or spam folder.
When users move mails, we update the training data labels accordingly,
by adjusting the isSpam classification and isSpamConfidence values in
the offline database. The MoveMailService now contains a moveReason,
which indicates that the mail has been moved by our spam filter.

Client-side spam filtering can be activated using the
SpamClientClassification feature flag, and is for now only
available on the desktop client.

Co-authored-by: sug <sug@tutao.de>
Co-authored-by: kib <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: frm <frm@tutao.de>
Co-authored-by: das <das@tutao.de>
Co-authored-by: nif <nif@tutao.de>
Co-authored-by: amm <amm@tutao.de>
2025-10-22 09:25:20 +02:00
ivk
83a81c31b8 Fix incorrect imports from @tutao/tutanota-utils 2025-09-30 16:40:14 +02:00
abp
f44614517d
update range in storage only with instances without decryption errors
Co-authored-by: das <das@tutao.de>
Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
Co-authored-by: abp <abp@tutao.de>
2025-08-14 14:54:10 +02:00
abp
66a4bd0298
do not put instance into cache if it has errors
Co-authored-by: das <das@tutao.de>
2025-08-14 14:54:09 +02:00
abp
b61d6c2abd
do not throw when attempting to put instances with errors to cache 2025-07-29 10:09:23 +02:00
abp
fbceb21eb4
do not throw ProgrammingError in case of patch mismatch for File type
We do not throw a ProgrammingError when there is a patch mismatch for
type tutanota/File. Additionally, we are not storing instances with
errors to the offline storage and are throwing a ProgrammingError.
2025-07-24 12:28:16 +02:00
map
b81b17b4b9
write multiple instances at once to the offline db 2025-07-02 14:00:11 +02:00
ivk
ae9226204b Resolve circular dependency for DefaultEntityRestCache 2025-06-04 11:18:48 +02:00
bir
0041497d23 Ensure events are not missed
Have indexer apply updates before batch is completed
to ensure that we have an up-to-date index (or we will
if updating fails and we retry).

Rename CustomCacheHandler's methods to avoid confusion.

Close #8902

Co-authored-by: paw <paw-hub@users.noreply.github.com>
2025-06-04 10:36:46 +02:00
ivk
86d5775e16 Add SQLite search on clients where offline storage is available
- Introduce a separate Indexer for SQLite using FTS5
- Split search backends and use the right one based on client (IndexedDB
  for Browser, and OfflineStorage everywhere else)
- Split SearchFacade into two implementations
- Adds a table for storing unindexed metadata for mails
- Escape special characters for SQLite search
  FTS5 query syntax treats some characters specially, so they need to
  be escaped. However, simply surrounding each token in quotes is
  sufficient to do this.
  See section 3.1 "FTS5 Strings" here: https://www.sqlite.org/fts5.html
  which states that a string may be specified by surrounding it in
  quotes, and that special string requirements only exist for strings
  that are not in quotes.
- Add EncryptedDbWrapper
- Simplify out of sync logic in IndexedDbIndexer
- Fix deadlock when initializing IndexedDbIndexer
- Cleanup indexedDb index when migrating to offline storage index
- Pass contactSuggestionFacade to IndexedDbSearchFacade
    The only suggestion facade used by IndexedDbSearchFacade was the
    contact suggestion facade. So we made it clearer.
- Remove IndexerCore stats
- Split custom cache handlers into separate files
  We were already doing this with user, so we should do this with the
  other entity types.

- Rewrite IndexedDb tests
- Add OfflineStorage indexer tests
- Add custom cache handlers tests to OfflineStorageTest
- Add tests for custom cache handlers with ephemeral storage
- Use dbStub instead of dbMock in IndexedDbIndexerTest
- Replace spy with testdouble in IndexedDbIndexerTest

Close #8550

Co-authored-by: ivk <ivk@tutao.de>
Co-authored-by: paw <paw-hub@users.noreply.github.com>
Co-authored-by: wrd <wrd@tutao.de>
Co-authored-by: bir <bir@tutao.de>
Co-authored-by: hrb-hub <hrb-hub@users.noreply.github.com>
2025-06-04 10:36:46 +02:00
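
The quoting approach described in 86d5775e16 can be sketched as below. The function name `quoteForFts5` and the whitespace-based tokenization are assumptions made for illustration; the commit does not show how the client splits the query. Doubling an embedded double quote is the SQL-style escape that the FTS5 documentation describes for quoted strings.

```typescript
/**
 * Turn free-text input into an FTS5 query in which every token is a quoted
 * string, so characters such as `-`, `*` or `:` lose their special meaning.
 * Assumption: tokens are separated by whitespace.
 */
function quoteForFts5(input: string): string {
    return input
        .split(/\s+/)
        .filter((token) => token.length > 0)
        // Inside a quoted FTS5 string, a double quote is escaped by doubling it.
        .map((token) => `"${token.replace(/"/g, '""')}"`)
        .join(" ")
}
```

Because every token is quoted, words such as `AND` or `NOT` typed by the user are matched as plain terms rather than interpreted as operators.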
ivk
9e31ee0409 Inject type model resolvers
Passing instances explicitly avoids the situations where some of them
might not be initialized.

We also simplified the entity handling by converting entity updates to
data with resolved types early so that the listening code doesn't have
to deal with it.

We did fix some of the bad test practices, e.g. setting/restoring env
incorrectly. This matters now because accessors for type model
initializers check env.mode.

Co-authored-by: paw <paw-hub@users.noreply.github.com>
2025-05-27 14:52:44 +02:00
abp
0ecba0aa6e remove locks from the range database
Co-authored-by: map <mpfau@users.noreply.github.com>
2025-05-27 14:37:09 +02:00
map
ec92701e41 support tracing deadlocks / unresolved promises
use factory method instead of Promise constructor and log
if a promise is not fulfilled or rejected within 60s.
2025-05-27 14:37:09 +02:00
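
A minimal sketch of the idea in ec92701e41, assuming a factory that wraps the Promise constructor. The name `newTracedPromise`, its parameters, and the log format are hypothetical; only the 60s default comes from the commit message.

```typescript
/**
 * Hypothetical factory used in place of `new Promise(...)`.
 * Logs a message if the promise is still pending after `timeoutMs`.
 */
function newTracedPromise<T>(
    label: string,
    executor: (resolve: (value: T | PromiseLike<T>) => void, reject: (reason?: unknown) => void) => void,
    timeoutMs: number = 60_000,
    log: (message: string) => void = console.warn,
): Promise<T> {
    // Capture the stack at creation time so the log points at the caller.
    const creationStack = new Error().stack
    const promise = new Promise<T>(executor)
    const timer = setTimeout(() => {
        log(`promise "${label}" still pending after ${timeoutMs}ms\n${creationStack}`)
    }, timeoutMs)
    // Cancel the warning as soon as the promise is fulfilled or rejected.
    const cancel = () => clearTimeout(timer)
    promise.then(cancel, cancel)
    return promise
}
```

Routing promise creation through one factory gives a single place to attach the timer, which a bare `new Promise(...)` call scattered across the codebase would not.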
mac-github
279278179b store ServerModelParsedInstances in offline db
Co-authored-by: nig <nig@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: das <das@tutao.de>
2025-05-06 18:45:27 +02:00
abp
f398d8ef6f
fix incorrect model usage when reading from the cache storage
We change the usages of resolveServerTypeReference to its client
counterpart in all cache functions except for put, as using the server
type reference only makes sense when putting parsed entities in the
cache storage.

Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
2025-04-29 14:13:13 +02:00
Kinan
edbf281b88
switch to typeIds and attrIds, add SystemMV126, TutanotaMV86, BaseMV2
Refactor our instance deserialization/serialization pipeline, both on
TypeScript and on Rust [sdk] to use typeId and attributeIds instead of
typeNames and attributeNames. We furthermore ignore cardinalities
on associations until the instance layer and always
store associations as arrays. This commit introduces **eventual
consistency** on the client, i.e. we are from now on always storing data
in the newest schema format (activeApplicationVersionsForWritingSum)
which ensures that all data is already available on the client after
updating the client to a newer version. This removes the need for
offline migrations on the client and also removes backward migrations
on the server. Furthermore, the server model types are now available
on the client, retrievable through the ApplicationTypesFacade. This is
our first step towards FastSync.

Co-authored-by: nig <nig@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: sug <sug@tutao.de>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
2025-04-28 12:44:35 +02:00
paw
8c1ed56072
update eslint to v9
Co-authored-by: paw-hub <paw-hub@users.noreply.github.com>
2025-01-09 14:55:11 +01:00
nig
cf42135be8 add mail import
eml and mbox file import in desktop client
pause/resume functionality
chunking for mails and attachments to reduce server requests
testing for imap import source

wisely designed in the alps

Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: kitsugo <hayashi.jiro@kitsugo.com>
Co-authored-by: nif <nif@tutao.de>
Co-authored-by: sug <sug@tutao.de>
2025-01-02 16:52:08 +01:00
sug
dfe1dddb95 improve mail set migration performance
to avoid excessive entity updates and inconsistent offline storages,
we don't send entity updates for each mail set migrated mail.
instead we detect the mail set migration for each folder and drop
its whole mail list from the offline cache.

we could fix up the database when we receive the changed folder,
but that would involve re-doing the migration locally and will
lead to very long entity event handling that might get interrupted,
breaking the offline database anyway.
2024-09-09 15:54:04 +02:00
jhm
2b87bef35a align customId handling of EphemeralCacheStorage with OfflineStorage
To sort customIds within the offline database we store customIds
as base64Ext id strings in the offline storage. We want to align
the EphemeralCacheStorage to behave the same, and likewise store
customIds with base64Ext encoding.
2024-08-29 10:15:08 +02:00
map
b0dd4225e3 delete MailSetEntries from the offline database in clearExcludedData #7429
additionally, add test for storing instance with CustomId OfflineStorage
2024-08-27 17:16:27 +02:00
nig
9fc3669a61 make OfflineStorage use base64Ext for storing customIds #7429
we now store custom id entities in the offline storage, which means we need to
make sure storing ranges and comparing ids works for them. in order to achieve
that, we decided to store the normally base64Url-encoded, not lexicographically
sortable ids in the sortable base64Ext format.

the Offline Storage needs to use the "converted" base64Ext ids internally everywhere
for custom id types, but give out ranges and entities in the "raw" base64Url format and
take raw ids as parameters.

to make this easier, we implement the conversion in the public CacheStorage::getRangeForList
implementation and use the private OfflineStorage::getRange method internally.
2024-08-27 17:16:27 +02:00
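
The conversion described in 9fc3669a61 can be sketched as a character-by-character translation between two alphabets. This is a sketch under assumptions: the sortable alphabet is taken to be the 64 URL-safe characters in ascending ASCII order, ids are assumed to be unpadded, and the names `BASE64_URL`, `BASE64_EXT`, `translate`, `base64UrlToExt`, and `base64ExtToUrl` are made up for illustration.

```typescript
// URL-safe base64 alphabet (RFC 4648), in index order 0..63.
const BASE64_URL = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
// Assumed sortable alphabet: the same 64 characters in ascending ASCII order.
const BASE64_EXT = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

// Replace each character by the character with the same index in the other alphabet.
function translate(id: string, from: string, to: string): string {
    let out = ""
    for (let i = 0; i < id.length; i++) {
        const index = from.indexOf(id.charAt(i))
        if (index === -1) {
            throw new Error(`unexpected character "${id.charAt(i)}" in id`)
        }
        out += to.charAt(index)
    }
    return out
}

// "Raw" id as handed to and from callers -> "converted" id used for storage.
const base64UrlToExt = (id: string): string => translate(id, BASE64_URL, BASE64_EXT)
// "Converted" id from storage -> "raw" id for callers.
const base64ExtToUrl = (id: string): string => translate(id, BASE64_EXT, BASE64_URL)
```

In the URL-safe alphabet the digit `9` (index 61) sorts before the letter `a` (index 26) as a string, although it encodes the larger value. After translation the two become `x` and `P`, so plain string comparison agrees with the encoded values for ids of equal length.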
map
2d24bab6f9 MailSet support (static mail listIds)
In order to allow importing of mails we replace legacy MailFolders
(non-static mail listIds) with new MailSets (static mail listIds).
From now on, mails have static mail listIds and static mail elementIds.
To move mails between new MailSets we introduce MailSetEntries
("entries" property on a MailSet), which are index entries sorted by
the received date of the referenced mails (customId). This commit adds
support for new MailSets, while still supporting legacy MailFolders
(mail lists) to support migrating gradually.

* TutanotaModelV74 adds:
  * MailSet support
  * and defaultAlarmList on GroupSettings

* SystemModelV107 adds model changes for counter (unread mails) updates

* Adapt mail list to show MailSet and legacy mails
  The list model is now largely unaware of listIds since it can
  display mails from multiple MailBags. MailBags are static mailLists
  from which a mail is only removed when the mail is permanently
  deleted.

* Adapt offline storage for mail sets
  Offline storage gained the ability to provide cached entities
  from a list of ids.
2024-08-20 16:19:58 +02:00
nig
8cba52717a
remove BlobToFileMapping catch guards
by now, every EntityEventUpdate that was referencing this
type has expired.

fixes 1f331696cf
fixes 1919cee2f5
2024-07-31 09:25:20 +02:00
wrd
8ab3b14edd Move files to new folder structure
Co-authored-by: @rih-tutao
2024-07-26 16:42:13 +02:00
Renamed from src/api/worker/rest/EphemeralCacheStorage.ts