Instead of applying inbox rules based on the unread mail state in the
inbox folder, we introduce a new ProcessingState enum on the mail type.
Once a mail has been processed by the leader client, which checks for
matching inbox rules, its ProcessingState is updated: if a rule
matches, the flag is updated through the MoveMailService; if no rule
matches, it is updated using the ClientClassifierResultService. Both
requests are throttled / debounced. After processing inbox rules, spam
prediction is run for mails that have not yet been moved by an inbox
rule. The ProcessingState of non-matching ham mails is likewise updated
using the ClientClassifierResultService.
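A minimal sketch of how the debounced state update could look (the enum
values and the batching helper here are illustrative assumptions, not
the shipped implementation):

```ts
// Hypothetical sketch: actual enum values and service wiring may differ.
enum ProcessingState {
	NotProcessed = "0",
	InboxRuleApplied = "1", // set through MoveMailService
	NoMatchingRule = "2", // set through ClientClassifierResultService
}

type IdTuple = [listId: string, elementId: string]

// Collect mail ids and flush them in one request after a quiet period, so a
// burst of incoming mails produces a single throttled service call.
class DebouncedStateUpdater {
	private pending: IdTuple[] = []
	private timer: ReturnType<typeof setTimeout> | null = null

	constructor(
		private readonly delayMs: number,
		private readonly flush: (mailIds: IdTuple[]) => Promise<void>,
	) {}

	enqueue(mailId: IdTuple) {
		this.pending.push(mailId)
		if (this.timer != null) clearTimeout(this.timer)
		this.timer = setTimeout(() => {
			const batch = this.pending
			this.pending = []
			this.timer = null
			void this.flush(batch)
		}, this.delayMs)
	}
}
```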
This new inbox rule handling solves the following two problems:
- when clicking on a notification, it could still happen that the inbox
rules were not applied
- when the inbox folder had many unread mails, the loading time
increased massively, since inbox rules were re-applied on every load
Co-authored-by: amm <amm@tutao.de>
Co-authored-by: Nick <nif@tutao.de>
Co-authored-by: das <das@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Implement a local machine learning model for client-side spam filtering.
The local model is implemented using a TensorFlow "LayersModel" to train
a separate model for each available mailbox, resulting in one model
per ownerGroup (i.e. mailbox).
Initially, the training data is aggregated from the last 30 days of
received mails, and the data is stored in a separate offline database
table named spam_classification_training_data. The trained model is
stored in the table spam_classification_model. The initial training
starts after indexing, with periodic training happening
every 30 minutes and on each subsequent login.
The model predicts on each incoming mail once we have received the
entity event for that mail, moving it to either the inbox or the spam
folder. When users move mails, we update the training data labels
accordingly by adjusting the isSpam classification and isSpamConfidence
values in the offline database. The MoveMailService now contains a
moveReason, which indicates that the mail has been moved by our spam
filter.
Client-side spam filtering can be activated using the
SpamClientClassification feature flag and is, for now, only available
on the desktop client.
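A minimal sketch of what such a per-mailbox binary classifier built on
a tfjs LayersModel could look like (the feature size, layer sizes and
training parameters are assumptions for illustration, not the shipped
configuration):

```ts
import * as tf from "@tensorflow/tfjs"

const VOCAB_SIZE = 1024 // assumed bag-of-words feature size

function buildModel(): tf.LayersModel {
	const model = tf.sequential()
	model.add(tf.layers.dense({ inputShape: [VOCAB_SIZE], units: 16, activation: "relu" }))
	model.add(tf.layers.dense({ units: 1, activation: "sigmoid" })) // P(spam)
	model.compile({ optimizer: "adam", loss: "binaryCrossentropy", metrics: ["accuracy"] })
	return model
}

// One model per ownerGroup (i.e. per mailbox).
async function trainForMailbox(features: number[][], isSpam: number[]): Promise<tf.LayersModel> {
	const model = buildModel()
	const xs = tf.tensor2d(features)
	const ys = tf.tensor2d(isSpam, [isSpam.length, 1])
	await model.fit(xs, ys, { epochs: 5 })
	xs.dispose()
	ys.dispose()
	return model
}
```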
Co-authored-by: sug <sug@tutao.de>
Co-authored-by: kib <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: frm <frm@tutao.de>
Co-authored-by: das <das@tutao.de>
Co-authored-by: nif <nif@tutao.de>
Co-authored-by: amm <amm@tutao.de>
This commit makes a distinction between temporary and permanent crypto
errors. We do not write entities to the offline storage if they have a
SessionKeyNotFoundError, since this is a temporary error that is
expected to go away the next time the range is updated. We do, however,
cache entities with other errors, since we cannot do anything about
them.
Writing entities with permanent errors will not cause UI issues, since
the CryptoMapper replaces the affected fields with default values.
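A minimal sketch of the resulting decision, assuming a simplified error
shape and a hypothetical helper name:

```ts
// Sketch: decide whether a decrypted-with-errors instance may be persisted.
function shouldWriteToOfflineStorage(errors: Record<string, unknown> | null): boolean {
	if (errors == null) return true
	// Temporary: the session key is expected to appear the next time the
	// range is updated, so do not persist the broken instance.
	if ("SessionKeyNotFoundError" in errors) return false
	// Permanent errors are cached; the CryptoMapper replaces the affected
	// fields with default values, so the UI is not impacted.
	return true
}
```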
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: hrb-hub <181954414+hrb-hub@users.noreply.github.com>
The error is a result of an incorrect assumption that EntityUpdateData's
instanceListId is null for ElementEntities when processing entity
updates, when in fact instanceListId is an empty string.
Changing EntityUpdateData.instanceListId's type to NonEmptyString | null
and setting it to null for ElementEntity updates early should help
prevent bugs caused by such assumptions.
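A sketch of the tightened type (NonEmptyString is modeled here as a
branded string; the normalization helper is hypothetical):

```ts
type NonEmptyString = string & { readonly __brand: "NonEmptyString" }

interface EntityUpdateData {
	instanceId: string
	// null for ElementEntity updates; never the empty string the server sends
	instanceListId: NonEmptyString | null
}

function toInstanceListId(raw: string): NonEmptyString | null {
	// Normalize the server's "" for element types to null as early as possible.
	return raw === "" ? null : (raw as NonEmptyString)
}
```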
Close #9420
Co-authored-by: bir <bir@tutao.de>
Co-authored-by: ivk <ivk@tutao.de>
The DefaultEntityRestCache should not process entity events for types
we do not put in the cache. We cannot filter them out in
filteredUpdates, as some of the listeners may use the updates to update
the UI even though the entities of that type are not stored in the
offline DB (e.g. the Session type & LoginSettingsViewer). We pass the
filteredUpdateEvents, i.e. the events we successfully processed and put
in the cache, to the handlers. This should prevent 404 errors when
processing entity events during indexing, for example.
Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
We should make sure that we never process or write any instance from an
entityUpdate to the offline database if we encountered _errors while
decrypting the instance. Otherwise, we can end up in an inconsistent
offline database state.
When instances such as tutanota/file(13) could not be decrypted
successfully because _ownerEncSessionKey == null, we were still putting
them in the offline database when processing entity updates. This led
to issues where attachments could never be decrypted properly, even
though the _ownerEncSessionKey is later added to the instance in the
offline database via a patch entity update. The instance would then sit
in the offline database as if it were decrypted, even though it is not,
while we assume that everything in the offline database is already
unencrypted. This commit fixes the issue.
Furthermore, the DefaultEntityRestCache#deleteFromCacheIfExists function
was broken: it iterated over every character of an elementId instead of
passing the actual elementId to the offline DB SQL query.
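This bug class is easy to reproduce; a sketch with hypothetical names
(a string passed where a list of ids is expected iterates character by
character, since strings are themselves iterable):

```ts
// Hypothetical illustration of the bug, not the actual cache code.
function deleteIds(elementIds: Iterable<string>) {
	for (const id of elementIds) {
		// with deleteIds("someElementId") this runs once per *character*
		console.log("DELETE ... WHERE elementId = ?", id)
	}
}

deleteIds("someId") // buggy call: issues queries for "s", "o", "m", ...
deleteIds(["someId"]) // fixed call: the whole elementId reaches the query
```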
If an entityUpdate caused by an inbox rule or an unread/read update
happened while missedEntityUpdates were being processed in the
background, patches could be applied twice. This commit fixes the issue
by also prefetching entityUpdates that carry patches or instances,
ensuring that the normal eventQueue handling does not process the
entityUpdate a second time (because the update is already in
PrefetchStatus.Prefetched).
Co-authored-by: das <das@tutao.de>
We introduce a PrefetchStatus to account as early as possible for
instances not returned by the server. This ensures that we do not throw
on a 404 Not Found or 403 Not Authorized error when the instances
corresponding to create or update entityUpdates have already been
deleted or their permissions have been revoked.
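A sketch of the status values; only Prefetched is named in the commits
above, the other value is an assumption for illustration:

```ts
enum PrefetchStatus {
	Prefetched = "Prefetched",
	// The server did not return the instance (404 / 403): it was deleted or
	// permissions were revoked, so later event processing must not throw.
	NotAvailable = "NotAvailable",
}
```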
Co-authored-by: abp <abp@tutao.de>
When processing the missed entityUpdates in the EventQueue in the
EventBusClient, we group entityUpdates by typeRef and listId and do
loadMultiple requests instead of loading them one by one (prefetching).
Additionally, when the client is online, the server enriches the
WebSocket message with either the instance (in case of a CREATE event)
or the patches list (in case of an UPDATE event), so that we do not
need an additional GET request and can either put the instance into the
cache or update the cached entry using the PatchMerger.
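A sketch of the batching step, with simplified, assumed shapes for the
update data:

```ts
interface PendingUpdate {
	typeRef: string // e.g. "tutanota/Mail"
	instanceListId: string
	instanceId: string
}

// Group by typeRef + listId so that each group becomes a single loadMultiple
// request instead of one GET per instance.
function groupForPrefetch(updates: readonly PendingUpdate[]): Map<string, PendingUpdate[]> {
	const groups = new Map<string, PendingUpdate[]>()
	for (const update of updates) {
		const key = `${update.typeRef}/${update.instanceListId}`
		let group = groups.get(key)
		if (group == null) {
			group = []
			groups.set(key, group)
		}
		group.push(update)
	}
	return groups
}
```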
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: das <das@tutao.de>
Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: sug <sug@tutao.de>
Have the indexer apply updates before the batch is completed to ensure
that we have an up-to-date index (or will have one, if updating fails
and we retry).
Rename CustomCacheHandler's methods to avoid confusion.
Close #8902
Co-authored-by: paw <paw-hub@users.noreply.github.com>
Search extends its range on login based on the OfflineTimeRangeDate.
Extending the search range extends the OfflineTimeRangeDate.
Co-authored-by: ivk <ivk@tutao.de>
Close #8549
- Introduce a separate Indexer for SQLite using FTS5
- Split search backends and use the right one based on client (IndexedDB
for Browser, and OfflineStorage everywhere else)
- Split SearchFacade into two implementations
- Add a table for storing unindexed metadata for mails
- Escape special characters for SQLite search
We need to escape characters that have special meaning in the FTS5
query syntax; simply surrounding each token in quotes is sufficient for
this. See section 3.1 "FTS5 Strings" at https://www.sqlite.org/fts5.html,
which states that a string may be specified by surrounding it in quotes,
and that special string requirements only exist for strings that are
not in quotes. A sketch follows after this list.
- Add EncryptedDbWrapper
- Simplify out of sync logic in IndexedDbIndexer
- Fix deadlock when initializing IndexedDbIndexer
- Cleanup indexedDb index when migrating to offline storage index
- Pass contactSuggestionFacade to IndexedDbSearchFacade
The only suggestion facade used by IndexedDbSearchFacade was the
contact suggestion facade, so we made that explicit.
- Remove IndexerCore stats
- Split custom cache handlers into separate files
We were already doing this with user, so we should do this with the
other entity types.
- Rewrite IndexedDb tests
- Add OfflineStorage indexer tests
- Add custom cache handlers tests to OfflineStorageTest
- Add tests for custom cache handlers with ephemeral storage
- Use dbStub instead of dbMock in IndexedDbIndexerTest
- Replace spy with testdouble in IndexedDbIndexerTest
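The quoting approach from section 3.1 could look like this (function
names are illustrative, not the actual implementation):

```ts
// Inside FTS5 double quotes only the double quote itself is special,
// and it is escaped by doubling it.
function quoteFts5Token(token: string): string {
	return `"${token.replaceAll('"', '""')}"`
}

function toFts5MatchQuery(userInput: string): string {
	return userInput
		.split(/\s+/)
		.filter((token) => token !== "")
		.map(quoteFts5Token)
		.join(" ")
}

toFts5MatchQuery("hello OR world") // '"hello" "OR" "world"' - OR loses its operator meaning
```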
Close #8550
Co-authored-by: ivk <ivk@tutao.de>
Co-authored-by: paw <paw-hub@users.noreply.github.com>
Co-authored-by: wrd <wrd@tutao.de>
Co-authored-by: bir <bir@tutao.de>
Co-authored-by: hrb-hub <hrb-hub@users.noreply.github.com>
Passing instances explicitly avoids situations where some of them might
not be initialized.
We also simplified entity handling by converting entity updates to data
with resolved types early, so that the listening code doesn't have to
deal with it.
We also fixed some bad test practices, e.g. setting/restoring env
incorrectly. This matters now because accessors for type model
initializers check env.mode.
Co-authored-by: paw <paw-hub@users.noreply.github.com>
Refactor our instance deserialization/serialization pipeline, both in
TypeScript and in the Rust SDK, to use typeIds and attributeIds instead
of typeNames and attributeNames. We furthermore ignore cardinalities on
associations until the instance layer and always store associations as
arrays. This commit introduces **eventual consistency** on the client,
i.e. from now on we always store data in the newest schema format
(activeApplicationVersionsForWritingSum), which ensures that all data
is already available on the client after updating the client to a newer
version. This removes the need for offline migrations on the client and
also removes backward migrations on the server. Furthermore, the server
model types are now available on the client, retrievable through the
ApplicationTypesFacade. This is our first step towards FastSync.
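A hypothetical illustration of the difference (all ids below are made
up; the real models define their own typeIds and attributeIds):

```ts
// Before: attributes keyed by name, association cardinality baked into the shape.
const byName = {
	subject: "Hello",
	sender: { name: "A", address: "a@example.com" }, // single object for ZeroOrOne
}

// After: attributes keyed by attributeId, associations always stored as
// arrays; cardinality is only applied at the instance layer.
const byId = {
	"105": "Hello", // subject
	"112": [{ "120": "A", "121": "a@example.com" }], // ZeroOrOne as an array
}
```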
Co-authored-by: nig <nig@tutao.de>
Co-authored-by: abp <abp@tutao.de>
Co-authored-by: jomapp <17314077+jomapp@users.noreply.github.com>
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: sug <sug@tutao.de>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Offline login was broken because the RootInstance for
TutanotaProperties is not cached anymore. This happened in 6acc5ff
because of confusion about the semantics of isCachedType(), which was
meant to be used for range requests only. For non-range requests we
should cache all types except the explicitly ignored ones.
We fixed the issue by only checking for custom ids in range requests.
Close #8396
Co-authored-by: bir <bir@tutao.de>
Rename CacheMode.Cache to CacheMode.ReadAndWrite, and CacheMode.Bypass
to CacheMode.WriteOnly.
All three values use the cache, so having one value called 'Cache' and
another 'Bypass' could give the wrong idea of what these modes actually
do, now that we have the new CacheMode.ReadOnly mode.
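The renamed modes side by side (old names in comments):

```ts
enum CacheMode {
	ReadAndWrite = "ReadAndWrite", // previously CacheMode.Cache
	WriteOnly = "WriteOnly", // previously CacheMode.Bypass: skip cache reads, still write results
	ReadOnly = "ReadOnly", // the new mode that motivated the rename
}
```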
Co-authored-by: BijinDev <BijinDev@users.noreply.github.com>
We want to be able to pass custom headers as well as control cache
behavior for ranged requests.
Note that ranged requests currently do not support bypassing the cache
for reads. This wouldn't be too hard to implement, but the
functionality is not needed for this story.
Closes #8062
Co-authored-by: BijinDev <BijinDev@users.noreply.github.com>
- eml and mbox file import in the desktop client
- pause/resume functionality
- chunking for mails and attachments to reduce server requests
- testing for the IMAP import source
- wisely designed in the Alps
Co-authored-by: map <mpfau@users.noreply.github.com>
Co-authored-by: jhm <17314077+jomapp@users.noreply.github.com>
Co-authored-by: Kinan <104761667+kibibytium@users.noreply.github.com>
Co-authored-by: kitsugo <hayashi.jiro@kitsugo.com>
Co-authored-by: nif <nif@tutao.de>
Co-authored-by: sug <sug@tutao.de>
The notification process on mobile inserts Mail instances into the
offline DB in order to show the mail preview as quickly as possible
when the user clicks on a notification. In most cases this happens
before login, and it happens independently of entity update processing.
The EntityRestCache only downloads new list element instances when they
are within the cached range. In most cases new mails are within the
cached range, as the mail indexer caches mail bag ranges during initial
indexing. However, once the mail bag size has been reached, there will
be a new mail bag with a mail list that is not cached by the client
(there is no range stored for this new list).
The EventQueue optimizes event updates: if a CREATE event is followed
by an UPDATE event, they are folded into a single CREATE event.
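A sketch of that folding rule, with simplified, assumed event shapes:

```ts
type Operation = "CREATE" | "UPDATE"

interface QueuedEvent {
	operation: Operation
	instanceId: string
}

// A CREATE followed by UPDATEs for the same instance collapses into the
// CREATE alone: processing the CREATE is expected to fetch the latest version.
function foldEvents(events: readonly QueuedEvent[]): QueuedEvent[] {
	const folded: QueuedEvent[] = []
	for (const event of events) {
		const hasEarlierCreate = folded.some(
			(e) => e.instanceId === event.instanceId && e.operation === "CREATE",
		)
		if (event.operation === "UPDATE" && hasEarlierCreate) {
			continue // folded into the earlier CREATE
		}
		folded.push(event)
	}
	return folded
}
```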
In a situation where the email has been updated by another client after
it has been inserted into the offline DB by the notification process
but before login, and that email belongs to an uncached mail bag, the
CREATE event (and the optimized-away UPDATE events) for that mail would
be skipped and the user would see an outdated Mail instance.
We changed the behavior so that Mail instances are always downloaded on
a CREATE event when the offline cache is used, so that we do not miss
those update operations. This was the previously expected behavior
because of mail indexing, and it remains the common behavior until a
new mail bag is created, so it does not lead to additional requests in
most cases.
Alternatively, we could check whether the (new) instance is already
cached as an individual instance. We decided against that, as it adds
delay to event processing and does not fit the processing logic (why
should the cache check whether a new instance is already cached?).
In the future we should try to optimize the loading of new instances so
that the sync time does not increase from downloading single emails.
Close #8041
Co-authored-by: ivk <ivk@tutao.de>
After rotating group keys we need to access the former group key for previously encrypted instances.
Former group keys are stored on the GroupKey LET (list element type), which uses a customId. These
instances are not cached by default, but with the recent modification to the cache that allows caching
MailSetEntries, GroupKey entities can also be cached. With this commit we enable caching for this type.
To avoid excessive entity updates and inconsistent offline storages,
we don't send entity updates for each mail-set-migrated mail.
Instead, we detect the mail set migration for each folder and drop
its whole mail list from the offline cache.
We could fix up the database when we receive the changed folder,
but that would involve re-doing the migration locally and would
lead to very long entity event handling that might get interrupted,
breaking the offline database anyway.
Show a message if the email to be opened is no longer there.
Keep track of the explicitly opened email independently of the list state.
Handle offline errors.
Fix URL and ViewSlider handling.
Close #7373
Co-authored-by: jat <jat@tutao.de>
Co-authored-by: paw <paw-hub@users.noreply.github.com>
Open cached emails in the viewer before the list is loaded.
Make sure ConversationViewModel displays the primary mail.
This is necessary when opening an email from a notification while
offline, as the mail instance is in the cache from the native part but
the conversation entries are not.
Co-authored-by: ivk <ivk@tutao.de>
Co-authored-by: jat <jat@tutao.de>
To sort customIds within the offline database, we store customIds as
base64Ext id strings in the offline storage. Once we retrieve data from
the offline storage, we need to convert the customIds back to
base64Url, so that other parts can properly compare those customIds.
The only place where customIds are stored / processed in base64Ext is
the offline database. Everywhere else, customIds are encoded as
base64Url id strings.
getIdsInRange is special in that it doesn't return parsed entities, but
directly uses the index columns normally only used for sorting the
entities.
We now store custom id entities in the offline storage, which means we
need to make sure that storing ranges and comparing ids works for them.
In order to achieve that, we decided to store the normally
base64Url-encoded, not lexicographically sortable ids in the sortable
base64Ext format.
The offline storage needs to use the "converted" base64Ext ids
internally everywhere for custom id types, but hand out ranges and
entities in the "raw" base64Url format and take raw ids as parameters.
To make this easier, we implement the conversion in the public
CacheStorage::getRangeForList implementation and use the private
OfflineStorage::getRange method internally.
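A sketch of the re-encoding under the assumption that base64Ext simply
remaps the base64 alphabet onto one in ascending ASCII order, so that
lexicographic comparison of the stored ids matches id order; the exact
alphabet used is an assumption here:

```ts
const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
// Assumed sortable alphabet: all 64 characters in ascending ASCII order.
const BASE64_EXT = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"

// base64Url -> base64Ext for storing, assuming unpadded base64Url input.
function base64UrlToBase64Ext(base64Url: string): string {
	const base64 = base64Url.replaceAll("-", "+").replaceAll("_", "/")
	return [...base64].map((c) => BASE64_EXT[BASE64.indexOf(c)]).join("")
}

// base64Ext -> base64Url for handing ids back out of the offline storage.
function base64ExtToBase64Url(base64Ext: string): string {
	const base64 = [...base64Ext].map((c) => BASE64[BASE64_EXT.indexOf(c)]).join("")
	return base64.replaceAll("+", "-").replaceAll("/", "_")
}
```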
In order to allow importing of mails, we replace legacy MailFolders
(non-static mail listIds) with new MailSets (static mail listIds).
From now on, mails have static mail listIds and static mail elementIds.
To move mails between new MailSets we introduce MailSetEntries
(the "entries" property on a MailSet), which are index entries sorted
by the received date of the referenced mails (customId). This commit
adds support for new MailSets while still supporting legacy MailFolders
(mail lists), to allow migrating gradually.
* TutanotaModelV74 adds:
  * MailSet support
  * defaultAlarmList on GroupSettings
* SystemModelV107 adds model changes for counter (unread mails) updates
* Adapt mail list to show MailSet and legacy mails
The list model is now largely unaware of listIds, since it can display
mails from multiple MailBags. MailBags are static mail lists from which
a mail is only removed when the mail is permanently deleted.
* Adapt offline storage for mail sets
Offline storage gained the ability to provide cached entities
from a list of ids.
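A hypothetical sketch of how an index entry id sorted by received date
could be derived (the fixed-width timestamp prefix and the helper are
assumptions for illustration, not the actual MailSetEntry id scheme):

```ts
// A fixed-width timestamp prefix makes lexicographic order of the ids
// match chronological order of the referenced mails.
function mailSetEntryCustomId(receivedDate: Date, mailElementId: string): string {
	const sortablePrefix = receivedDate.getTime().toString().padStart(16, "0")
	return `${sortablePrefix}${mailElementId}`
}
```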