
Written by Joanna Kulawik

Deduplicate users

You might have several users who are in fact the same person. Learn how to remove all duplicates and keep all your valuable data.


Deduplication is an additional service we offer to help you address issues with duplicate users in your database. This service is performed on demand, once you request it and accept the conditions outlined in this document.

The deduplication process is managed by our Customer Support team through User.com. To initiate the process, you must request it via chat and specify the frequency at which you want the action repeated. Our specialists will configure the deduplication process according to your needs. It can be scheduled to run once every 1, 7, or 30 days, or at any other interval you require.

The process runs overnight (CET). Therefore, if you request deduplication today, the earliest you can expect to see results is tomorrow.

How does it work?

The system scans all your users and groups together those with identical email addresses. It then determines which user will be considered the 'base user': the user whose timeline and automation logs will be preserved as the original.
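The grouping step can be pictured with a short Python sketch. This is only an illustration, not User.com's actual implementation: the record layout (dictionaries with hypothetical `id` and `email` keys) and the function name `group_by_email` are assumptions made for this example.

```python
from collections import defaultdict


def group_by_email(records):
    """Group records that share an identical, non-empty email address."""
    groups = defaultdict(list)
    for record in records:
        email = record.get("email")
        if email:  # records without an email cannot be matched this way
            groups[email].append(record)
    # Only groups with more than one record contain duplicates.
    return {email: recs for email, recs in groups.items() if len(recs) > 1}


# Hypothetical sample data
records = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example.com"},
    {"id": 3, "email": "ann@example.com"},
    {"id": 4, "email": None},
]
duplicates = group_by_email(records)
```

In this sample, only the two records with the address ann@example.com end up in a duplicate group; the remaining records are left untouched.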

How does the system recognize the 'base user'?

The logic behind selecting the 'base user' in the deduplication process is illustrated in the graph below.

Summary of the process:

1. User ID Attribute Value Priority: The highest priority is given to records with a User ID attribute value. This is an optional, unique identifier that you can assign to records in your database. This document provides detailed information about what the User ID attribute is.

Note that the system cannot merge records with different User IDs, as it treats them as distinct users, regardless of whether they share the same email address.

However, the User ID alone is not always decisive: several records sharing the same email address may each have a User ID, or none of them may have one. In those cases, the next criterion applies.

2. Status Attribute as the Next Determinant: If the User ID is not sufficient, the next criterion is the Status attribute. This default attribute can have one of two fixed values: User or Visitor. The "User" status is automatically assigned to a record whose email address has been updated in User.com via website code. It can also be set manually or via import. All other records are assigned the "Visitor" status.

Remember that User status takes priority over Visitor status.

There may still be cases where this is not enough, for example, if there are multiple records with both User ID and User status. The graph presented earlier includes additional scenarios that are not definitive.

3. Last Updated Date: If the above conditions are insufficient, the next factor considered is the Last Updated date. Ultimately, the record with the older Last Updated date is chosen.
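The three criteria above can be sketched as a single ranking function in Python. This is a simplified illustration under stated assumptions, not the production logic: the field names `user_id`, `status`, and `last_updated` are hypothetical, the group is assumed to contain at most one distinct User ID, and the additional scenarios shown in the graph are not covered.

```python
from datetime import datetime


def select_base_user(group):
    """Pick the base user from records that share an email address."""
    def rank(record):
        # Lower tuples sort first, so each element encodes one criterion.
        return (
            0 if record.get("user_id") else 1,           # 1. has a User ID
            0 if record.get("status") == "User" else 1,  # 2. User before Visitor
            record["last_updated"],                      # 3. older date wins
        )
    return min(group, key=rank)


# Hypothetical sample group of records with the same email address
group = [
    {"id": 1, "user_id": None, "status": "User", "last_updated": datetime(2023, 1, 5)},
    {"id": 2, "user_id": "crm-42", "status": "Visitor", "last_updated": datetime(2023, 3, 1)},
    {"id": 3, "user_id": None, "status": "User", "last_updated": datetime(2022, 11, 20)},
]
base = select_base_user(group)
```

Here record 2 becomes the base user because it is the only one with a User ID. Without it, record 3 would be chosen: both remaining records have User status, and record 3 has the older Last Updated date.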

What happens when the 'base user' is selected?

1. Merging User Attributes: If the 'base user' lacks certain attribute values (such as phone number, city, OS, etc., or any custom attributes), these will be sourced from the other records. If multiple records are being merged into the 'base user' and they have different values for the same attributes, the earlier logic applies:

User status takes priority over Visitor status, and if necessary, the record with the older Last Updated date is chosen.
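As a rough illustration of this attribute merge, the Python sketch below fills in only the values the base user is missing and resolves conflicts between the other records with the same priority order. The record structure (`status`, `last_updated`, and an `attributes` dictionary) and the function name `merge_attributes` are assumptions made for the example.

```python
from datetime import datetime


def merge_attributes(base, others):
    """Fill attributes the base user lacks from the other records."""
    # Donors are ordered by priority: User status first, then older Last Updated.
    donors = sorted(
        others,
        key=lambda r: (0 if r["status"] == "User" else 1, r["last_updated"]),
    )
    merged = dict(base["attributes"])
    for donor in donors:
        for name, value in donor["attributes"].items():
            # Existing base values are never overwritten; only gaps are filled.
            if merged.get(name) in (None, "") and value not in (None, ""):
                merged[name] = value
    return merged


# Hypothetical sample data
base = {
    "status": "User",
    "last_updated": datetime(2023, 1, 1),
    "attributes": {"phone": None, "city": "Warsaw"},
}
others = [
    {"status": "Visitor", "last_updated": datetime(2022, 1, 1),
     "attributes": {"phone": "111", "os": "iOS"}},
    {"status": "User", "last_updated": datetime(2023, 2, 1),
     "attributes": {"phone": "222", "city": "Berlin"}},
]
merged = merge_attributes(base, others)
```

In this sample the base user keeps its own city, takes the phone number from the record with User status (which outranks the Visitor record even though the Visitor record is older), and gains the OS value that only the Visitor record provides.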

2. Merging Timelines: All chat conversations, sent emails, registered events, assigned deals, tags, etc., will be retained and attributed to the base user.

Page visits will be aggregated. If there is conflicting information about the last contact, last seen, or last heard from dates, the system will use the most recent values.
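The handling of page visits and conflicting dates could look roughly like the Python sketch below. It assumes that aggregation means summing a per-record visit counter and uses hypothetical field names (`page_visits`, `last_contact`, `last_seen`, `last_heard_from`); the real data model may differ.

```python
from datetime import datetime

# Hypothetical names for the date fields mentioned above
DATE_FIELDS = ("last_contact", "last_seen", "last_heard_from")


def merge_activity(records):
    """Sum page visits and keep the most recent value of each date field."""
    summary = {"page_visits": sum(r.get("page_visits", 0) for r in records)}
    for field in DATE_FIELDS:
        values = [r[field] for r in records if r.get(field) is not None]
        summary[field] = max(values) if values else None
    return summary


# Hypothetical sample data
records = [
    {"page_visits": 12, "last_seen": datetime(2023, 4, 2), "last_contact": None},
    {"page_visits": 5, "last_seen": datetime(2023, 6, 9), "last_contact": datetime(2023, 1, 15)},
]
activity = merge_activity(records)
```

For these two records, the merged profile would show 17 page visits, the later of the two last seen dates, and the only available last contact date.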

NOTE:

Please be aware that automation logs are not merged. This means that in some cases, automation may stop for a specific user after deduplication, or in other cases, it may be triggered a second time.

Example 1.

If you have two users, Y and X, and user Y had started an automation before deduplication but did not become the 'base user', the final user will not retain the automation logs from user Y, and as a result, will not continue the automation process.

Example 2.

On the other hand, if user Y had gone through an automation started by an Added to List, Tag Added, or Client Attribute Changed trigger before deduplication and did not become the 'base user', then after deduplication the final user will receive the tags, lists, or attribute updates and will trigger this automation again.

How to avoid duplicates?

To prevent duplicates, it is crucial to correctly implement the User ID. User ID is a standard attribute that is initially empty, and you can assign any value to it to uniquely identify your users. Find more information about this topic in this article.
