Rebuild the Sitecore Reporting Database the Right Way
Usually the Sitecore Reporting database needs to be rebuilt for various reasons (corruption, upgraded Sitecore version, missing Experience Analytics data, etc. ) as it does not contain correct data. How you prepare for the rebuild, what happens during rebuild, and afterwards matter.
Prepare for a Reporting Database Rebuild
Before you click the button to start a rebuild, you need to have a secondary Reporting database and the proper connectionStrings. Specifically, you would follow the steps in the “Create and configure a secondary reporting database” of the following Sitecore KB: https://doc.sitecore.com/developers/90/sitecore-experience-platform/en/rebuild-the-reporting-database.html
Note, that the only two roles that require a reporting.secondary connectionString are the CMS and Processing roles. If you log into your CMS and don’t see a reporting connectionString, don’t be alarmed. In speaking with Sitecore Support we discovered that the CMS role should not have a reporting connectionString in a scaled environment (it would in a non-scaled environments).
However, Sitecore SIF deploys a CMS with the reporting connectionString, while Sitecore PaaS (Azure AppServices) don’t have the connectionString. I have asked Sitecore to work internally to clarify this across the Product Teams.
IMPORTANT: If you are not in the midst of a rebuild, you need to comment out the reporting.secondary connectionString to avoid writing to both databases.
Execute a Reporting Database Rebuild
Once you have prepared the environment, you need to uncomment the reporting.secondary connectionString. Whatever database name is in the reporting.secondary connectionString is the one that will get rebuilt. Whatever is in the reporting connectionString will not… and post rebuild these databases will need to be swapped so the rebuilt database is properly referenced.
This has lead to confusion as during a rebuild, or shortly after, users see data in Experience Analytics and believe that some sort of automatic swapping has occurred (more on that later).
Per Sitecore Support “The reporting.secondary contains live data starting from the rebuilding start and it will contain all the historical data as well. When the rebuild is finished the reporting is still missing data (or has corrupted data) while the reporting.secondary has the correct version, that is why you need to swap the two.” In essence, as the rebuild process is occurring the reporting database may get some data pumped in as well as both connectionStrings are active, but this is not the database you want to use post build.
If you are ready to execute, navigate to /sitecore/admin/RebuildReportingDB.aspx of your CMS and start the rebuild, or use the Administration Tools link in the Control Panel to access the Reporting Database Rebuild page. This process can take several hours or days depending on your amount of data and server power. Unfortunately, there is not a progress bar or estimated completion.
Post Reporting Database Rebuild Steps
Post rebuild, you will need to switch the database name in the reporting and reporting.secondary connectionStrings to reflect which reporting database is the active/rebuilt one.
Lets assume you have two reporting databases (one named reportingdb in the reporting connectionString and the other named reportingdb-secondary in the reporting.secondary connectionString).
Once the rebuild is complete, the reporting connectionString after rebuild should point to the reportingdb-secondary database. This should only be done on the roles that have a reporting connectionString (not the reporting.secondary connectionString) of the Processing and Reporting roles.
You will need to update the reporting.secondary connectionStrings to the non-active reporting database. In the example above, your reporting.secondary would now point to the reportingdb database after a rebuild. This would be done only on the roles that have the reporting.secondary connectionStrings of CMS and Processing.
In essence, anytime a rebuild is achieved, you will need to edit the connectionStrings to point the reporting connectionString to the active/just rebuilt database and the reporting.secondary to the non-active/not yet rebuilt database.
IMPORTANT: Updating the connectionStrings will cause Application Pool recycles so be aware of the impact first as your CMS, Processing, and Reporting roles will have momentarily downtime during the manual swap.
As noted in the instructions, you will also want to comment out the reporting.secondary connectionString when not in use as “If you do not comment out or remove the reporting.secondary connection string, data is continually written to both databases.”