I piped the logs of 20 Apache domains for 6 years into a Matomo 3 instance, creating a database several gigabytes large. The Matomo UI is unusable because it is so slow, and some reports that I wish to have are impossible to get.
One of the things I wanted to know was how often a certain URL has been accessed in the last years, and that information should be grouped by month.
It turns out that this information is split onto two database tables:
- log_action lists all URLs
- log_link_visit_action tells us when URLs were accessed.
At first we need to find the ID of the URL we are looking for in log_action:
SELECT * FROM `log_action` WHERE `name` LIKE '%the-file-i-am-looking-for.apk'
This gives us the IDs in the idaction column. That one can be searched for in the log_link_visit_action table:
SELECT SUBSTRING(server_time, 1, 7) AS DATE, COUNT(*) AS COUNT FROM `log_link_visit_action` WHERE `idaction_url` IN (1078305, 1103532, 1431512, 1432793) GROUP BY DATE ORDER BY DATE
And now we have the results:
date | count |
---|---|
2021-12 | 12 |
2022-01 | 21 |
2022-02 | 26 |
2022-03 | 8 |
Before running the second SQL query, I created an index on the log_link_visit_action.idaction_url column - that would have been too slow otherwise.