The Piwik Server Logs Analytics script described on this page is a Python script that parses server log files and records the data in the Piwik MySQL database. You can then visualize useful and interesting reports built from the data mined from your logs. The standard Piwik user interface is available for your website reports (visits, page views, downloads, keywords, referring websites, etc.).
Using Piwik to track and report your server logs means you have access to many useful features such as: 30+ web statistic reports, IP filters and exclusion, real time reports, high performance, and so much more!
There are many cases in which it can be useful to use the information contained in your server access logs and import this data into Piwik.
For example, in the following use cases, analyzing server logs with Piwik is desired:
- if you want to track and report the activity on a particular server or set of servers (for system administration purposes, QA, debugging, dealing with spammers, etc.)
- if you have weeks or months of server logs that you wish to import for historical analysis: Piwik will let you import months of server access logs that you can then visualize
And more! Please let us know if you use this script with another use case and we can add it to the list.
Each tracking method has its own advantages and particularities.
Log files are already available so it makes sense to make the most of this information. Log files contain search engine bot & spam bot information as well as static image/css information.
Can I import data using Piwik Server Log Analytics and the standard Piwik JS at the same time?
- create a new website in Piwik, e.g. with the name “Example.org (log files)”.
- note the idsite of this new website. You will import your log file data into this website ID.
- in the command line, force all requests from the log files to be recorded in a specific website ID via
The first time you run Log Analytics, you may import a lot of historical data, possibly months or years of past logs. Once this data has been imported, run the following command once to archive all historical report data:
./console core:archive --force-all-websites --force-all-periods=315576000 --force-date-last-n=1000 --url=http://example/piwik
After the initial log data import, you will likely import log files into Piwik hourly or daily.
Put the following command in a cron job to process archives after each hourly or daily import:
./console core:archive --url=http://example/piwik/
See also: How to setup reports auto archiving
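The recurring import and archive steps can be chained so that archiving only runs when the import succeeded. The following is a minimal sketch, not an official wrapper: it assumes it is started from the Piwik root directory, and the function names and the optional `runner` parameter are hypothetical. The two commands themselves are the ones shown on this page.

```python
import subprocess


def build_commands(piwik_url, log_file):
    """Return the import command followed by the archive command."""
    import_cmd = [
        "./misc/log-analytics/import_logs.py",
        f"--url={piwik_url}",
        log_file,
    ]
    archive_cmd = ["./console", "core:archive", f"--url={piwik_url}"]
    return [import_cmd, archive_cmd]


def run_import_then_archive(piwik_url, log_file, runner=subprocess.run):
    """Run the import, then the archiving, stopping at the first failure."""
    for cmd in build_commands(piwik_url, log_file):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so archiving is skipped when the import fails.
        runner(cmd, check=True)


# Example call (assumed URL and log path, adjust to your installation):
# run_import_then_archive("http://example/piwik/", "/var/log/apache2/access.log")
```

A cron job could then call this script instead of the bare archive command, which keeps the two steps in a fixed order.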
Log Analytics lets you import any web server log file. In this FAQ we will focus on one particular type of log that you may find useful to import into Piwik: the Piwik Tracking API logs.
What are Piwik Tracking API logs?
The Piwik JavaScript tracker sends its tracking requests to the piwik.php Tracking API endpoint. If you use one of the Tracking API clients to measure your mobile apps, games or desktop apps, they will also send requests to piwik.php. The web server handling those requests will create access log files containing the tracking data that Piwik collects in your database.
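To illustrate what such a log line carries, the sketch below extracts the Tracking API parameters from one access log entry. The sample line is made up for illustration and assumes the Apache combined log format; `tracking_params` is a hypothetical helper, not part of Piwik.

```python
from urllib.parse import urlsplit, parse_qs

# Made-up sample line in Apache combined log format (illustration only).
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2014:13:55:36 +0000] '
    '"GET /piwik.php?idsite=1&rec=1&url=http%3A%2F%2Fexample.org%2Fpricing'
    '&action_name=Pricing HTTP/1.1" 200 43 "-" "Mozilla/5.0"'
)


def tracking_params(log_line):
    """Return the query parameters of the request logged in one line."""
    # The request is the first quoted field: method, target, protocol.
    request = log_line.split('"')[1]
    method, target, protocol = request.split(" ")
    query = urlsplit(target).query
    # parse_qs returns a list per parameter; keep the first value of each.
    return {name: values[0] for name, values in parse_qs(query).items()}


params = tracking_params(SAMPLE_LINE)
print(params["idsite"], params["url"])
```

Together with the timestamp in the log line, these parameters are what allows a request to be recorded again later for the right website and time.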
Uses of replaying logs
Replaying logs is very useful, for example, when your database server breaks down and Piwik cannot write the data for a few hours. Luckily, you can take the web server log entries matching /piwik.php and replay them into Piwik. Replaying logs means that the Log Analytics tool goes through each line of the log and imports it into your Piwik with the correct datetime in the past. Replaying logs is also useful if you want to set up a high availability Piwik.
How to replay Tracking API logs?
First, prepare a log file containing only the requests that should be imported. Typically you would import only a given period of time.
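One way to prepare such a file is to keep only the /piwik.php requests that fall inside the period to replay. The sketch below assumes Apache-style timestamps such as `[10/Oct/2014:13:55:36 +0000]` with English month abbreviations; the function name and the example time window are hypothetical.

```python
import re
from datetime import datetime, timezone

# Timestamp as written by the Apache common/combined log formats.
TIMESTAMP_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")

# Hypothetical period to replay, as timezone-aware datetimes.
WINDOW_START = datetime(2014, 10, 10, 8, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2014, 10, 10, 14, 0, tzinfo=timezone.utc)


def select_tracking_lines(lines, start, end):
    """Yield /piwik.php log lines whose timestamp is in [start, end)."""
    for line in lines:
        if "/piwik.php" not in line:
            continue  # not a Tracking API request
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a recognizable timestamp
        when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if start <= when < end:
            yield line


# Example use (assumed file names):
# with open("access.log") as src, open("replay.log", "w") as dst:
#     dst.writelines(select_tracking_lines(src, WINDOW_START, WINDOW_END))
```

The resulting file contains only the tracking requests from the chosen period and can be passed to the importer in place of the full access log.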
To replay the Tracking API logs, you then call the Log Analytics importer with the --replay-tracking parameter, such as:
./misc/log-analytics/import_logs.py --url=piwik.example.net --replay-tracking /var/log/apache2/access.log
After replaying the logs, it is recommended to reprocess the data with the core:archive console command.