Web Analytics Privacy in Piwik

This guide explains the privacy implications of tracking your visitors’ web analytics data, and how Piwik can easily be configured to ensure that your users’ privacy is respected.

Piwik ensures the privacy of your users and analytics data. When using Piwik, YOU keep control of your data; nobody else does. Your data is stored in your own MySQL database, and Piwik will never send logs or report data to other servers.

In March 2011, the Independent Center for Privacy Protection in Germany (ULD) recommended Piwik as privacy-compliant web analytics software. Piwik's privacy compliance is also reflected by the many government agencies (in Europe, Asia, and North America) that already trust and rely on Piwik for self-hosted web analytics.

To ensure further security, after you have installed Piwik, we recommend that you:

  1. make your Piwik server more secure by undertaking a few extra security checks
  2. follow the guide below to enable important Privacy features

How to Enable Privacy Settings in Piwik

By design, Piwik ensures that your analytics data is accessible only to the Piwik administrator. This guide explains how to make your favourite web analytics tool “privacy compliant”. First, log in as Super User and click on Settings > Privacy.

Step 1) Automatically Anonymize Visitor IPs

By default, Piwik stores the visitor IP address (in IPv4 or IPv6 format) in the database for each new visitor. If a visitor has a static IP address, their browsing history could easily be tracked across several days, and even across the different websites tracked within the same Piwik server.

To ensure that you do not store the visitor IP, which is Personally Identifiable Information (PII), please go to Settings > Privacy to enable IP anonymization, with at least 2 bytes masked from the IP.
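To illustrate what masking bytes from an IP means, here is a minimal Python sketch. It is not Piwik's own implementation: the function name `anonymize_ip` and the choice to zero the trailing bytes of IPv6 addresses in the same way as IPv4 are assumptions made for this example.

```python
import ipaddress


def anonymize_ip(ip: str, masked_bytes: int = 2) -> str:
    """Zero the trailing `masked_bytes` bytes of an IPv4 or IPv6 address.

    Hypothetical helper for illustration only; Piwik's real masking
    logic may differ, especially for IPv6.
    """
    addr = ipaddress.ip_address(ip)
    raw = bytearray(addr.packed)  # 4 bytes for IPv4, 16 bytes for IPv6
    if not 0 <= masked_bytes <= len(raw):
        raise ValueError("masked_bytes is out of range for this address")
    # Overwrite the last `masked_bytes` bytes with zeros.
    for i in range(len(raw) - masked_bytes, len(raw)):
        raw[i] = 0
    return str(ipaddress.ip_address(bytes(raw)))


print(anonymize_ip("192.168.100.57"))     # 192.168.0.0
print(anonymize_ip("192.168.100.57", 1))  # 192.168.100.0
```

With 2 bytes masked, every visitor in the same /16 IPv4 range is stored under the same address, so an individual static IP can no longer be singled out in the logs.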

Step 2) Delete Old Visitor Logs

You can configure Piwik to automatically delete your older logs from the database. For privacy reasons, we highly recommend that you keep the detailed Piwik logs for only 3 to 6 months and delete older log data.

Deleting old logs also has one other important advantage: it will free significant database space, which will, in turn, slightly increase performance!

If you run the automatic script as explained in the FAQ, it is safe to delete your old log data and still access all historical reports in Piwik.
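The retention rule itself is simple, and can be sketched in Python as follows. This is only an illustration of the idea, not how Piwik purges its database: the `purge_old_visits` function, the `visit_time` field, and the 180-day default (roughly 6 months) are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def purge_old_visits(visits, keep_days=180, now=None):
    """Return only the visit records newer than the retention cutoff.

    Hypothetical helper: `visits` is a list of dicts that each carry a
    `visit_time` datetime. Piwik's real log tables and purge job differ.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=keep_days)
    return [v for v in visits if v["visit_time"] >= cutoff]


# Example with a fixed "now" so the result is reproducible.
now = datetime(2012, 7, 1)
visits = [
    {"id": 1, "visit_time": datetime(2011, 11, 15)},  # older than 180 days
    {"id": 2, "visit_time": datetime(2012, 6, 20)},   # recent
]
kept = purge_old_visits(visits, keep_days=180, now=now)
print([v["id"] for v in kept])  # [2]
```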

Step 3) Include a Web Analytics Opt-Out Feature on Your Site (Using an iFrame)

On your website, in your existing privacy policy page or in the ‘Legal’ page, you can add a way for your visitors to opt out of being tracked by your Piwik server. By default, all of your website visitors are tracked, but if they opt out by clicking on the iframe link, a ‘piwik_ignore’ cookie will be set. Visitors with a piwik_ignore cookie will not be tracked.
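The opt-out rule can be pictured as a check on the cookies sent with each request. The Python sketch below is an illustration under assumptions, not Piwik's server code: the `has_opted_out` function is hypothetical, and it only tests whether a piwik_ignore cookie is present, without assuming anything about its value.

```python
from http.cookies import SimpleCookie


def has_opted_out(cookie_header: str) -> bool:
    """Return True when the Cookie header carries a piwik_ignore cookie.

    Hypothetical helper for illustration; only the cookie's presence
    is checked, because its value is not specified in this guide.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    return "piwik_ignore" in cookies


print(has_opted_out("session=abc; piwik_ignore=1"))  # True
print(has_opted_out("session=abc"))                  # False
```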

In Settings > Privacy, you can copy the opt-out iframe code and paste it into your page. The same kind of iframe lets visitors opt out of being tracked on demo.piwik.org.

Step 4) Respect the DoNotTrack Preference

Do Not Track is a technology and policy proposal that enables users to opt out of tracking by websites they do not visit, including analytics services, advertising networks, and social platforms. By default, Piwik respects this preference and will not track visitors who have specified “I do not want to be tracked” in their web browsers. For more information about DoNotTrack, check out donottrack.us.
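Browsers express this preference by sending the HTTP request header `DNT: 1`. As a rough sketch of how a tracker could honour it, the Python function below decides whether a request may be tracked. The `should_track` name and the plain-dict representation of headers are assumptions for this example; this is not Piwik's actual code.

```python
def should_track(headers: dict) -> bool:
    """Return False when the request carries the header `DNT: 1`.

    Hypothetical helper for illustration. Header names are compared
    case-insensitively, as HTTP header names are not case-sensitive.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


print(should_track({"User-Agent": "ExampleBrowser", "DNT": "1"}))  # False
print(should_track({"User-Agent": "ExampleBrowser"}))              # True
```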

Step 5) Optional Privacy Preferences

  • As the Piwik administrator, you may decide that giving your Piwik users access to the real time & visitor log features is not necessary. In this case, you can disable the Live plugin in Settings > Plugins.
  • If you track a number of websites with the same Piwik server, all your websites’ code will contain the Piwik server URL in the JavaScript code. To prevent other users from finding out all your websites, you can hide the Piwik server URL in your JavaScript using this technique (FAQ).
  • Some countries’ legislation requires websites to control which cookies they set based on user preferences. You can easily disable all Piwik cookies for particular visitors or for all visitors by calling a JavaScript function in the Piwik code; see the FAQ: How do I disable tracking cookies?

Privacy Applied to Web Analytics Logs: a Philosophical Choice?

Privacy on the Internet is a major concern for many users, webmasters and companies today. We spend so much time online that access to our Internet activity logs (websites, pages visited, internet searches) can reveal a lot of personal information about ourselves, our life, and work.

When you use a web analytics tool such as Google Analytics (GA) or Yahoo Analytics, your web analytics data is tracked, stored and owned by the company providing you with the free analytics service. While they provide an excellent service for free, they also re-use the visitor log data tracked on your website to enrich existing profiles for a given user or IP address.

Here are two examples of how a website can gather your personal information while you are away from that website:

Google Example

If you use Google Analytics on your website, Google knows the IP addresses (and other unique browser identifiers) of all visitors to your website, and which pages they looked at. Because most other websites also use GA or another Google product, Google would also know, for example, the 6 other websites a visitor viewed earlier that day and the 367 websites they looked at in the last month. Because more than 50% of all websites on the Internet use GA, Google AdSense or another Google product using tracking beacons, Google is able to build a very accurate picture of most websites visited by any given user.

Facebook Example

Facebook social widgets are already used on more than 15% of all websites & blogs (source: w3techs). When you visit a website with a Facebook Like button (or any other FB functionality), your browser will send data (and your IP address) to Facebook. Recently it was made public that Facebook creates shadow profiles of logged-out users. This means that even if you are logged out or not a Facebook member, Facebook still keeps track of the websites and articles that your IP address (and other unique browser identifiers) was looking at.

Why is this profiling for marketing purposes considered a “problem” for some Piwik users?

Privacy is becoming increasingly important to us as we spend more of our lives ‘connected’ on the internet. While Google is providing a multitude of amazing services for free – for which we are very thankful – we are at the same time concerned about where and how our private information is being used. Websites like Google (and increasingly Facebook) are able to build an enormous profile of all websites and pages looked at by most (nearly all) Internet users worldwide (even if they are not Google or Facebook users).

One of their main goals is to improve the re-marketing of Google Ads and Facebook Ads to Internet users and find the right advertising segments for the right ad. But many Internet users and website operators are growing concerned about what could be termed a Global Internet User Activity Database and its moral implications. You don’t need to be a Privacy Junkie to be interested in the challenges and moral implications of gathering so much data on the Internet. We choose not to discuss the details here but recommend you check out the Privacy section of the EFF website to learn more.

Opting Out of All Data Collection & Remaining Anonymous Online

To opt out of all web tracking technologies (including Google Analytics and Piwik), the safest way is to use a browser extension such as NoScript or Ghostery: these extensions disable all the known JavaScript trackers and ensure that your browser does not send requests to external tracking servers. If you wish to browse the Internet without your IP address being tracked at all, please consider using Tor Browser, which will automatically connect you to the secured and anonymous Tor network. If you wish to stay completely anonymous online, please see the PDF guide “How to remain anonymous online?”.

Conclusion

Piwik is the leading self-hosted, privacy-compliant, decentralized, modern & Free (GPL License) web analytics platform, a building block of the free and open Internet. By using Piwik and configuring a few options as explained in this guide, you will ensure that all of your valuable information is private and owned by one person (you!) and, just as importantly, that your website respects your visitors’ privacy.