> I am making a decision on behalf of my users/visitors to share that data to parties for whom it will not be anonymous or aggregate, and it isn't a decision I take lightly.
A suggestion: that decision should be evaluated under the assumption that:
1) data doesn't go away (any data collected or sent to a 3rd party is usually permanent)
2) theft and accidental leaks happen, and
3) we don't know the worst ways data - of any type - can be abused, because those techniques haven't been invented yet (powerful analysis techniques are being invented at an incredible rate).
The combination of these properties means that collecting and storing data creates unbounded risk. At any point in the future someone might invent a truly horrific way to abuse the stored data that was collected perhaps decades earlier.
Humans are used to information being transient. Information decayed over time: memories were forgotten and paper/parchment/etc. decayed. Books had to be copied, or they risked being lost forever when the library burned. Claude Shannon's digital signals fundamentally changed all of that, as they made it possible to automatically preserve information perfectly. Unfortunately, human intuition hasn't caught up to the idea of permanent data.
The question "Should I trust $THIRD_PARTY with this data?" misses the full nuance of what is actually being risked. A better question is "Should we trust $THIRD_PARTY, and anybody who buys/steals/subpoenas/etc. it from $THIRD_PARTY, with this data? What if they have analysis capabilities far more advanced than current techniques?"