Regulating data sharing is heating up

With the U.K. report on Facebook, and the stern language within it, the train on regulating data sharing may finally reach the station this year. The FTC is also likely to impose a stiff fine on Facebook for violating a consent decree. So let's learn more about this data sharing business. If you prefer a video, the gist of this post can be heard here.

***

First, let's talk about data flows and the "cloud". Data are stored on computers called servers. In the cloud computing model, these servers are owned not by the companies that collect the data but by large tech companies such as Amazon, Google, and Microsoft, which are responsible for managing them. These servers are geographically dispersed, so when data enter the cloud, they get replicated and spread across many servers. The technical benefit of such replication is recoverability of the data (allowing the use of cheaper, less reliable computers), but the data also become much harder to delete.

Data become more telling when one combines different datasets measuring different aspects of our lives. For example, an auto insurer may have data on your past claims, and those data help predict your future claims. But if the auto insurer is able to get data from, say, an automaker about your car (e.g. how fast you drive, where you drive), those data combined with past claims improve the predictive power.

Thus, a data-sharing industry has been created. Companies make agreements to share data with one another. This becomes much easier in the "cloud" as those servers are already connected to one another. These agreements may include explicit payments, but even if they don't, both sides must be benefiting commercially from the arrangement, or else the agreements would not exist.

So when company A shares data with company B, the data flow from A's servers to B's servers. B may also use a cloud, which means the data are replicated yet again and dispersed geographically onto yet another set of servers. And company B may also share data with company C, and so on.

***

An inexplicable part of the consent decree between Facebook and the FTC is the requirement that Facebook monitor what happens to the data after they are shared with third parties. I just can't figure out how that is possible. It isn't even possible within Facebook: if a user demands that his/her data be deleted, it will be very hard to ensure that all copies of the data are deleted from every server, including data that might have landed on an analyst's computer. In fact, most analysts probably don't know how many replicates of data elements are created during the analysis, or where those replicates exist!

***

The next question of general interest is all the different ways in which tech companies collect people's data without people realizing what's happening. In the video, I look at contact lists, personality tests, 2-factor authentication schemes, IoT devices, etc. in their roles as data collectors. This is the reason why the video is called "Did you betray your friend today?"
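To make the earlier point about data flows concrete, here is a minimal sketch of how copies of a single record multiply as it moves from company A to B to C, with each company's cloud replicating it. The sharing graph, the replication factor of 3, and the function name `count_copies` are all hypothetical, chosen only for illustration; real systems differ in many ways.

```python
# Assumption: each company's cloud keeps this many replicas of every record.
REPLICAS_PER_CLOUD = 3

# Assumption: hypothetical sharing agreements (A shares with B, B with C).
SHARES = {"A": ["B"], "B": ["C"], "C": []}


def count_copies(origin, shares, replicas_per_cloud):
    """Return {company: number of server copies} for one record entering at origin."""
    copies = {}
    stack = [origin]
    while stack:
        company = stack.pop()
        if company in copies:
            continue  # already holds the record; guards against loops in the graph
        copies[company] = replicas_per_cloud
        # The record flows onward to every company this one shares with.
        stack.extend(shares.get(company, []))
    return copies


copies = count_copies("A", SHARES, REPLICAS_PER_CLOUD)
total = sum(copies.values())
print(copies)  # {'A': 3, 'B': 3, 'C': 3}
print(total)   # 9
```

Even in this toy setup, one record ends up as nine copies across three companies, and a deletion request sent to A alone touches only three of them. Copies on analysts' computers are not counted at all.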

from Big Data, Plainly Spoken (aka Numbers Rule Your World) https://ift.tt/2Ej83Ub
via IFTTT
