The year starts out with fun times in Privacy and Surveillance. As the 2020 Year in Review made clear, from a data perspective the space is split into consumer privacy and government surveillance, each with a different intent and method of obtaining our information. (Though they do start to come together when intelligence agencies buy private location data.)
In Consumer Privacy, the buying and selling of data has become increasingly prevalent in our day-to-day lives as entire companies and processes have been built on this marketplace.
Fines. Grindr was fined $11.7M under GDPR for tagging and selling data without users' consent. The fine was levied by the Norwegian DPA, and is especially consequential for protecting users in countries hostile to LGBTQ people.
More Fines. Epsilon was fined $150M by the US Department of Justice for selling data on millions of consumers that was then used in fraud schemes.
What is Personal Info anyway? California's CPRA (California Privacy Rights Act) is updating the definition of personal information to be more specific in some cases, and broader regarding publicly available personal information.
Apple swinging its weight. Apple plans to enforce ad tracking transparency this spring, continuing the fight against big advertising platforms. The upshot is that they plan to make device-level tracking (IDFA) opt-in, and without that opt-in, companies can't tie together activity on the same phone.
Privacy simplified. Apple also released an easy-to-understand "A Day in the Life of Your Data" pamphlet that shows all the data tracking that happens on a simple father-daughter trip to the park.
In Government Surveillance, the government has connected with private vendors to establish facial recognition capabilities. It's still early on, but this is where lines are being drawn (or are being pushed every single day, depending on your perspective).
Who’s skirting the law now? The Markup reports that police have ways to work around facial recognition bans through technicalities such as using the data and software from neighboring police forces. (Also: Minneapolis is the next city to go for a facial recognition ban)
Can't have too much legislation. Fast Company reports on the efforts of Amnesty International in advocating for facial recognition bans in NYC. Great read on grassroots efforts.
What's Scary is in the Military. Wired reports on Palantir's God's-Eye view of Afghanistan. Using aerostat balloons, military intel teams used ABI (activity-based intelligence) to identify individuals based on daily routines + facial recognition, and used it to cover troop patrols across multiple cities. We can hope/fight to make sure that doesn't happen to US citizens.
What's Scarier is Emotion Detection. A city in India (Lucknow, UP) is set to deploy cameras that use facial recognition to detect harassment. By reading emotions through facial recognition, the cameras are claimed to be able to identify when women are in distress. This is a pretty huge red flag, both for what it could lead to and for the consequences of a bad implementation.
Boring but important. Europe Convention 108 is drafting facial recognition guidelines. A pretty generic report, probably, but shows movement and a summary of the space.
As privacy-invasive processes start to creep into our lives, the worst case is a slow shift in our own behavior toward being more conservative, more self-conscious, and more paranoid overall. We are already seeing consumers protect themselves by moving from WhatsApp to Telegram/Signal and to privacy-focused search engines like DuckDuckGo, but it will be a multi-dimensional struggle for years to come.
Small Bytes
Airbnb talks about designing guardrails for experimentation. A great framework and way to think about decision making once an experiment is actually being run.
Elastic is in a drama-filled battle with Amazon as the Elasticsearch parent changed the ES licensing to a much more restrictive model. This was in order to combat what it perceives as Amazon's predatory usage of open source technology.
PrestoSQL is renamed to Trino by the Trino team in order to work around Facebook's licensing of the Presto name.
The Shopify team writes on how to make dashboards using a product-thinking approach. A dashboard for end users is a tiny product, with different audiences, usage patterns, and goals. I think this is part of the future, as these modular interfaces are micro-products for others.
Facebook AI releases Multilingual LibriSpeech - 50,000 hours of audio across 8 languages.
Maia is a chess engine that plays more like a human. The neural network tries to emulate human moves, unlike engines such as Stockfish and Leela, which aim for the strongest move (Leela learned by playing against itself).
Uber talks about Metric Standardization through uMetric. The blog post talks about Uber's adventure in building its custom metrics platform and how it thought about the user flow throughout the process.
Carbon Emissions vs. Car Cost - the NYT does a great post on carbon emissions and how they relate to car prices when including maintenance and fuel.
Our World in Data has a pretty great COVID vaccinations data explorer. It’s interactive, and shows just how far ahead Israel is of everyone else.
Corpo Updates
There are always tons of updates on launches, releases, and tips so this is a new section for company PR that I find interesting. Feel free to send tips!
Dagster releases 0.10.0 with exactly-once semantics, sensors, and improved infra (link)
How to simplify queries with Snowflake's new QUALIFY clause (basically a HAVING for window functions) (link)
Databricks blogs on Delta Lake for real-time financial use cases (link)
What's new in Tableau 2020.4 (link)
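For anyone who hasn't seen QUALIFY before, here is a minimal sketch of the idea in plain Python rather than SQL. The sample rows, the `orders` name, and the `latest_per_customer` helper are all made up for illustration; the Snowflake query in the comment is roughly what the code imitates, under the assumption of a table with those three columns.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (customer, order_date, amount)
orders = [
    ("alice", "2021-01-03", 40),
    ("alice", "2021-01-20", 15),
    ("bob", "2021-01-11", 70),
    ("bob", "2021-01-05", 25),
]

# Roughly the Snowflake query being imitated (table and column names assumed):
#   SELECT customer, order_date, amount
#   FROM orders
#   QUALIFY ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date DESC) = 1;


def latest_per_customer(rows):
    """Keep the row whose row number is 1 within each customer partition."""
    kept = []
    # PARTITION BY customer: sort first so groupby sees contiguous groups
    by_customer = sorted(rows, key=itemgetter(0))
    for _, group in groupby(by_customer, key=itemgetter(0)):
        # ORDER BY order_date DESC, then number the rows starting at 1
        ranked = enumerate(sorted(group, key=itemgetter(1), reverse=True), start=1)
        # QUALIFY ... = 1: filter on the window function's result
        kept.extend(row for row_number, row in ranked if row_number == 1)
    return kept


latest = latest_per_customer(orders)
```

Without QUALIFY, the same filter usually means wrapping the window function in a subquery or CTE and filtering in an outer WHERE, which is the boilerplate the clause removes.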
Industry and Fundraising
Author's Note: I missed several weeks, so I'm including some of the notable fundraises from throughout January here as well.
Kili Technology - $7M for data labeling and annotation
Wingcopter - $22M for delivery drones up to 120km range
Sitetracker - $42M to track infrastructure projects
Chronosphere - $43M for cloud monitoring with Prometheus + Grafana
Pax8 - $96M for platform to deploy cloud services for customers
Quantum Metric - $200M for feature replay and product usage analytics
Cockroach Labs - $160M for distributed production database CockroachDB
Equifax acquires Kount for $640M for AI-driven fraud prevention tools
Snap acquires Ariel AI for 3D rendering of human models on the phone
SAP acquires Signavio for analyzing business processes
Qualcomm to acquire Nuvia for 5G capabilities
“This Week in Data” is a weekly newsletter to help you stay up to date with developments in the data ecosystem. My goal is to bring broader data trends into focus for data professionals and enthusiasts who are interested in data and its applications. Topics include infrastructure, AI/ML, experimentation, analytics/BI, privacy, and security.