The GDPR is the _General Data Protection Regulation_ of the European Union. Its purpose is to protect the personal data of all European residents. Protecting data is also an intrinsic value of a developer. Protecting data in a row/column data structure is relatively easy: you control access to columns and rows. But what about free text?
To fulfil our privacy requirements, we can adapt the content of a free-text field and replace privacy-related information with tags. The meaning of the text is not altered, but once it is anonymized it can no longer be related to an individual. The goal is to take the following text (the dates are written in Dutch):
The possibilities have increased since 2014, especially compared to2012, hè Kees? The system has different functions to manipulate data. The date is 12–01–2021 (or 12 jan 2021 or 12 januari 2021).
You can reach me at [email protected] and I live in Rotterdam. My address is Maasstraat 13, 1234AB. My name is Thomas de Vries and I have Acne. Oh , I use ranitidine for this.
and replace it with
The possibilities have increased since , especially compared to, hè ? The system has different functions to manipulate data. The date is (or or ).
You can reach me at and I live in . My address is , . My name is and I have . Oh , I use for this.
This article describes a simple privacy filter that will perform the following actions:
The last two are added because medical information requires extra care. Such terms will occur rarely, but the impact is large if this information leaks.
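To make the idea concrete, here is a minimal sketch of the pattern-based part of such a filter, using only Python's `re` module. It is an illustration under assumptions, not the filter this article builds: the function name `redact`, the tag names (`<EMAIL>`, `<DATE>`, `<POSTCODE>`, `<NUMBER>`) and the patterns themselves are hypothetical choices. Names, places, diseases and medicines cannot be recognized by pattern alone and would likely need lookup lists, so they are left out here.

```python
import re

# Dutch month names and common abbreviations (assumed list, not exhaustive).
DUTCH_MONTHS = (
    "januari|februari|maart|april|mei|juni|juli|augustus|september|"
    "oktober|november|december|jan|feb|mrt|apr|jun|jul|aug|sept|sep|okt|nov|dec"
)

# (tag, pattern) pairs. Order matters: the specific patterns run first so
# that the generic number pattern does not eat the digits of a date or
# postal code. All tag names are hypothetical.
PATTERNS = [
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("<DATE>", re.compile(
        r"\b\d{1,2}[-–/]\d{1,2}[-–/]\d{2,4}\b"               # 12-01-2021
        r"|\b\d{1,2} (?:" + DUTCH_MONTHS + r")\.? \d{4}\b",  # 12 jan 2021
        re.IGNORECASE)),
    ("<POSTCODE>", re.compile(r"\b\d{4} ?[A-Z]{2}\b")),      # Dutch postal code
    ("<NUMBER>", re.compile(r"\d+")),                        # any remaining digits
]


def redact(text: str) -> str:
    """Replace every match of each pattern with its tag, in order."""
    for tag, pattern in PATTERNS:
        text = pattern.sub(tag, text)
    return text


sample = (
    "Compared to2012, the date is 12–01–2021 (or 12 jan 2021 or "
    "12 januari 2021). Mail kees@example.com, Maasstraat 13, 1234AB."
)
print(redact(sample))
# Compared to<NUMBER>, the date is <DATE> (or <DATE> or <DATE>). Mail <EMAIL>, Maasstraat <NUMBER>, <POSTCODE>.
```

Because the number pattern has no word boundaries, it also catches digits glued to a word, as in "to2012". The price of such broad patterns is over-matching: every number is replaced, whether or not it is privacy sensitive.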