TY - JOUR
T1 - More hybrid and secure protection of statistical data sets
AU - Herranz, Javier
AU - Nin, J.
AU - Solé, Marc
N1 - Funding Information:
funded by the ESF, from Spanish MICINN Ministry. Partial support by the Spanish MEC (projects ARES - CONSOLIDER INGENIO 2010 CSD2007-00004 - eAEGIS - TSI2007-65406-C03-02 and MTM2009-07694) is acknowledged.
PY - 2012
Y1 - 2012
N2 - Different methods and paradigms have been proposed and studied for protecting data sets that contain sensitive statistical information. The idea is to publish a perturbed version of the data set that does not leak confidential information but still allows users to obtain meaningful statistical values about the original data. The two main paradigms for data set protection are the classical one and the synthetic one. Recently, the possibility of combining the two paradigms into a hybrid paradigm has been considered. In this work, we first analyze the security of some synthetic and (partially) hybrid methods proposed in recent years, and we conclude that they suffer from a high interval disclosure risk. We then propose the first fully hybrid statistical disclosure control (SDC) methods; unfortunately, they also suffer from a fairly high interval disclosure risk. To mitigate this, we propose a postprocessing technique that can be applied to any data set protected with a synthetic method, with the goal of reducing its interval disclosure risk. Throughout the paper, we describe a set of experiments performed on reference data sets that support our claims.
AB - Different methods and paradigms have been proposed and studied for protecting data sets that contain sensitive statistical information. The idea is to publish a perturbed version of the data set that does not leak confidential information but still allows users to obtain meaningful statistical values about the original data. The two main paradigms for data set protection are the classical one and the synthetic one. Recently, the possibility of combining the two paradigms into a hybrid paradigm has been considered. In this work, we first analyze the security of some synthetic and (partially) hybrid methods proposed in recent years, and we conclude that they suffer from a high interval disclosure risk. We then propose the first fully hybrid statistical disclosure control (SDC) methods; unfortunately, they also suffer from a fairly high interval disclosure risk. To mitigate this, we propose a postprocessing technique that can be applied to any data set protected with a synthetic method, with the goal of reducing its interval disclosure risk. Throughout the paper, we describe a set of experiments performed on reference data sets that support our claims.
KW - Statistical data sets protection
KW - hybrid methods
KW - interval disclosure risk
KW - synthetic methods
UR - http://www.scopus.com/inward/record.url?scp=84864753381&partnerID=8YFLogxK
U2 - 10.1109/TDSC.2012.40
DO - 10.1109/TDSC.2012.40
M3 - Article
AN - SCOPUS:84864753381
SN - 1545-5971
VL - 9
SP - 727
EP - 740
JO - IEEE Transactions on Dependable and Secure Computing
JF - IEEE Transactions on Dependable and Secure Computing
IS - 5
M1 - 6189360
ER -