<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<item>
  <id>05758559</id>
  <dt>j</dt>
  <an>05758559</an>
  <augroup>
    <au>Matatov, Nissim</au>
    <au>Rokach, Lior</au>
    <au>Maimon, Oded</au>
  </augroup>
  <ti>Privacy-preserving data mining: a feature set partitioning approach.</ti>
  <so>Inf. Sci. 180, No. 14, 2696-2720 (2010).</so>
  <py>2010</py>
  <pu>Elsevier Science Inc. (North-Holland), New York, NY</pu>
  <lagroup>
    <la>EN</la>
  </lagroup>
  <ccgroup>
  </ccgroup>
  <utgroup>
    <ut>data mining</ut>
    <ut>privacy</ut>
    <ut>genetic algorithms</ut>
    <ut>$k$-anonymity</ut>
    <ut>feature set partitioning</ut>
  </utgroup>
  <cigroup>
  </cigroup>
  <ligroup>
    <li>doi:10.1016/j.ins.2010.03.011</li>
  </ligroup>
  <abgroup>
    <ab>Summary: In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on $k$-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than $k$ subjects. The most common approach for achieving compliance with $k$-anonymity is to replace certain values with less specific but semantically consistent values. In this paper we propose a different approach for achieving $k$-anonymity by partitioning the original dataset into several projections such that each one of them adheres to $k$-anonymity. Moreover, any attempt to rejoin the projections results in a table that still complies with $k$-anonymity. A classifier is trained on each projection, and an unlabelled instance is subsequently classified by combining the classifications of all classifiers. Guided by classification accuracy and $k$-anonymity constraints, the proposed data mining privacy by decomposition (DMPD) algorithm uses a genetic algorithm to search for optimal feature set partitioning. Ten separate datasets were evaluated with DMPD in order to compare its classification performance with other $k$-anonymity-based methods. The results suggest that DMPD performs better than existing $k$-anonymity-based algorithms and that there is no need to apply domain-dependent knowledge. Using multiobjective optimization methods, we also examine the tradeoff between the two conflicting objectives in PPDM: privacy and predictive performance.</ab>
    <rv></rv>
  </abgroup>
</item>