Summary: Data intensive large-scale distributed systems like peer-to-peer (P2P) networks are becoming increasingly popular where centralization of data is impossible for mining and analysis. Unfortunately, most of the existing data mining algorithms work only when the data can be accessed in its entirety. Finding all the network-wide frequent item sets is computationally difficult and usually has large communication overhead in such environment. This paper focuses on developing a communication efficient algorithm for discovering frequent item sets from a P2P network. A sampling-based approach is adopted to find approximate solutions instead of exact solutions with probabilistic guarantee. The benefit of the approximation technique is reflected in the low communication overhead in discovering the majority of frequent item sets with probabilistic guarantee. The main principle followed by the algorithm assumes that an independent and identically distributed (iid) sample of the entire data is available at one location to generate a set of candidate item sets. Collecting iid samples from a P2P network is a challenging problem because of varying degrees of connectivity and sizes of the data shared. The paper first addresses this issue and shows how an iid sample of nodes and data can be collected from a P2P network using random walks. It applies the proposed sampling technique to identify most of the frequent item sets from a P2P network. Theoretical analysis shows how to decide about the optimum sample size and minimize communication to compute the results. Experimental results show that the proposed algorithm discovers all of the network-wide frequent item sets using communication that scales sublinearly with the network and data size.