
ARLib

An open-source framework for conducting data poisoning attacks on recommender systems, designed to assist researchers and practitioners. This repo is released with our survey paper on poisoning attacks against recommender systems.

Members:
Zongwei Wang, Chongqing University, China, [email protected]
Hao Ma, Chongqing University, China, [email protected]
Chenyu Li, Chongqing University, China, [email protected]

Supported by:
Prof. Min Gao, Chongqing University, China, [email protected]
ARC Training Centre for Information Resilience (CIRES), University of Queensland, Australia

Framework

(Framework overview figure)

Usage

  1. Two configuration files, attack_parser.py and recommend_parser.py, live in the conf directory; select and configure the recommendation model and the attack model by editing them (see the sketch after these steps).
  2. Run main.py.
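
For orientation, a parser file in conf typically defines command-line style options. Below is a minimal sketch assuming the parsers are built on argparse; the flag names and defaults are hypothetical, not ARLib's actual options.

import argparse

# Hypothetical excerpt of conf/recommend_parser.py -- flag names are illustrative only.
def recommend_parse_args():
    parser = argparse.ArgumentParser(description='recommendation model settings')
    parser.add_argument('--model_name', type=str, default='LightGCN')  # which recommender to train
    parser.add_argument('--dataset', type=str, default='ml-1M')        # dataset directory name
    parser.add_argument('--epochs', type=int, default=100)             # training epochs
    parser.add_argument('--lr', type=float, default=0.001)             # learning rate
    return parser.parse_args()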

Implemented Models

| Recommend Model | Paper |
| --- | --- |
| GMF | Koren et al. Matrix Factorization Techniques for Recommender Systems, IEEE Computer'09. |
| WRMF | Hu et al. Collaborative Filtering for Implicit Feedback Datasets, ICDM'08. |
| NCF | He et al. Neural Collaborative Filtering, WWW'17. |
| NGCF | Wang et al. Neural Graph Collaborative Filtering, SIGIR'19. |
| LightGCN | He et al. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation, SIGIR'20. |
| SSL4Rec | Yao et al. Self-supervised Learning for Large-scale Item Recommendations, CIKM'21. |
| NCL | Lin et al. Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning, WWW'22. |
| SGL | Wu et al. Self-supervised Graph Learning for Recommendation, SIGIR'21. |
| SimGCL | Yu et al. Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation, SIGIR'22. |
| XSimGCL | Yu et al. XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation, TKDE'23. |

| Attack Model | Paper | Case |
| --- | --- | --- |
| NoneAttack | N/A | Black |
| RandomAttack | Lam et al. Shilling Recommender Systems for Fun and Profit, WWW'04. | Black |
| BandwagonAttack | Gunes et al. Shilling Attacks against Recommender Systems: A Comprehensive Survey, Artif. Intell. Rev.'14. | Black |
| AUSH | Lin et al. Attacking Recommender Systems with Augmented User Profiles, CIKM'20. | Gray |
| LegUP | Lin et al. Shilling Black-box Recommender Systems by Learning to Generate Fake User Profiles, IEEE Transactions on Neural Networks and Learning Systems'22. | Gray |
| GOAT | Wu et al. Ready for Emerging Threats to Recommender Systems? A Graph Convolution-based Generative Shilling Attack, Information Sciences'21. | Gray |
| FedRecAttack | Rong et al. FedRecAttack: Model Poisoning Attack to Federated Recommendation, ICDE'22. | Gray |
| A_ra | Rong et al. Poisoning Deep Learning Based Recommender Model in Federated Learning Scenarios, IJCAI'22. | Gray |
| PGA | Li et al. Data Poisoning Attacks on Factorization-based Collaborative Filtering, NIPS'16. | White |
| DL_Attack | Huang et al. Data Poisoning Attacks to Deep Learning Based Recommender Systems, arXiv'21. | White |
| PipAttack | Zhang et al. PipAttack: Poisoning Federated Recommender Systems for Manipulating Item Promotion, WSDM'22. | Gray |
| RAPU | Zhang et al. Data Poisoning Attack against Recommender System Using Incomplete and Perturbed Data, KDD'21. | White |
| PoisonRec | Song et al. PoisonRec: An Adaptive Data Poisoning Framework for Attacking Black-box Recommender Systems, ICDE'21. | Black |
| CLeaR | Wang et al. Poisoning Attacks against Contrastive Recommender Systems, arXiv'23. | White |
| GTA | Wang et al. Revisiting Data Poisoning Attacks on Deep Learning Based Recommender Systems, ISCC'23. | Black |

Implement Your Model

Decide whether you want to implement an attack model or a recommendation model, then add your file under the corresponding directory.

If you implement an attack method, make sure to:

  1. Decide whether you need gradient information from the recommender model; if so, set self.recommenderGradientRequired=True.
  2. Decide whether you need access to the trained recommender model; if so, set self.recommenderModelRequired=True.
  3. Reimplement the function posionDataAttack() (a skeleton is sketched below).
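
A minimal attack-model skeleton is sketched below. The constructor arguments and the self.interact and self.targetItem attributes are assumptions inferred from the RandomAttack code quoted in the issues further down, not the exact ARLib base-class API.

import numpy as np
from scipy.sparse import csr_matrix, vstack

class MyAttack:
    def __init__(self, arg, data):
        self.interact = data.interact          # clean user-item matrix (assumed attribute)
        self.targetItem = arg.targetItem       # ids of items to promote (assumed attribute)
        self.recommenderGradientRequired = False   # True if the attack needs model gradients
        self.recommenderModelRequired = False      # True if the attack needs the trained model

    def posionDataAttack(self):
        # one fake user who rates every target item with value 1;
        # real attacks would add filler items and more fake users
        row = [0] * len(self.targetItem)
        col = list(self.targetItem)
        entries = [1.0] * len(self.targetItem)
        fake = csr_matrix((entries, (row, col)), shape=(1, self.interact.shape[1]), dtype=np.float32)
        return vstack([self.interact, fake])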

If you implement a recommender method, reimplement the following functions (a skeleton is sketched after this list):

  • init()
  • posionDataAttack()
  • save()
  • predict()
  • evaluate()
  • test()
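
A bare recommender skeleton following this list is sketched below; the signatures are assumptions, since ARLib's base class is not reproduced in this README.

class MyRecommender:
    def __init__(self, arg, data):
        self.data = data  # training interactions

    def posionDataAttack(self, poisoned_interactions):
        # (re)train the model on the poisoned interaction matrix
        raise NotImplementedError

    def save(self):
        # persist the best model parameters
        raise NotImplementedError

    def predict(self, user):
        # return predicted scores over all items for the given user
        raise NotImplementedError

    def evaluate(self):
        # ranking metrics (e.g., Hit Ratio, NDCG) on validation data
        raise NotImplementedError

    def test(self):
        # ranking metrics on the test split
        raise NotImplementedError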

Download Dataset

Baidu Disk
Link: https://pan.baidu.com/s/1Gw0SI_GZsykPQEngiMvZgA?pwd=akgm
Key: akgm

Google Drive
Link: https://drive.google.com/drive/folders/1QLDickAMEuhi8mUOyAa66dicCTd40CG5?usp=sharing

Requirements

base==1.0.4
numba==0.53.1
numpy==1.18.0
scipy==1.4.1
torch==1.7.1

Reference

If you find this repo helpful to your research, please cite our paper.

@article{wang2024poisoning,
  title={Poisoning Attacks against Recommender Systems: A Survey},
  author={Wang, Zongwei and Gao, Min and Yu, Junliang and Ma, Hao and Yin, Hongzhi and Sadiq, Shazia},
  journal={arXiv preprint arXiv:2401.01527},
  year={2024}
}


Issues

Missing poisoning attack evaluation?

Thanks for the code!

I wonder whether there is an evaluation of the attack result, for example, the recommendation rate of the target item. I have run the code and did not find any output for it.

Wondering why only ratings >= 4 are kept in ml-1m

Hi, thanks for the effort and the interesting project. When I tried to run some experiments on the ML-1M dataset, I found that the dataset is not as big as the raw data.

After tracing your code, I found that in "ARLib/data/clean/ml-1M/split.py" there is an if-else that keeps only ratings of 4 or higher:

with open('ratings.dat') as f:
    for line in f:
        items = line.strip().split('::')
        new_line = ' '.join(items[:-1])+'\n'
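        # items[-2] is the rating field; ratings below 4 are dropped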
        if int(items[-2])<4:
            continue
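        # surviving lines are split roughly 80/10/10 into train/val/test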
        num=random.random()
        if num > 0.2:
            train.append(new_line)
        elif num > 0.1:
            val.append(new_line)
        else:
            test.append(new_line)

And I'm wondering why you do that.
Thanks again for collecting these models and attack methods; it helps me a lot!

Some questions about RandomAttack

Thanks for your repo. It helps me a lot.
I found that you give the items that will be attacked a rating of 1.0 in /ARLib/attack/Black/RandomAttack.py:

def posionDataAttack(self):
        uNum = self.fakeUserNum
        row, col, entries = [], [], []
        for i in range(uNum):
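            # each fake user rates sampled filler items plus every target item with value 1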
            # fillerItemid = random.sample(set(range(self.itemNum)) - set(self.targetItem),
            #                              self.maliciousFeedbackNum - len(self.targetItem))
            fillerItemid = random.sample(set(range(self.itemNum)) - set(self.targetItem),
                                self.maliciousFeedbackNum)
            row += [i for r in range(len(fillerItemid + self.targetItem))]
            col += fillerItemid + self.targetItem
            entries += [1 for r in range(len(fillerItemid + self.targetItem))]
        fakeRat = csr_matrix((entries, (row, col)), shape=(uNum, self.itemNum), dtype=np.float32)
        return vstack([self.interact, fakeRat])

We suppose this means that you give the attacked items a rating of 1.0.

However, I found that you also give all items in the raw data a rating of 1.0 in /ARLib/util/DataLoader.py:

def __create_sparse_interaction_matrix(self):
        """
        return a sparse adjacency matrix with the shape (user number, item number)
        """
        row, col, entries = [], [], []
        for pair in self.training_data:
            row += [self.user[pair[0]]]
            col += [self.item[pair[1]]]
            entries += [1.0]
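            # every observed interaction is binarized to 1.0, regardless of its raw rating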
        interaction_mat = sp.csr_matrix((entries, (row, col)), shape=(self.user_num,self.item_num),dtype=np.float32)
        return interaction_mat

Hence, our poisoned training data becomes this:
(screenshot of the poisoned interaction data, where every entry is 1.0)

I'm wondering why you copy the raw data and rewrite its ratings to 1.0 in the poisoned data instead of keeping the original ratings. For example, in clean/train.txt:
(screenshot of clean/train.txt, showing the original ratings)

In clean/train.txt there are various ratings such as 1, 2, 3, 4, and 5, but in the poisoned data there is only 1.0.
Thanks for your work and your patience. Looking forward to your response!

Errors with the PGA and RAPU_G attacks

Thanks for the effort and the interesting project. When I tried to test the PGA and RAPU_G attacks on the SGL model, I met the following errors:

For the PGA attack, I met three errors:

  1. Error: TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0].
    s = torch.tensor(self.interact[self.controlledUser, :]).cuda()

I tried to change it to s = torch.tensor(self.interact[self.controlledUser, :].toarray()).cuda() and met the following error.

  2. Error: ValueError: could not broadcast input array from shape (9,1412) into shape (942,1412)
    ui_adj[:self.userNum, self.userNum:] = np.array(s.cpu())

I tried to change it to ui_adj[self.controlledUser, self.userNum:] = np.array(s.cpu()) and met the following error.

  3. Error: File "./attack/PGA.py", line 51, in posionDataAttack
    recommender.model._init_uiAdj(ui_adj + ui_adj.T)
    File "./recommend/SGL.py", line 245, in _init_uiAdj
    self.sparse_norm_adj = TorchGraphInterface.convert_sparse_mat_to_tensor(self.sparse_norm_adj).cuda()
    File "./recommend/SGL.py", line 334, in convert_sparse_mat_to_tensor
    coo = X.tocoo()
    AttributeError: 'numpy.ndarray' object has no attribute 'tocoo'

The call recommender.model._init_uiAdj(ui_adj + ui_adj.T) leads into:

ARLib/recommend/SGL.py

Lines 243 to 245 in edc4cff

self.sparse_norm_adj = sp.diags(np.array((1 / np.sqrt(ui_adj.sum(1)))).flatten()) @ ui_adj @ sp.diags(
    np.array((1 / np.sqrt(ui_adj.sum(0)))).flatten())
self.sparse_norm_adj = TorchGraphInterface.convert_sparse_mat_to_tensor(self.sparse_norm_adj).cuda()

ARLib/recommend/SGL.py

Lines 310 to 314 in edc4cff

def convert_sparse_mat_to_tensor(X):
    coo = X.tocoo()
    i = torch.LongTensor([coo.row, coo.col])
    v = torch.from_numpy(coo.data).float()
    return torch.sparse.FloatTensor(i, v, coo.shape)

I tried to add a line after

ARLib/recommend/SGL.py

Lines 243 to 244 in edc4cff

self.sparse_norm_adj = sp.diags(np.array((1 / np.sqrt(ui_adj.sum(1)))).flatten()) @ ui_adj @ sp.diags(
    np.array((1 / np.sqrt(ui_adj.sum(0)))).flatten())

self.sparse_norm_adj = sp.coo_matrix(self.sparse_norm_adj)

but found that the training loss returned by grad = recommender.train(requires_adjgrad=True, Epoch=self.attackEpoch) became nan:

training: 1 batch 0 rec_loss: nan cl_loss nan
training: 1 batch 100 rec_loss: nan cl_loss nan
evaluating the model...
Progress: [++++++++++++++++++++++++++++++++++++++++++++++++++]100%
Quick Ranking Performance (Top-50 Item Recommendation)
Current Performance
Epoch: 1, Hit Ratio:0.11035349004127042 | Precision:0.014352392065344223 | Recall:0.11574197836148567 | NDCG:0.053679494051791274
Best Performance
Epoch: 1, Hit Ratio:0.11035349004127042 | Precision:0.014352392065344223 | Recall:0.11574197836148567 | NDCG:0.053679494051791274
./util/DataLoader.py:77: RuntimeWarning: divide by zero encountered in power
d_inv = np.power(rowsum, -0.5).flatten()
./util/DataLoader.py:77: RuntimeWarning: divide by zero encountered in power
d_inv = np.power(rowsum, -0.5).flatten()
training: 2 batch 0 rec_loss: nan cl_loss nan
training: 2 batch 100 rec_loss: nan cl_loss nan
evaluating the model...
Progress: [++++++++++++++++++++++++++++++++++++++++++++++++++]100%
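
For what it's worth, the divide-by-zero warnings above suggest zero-degree rows in the adjacency matrix after poisoning. A guard like the following (my own assumption, not a verified fix) would at least stop the resulting inf values from propagating into the loss:

import numpy as np

# hypothetical guard around the d_inv computation in util/DataLoader.py;
# rowsum is the per-node degree vector computed there
with np.errstate(divide='ignore'):
    d_inv = np.power(rowsum, -0.5).flatten()
d_inv[np.isinf(d_inv)] = 0.0   # zero-degree nodes get weight 0 instead of inf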

Also, I tried FedRecAttack, which also calls the recommender.model._init_uiAdj() function, and it works without error. So I guess there is something wrong with the PGA code, but I could not find the root of the problem.

For the RAPU_G attack, the "higher" module is missing from the requirements, which conflicts with

import higher

Moreover, I think it would be awesome if you could provide a table of the attack effectiveness of the included attacks for reference. That way, users could tell whether they are using the repository correctly.
