Hi, I have a few questions about the code in impl_als_csr_distr.cpp.

questions about impl_als_csr_distr.cpp about onedal HOT 6 CLOSED

oneapi-src commented on June 1, 2024

questions about impl_als_csr_distr.cpp

from onedal.

Comments (6)

Vika-F commented on June 1, 2024

@fleapapa
A1: Yes, you need to split the data set into non-overlapping blocks before using the distributed version of implicit ALS.

A2: With Intel DAAL you can verify the trained model by computing, for example, the RMSE on the training data set itself, that is, the RMSE between the training ratings and the model's predictions.
I have attached an example that shows the flow of the computations. See the testModelQuality() function in the attached example.
To align the computations with MLlib, you may also provide a separate test data set for the RMSE computation. To do this, replace transposedDataTable[0], …, transposedDataTable[nBlocks - 1] with numeric tables that contain the test ratings in CSR format.

impl_als_csr_distr_verify.zip

Best regards,
Victoriya


fleapapa commented on June 1, 2024

Thanks for the example code! It's very helpful.

Regarding

> To align the computations with MLLib you may also provide test data set for RMSE computation. To do this, please replace transposedDataTable[0], …, transposedDataTable[nBlocks - 1] with the numeric tables that contain test ratings in CSR format.

Why are the test ratings placed only in transposedData, rather than in both data and transposedData?


fleapapa commented on June 1, 2024

Hi Victoriya,

In your example code, I found two undefined member functions:

    size_t *colIndices = sparseBlock.getBlockColumnIndicesSharedPtr().get();
    size_t *rowOffsets = sparseBlock.getBlockRowIndicesSharedPtr().get();

The two functions, getBlockColumnIndicesSharedPtr and getBlockRowIndicesSharedPtr, seem to be available only in the 2018 Beta. I am using the 2017 release. May I just copy the latest header file from https://github.com/01org/daal/blob/92f4dde5a1e2d7f132111588f4513cc7c4578052/include/data_management/data/csr_numeric_table.h without any negative impact on my application?


Vika-F commented on June 1, 2024

@fleapapa

> why are test ratings placed in transposedData instead of both data and transposedData?

Both data and transposedData arrays define the same distributed numeric table. In the data array the table is split by rows (users), and in the transposedData array the table is split by columns (items). The code I have provided uses transposedData as the ground truth in the testModelQuality() function. That is why to test the quality of the trained model you need only the transposedData.

> May i just copy the latest header file

It would be better not to copy a header file, but to modify the example to make it work with DAAL 2017. Please replace those two lines of code with the following code:

    size_t *colIndices = sparseBlock.getBlockColumnIndicesPtr();
    size_t *rowOffsets = sparseBlock.getBlockRowIndicesPtr();

Best regards,
Victoriya


fleapapa commented on June 1, 2024

Victoriya,

Thanks for the replacement code. It works:)

However, my app then crashed with an error in a call to free(), although I never call free() myself. If I comment out testModelQuality() (and thus mergePredictions() too), there is no crash.

I'm investigating the crash and found that it is most likely caused by an incorrect shape of the matrix 'predictions'. I added some logging messages, which show the following:

predictedRatings[0][0]: 1360, 2500
predictedRatings[0][1]: 1360, 2500
predictedRatings[0][2]: 1360, 2500
predictedRatings[0][3]: 1358, 2500
predictedRatings[1][0]: 1360, 2500
predictedRatings[1][1]: 1360, 2500
predictedRatings[1][2]: 1360, 2500
predictedRatings[1][3]: 1358, 2500
predictedRatings[2][0]: 1360, 2500
predictedRatings[2][1]: 1360, 2500
predictedRatings[2][2]: 1360, 2500
predictedRatings[2][3]: 1358, 2500
predictedRatings[3][0]: 1360, 2499
predictedRatings[3][1]: 1360, 2499
predictedRatings[3][2]: 1360, 2499
predictedRatings[3][3]: 1358, 2499

whereas 'predictions' is allocated with a shape of (5438, 9998). I don't know why it is not 9999, because my input matrix has a shape of (5438, 9999).

However, even if I manually change the statement

    HomogenNumericTable<float> predictions(nItems, nUsers, NumericTable::doAllocate);

to

    HomogenNumericTable<float> predictions(9999, nUsers, NumericTable::doAllocate);

the code still crashes.

By the way, the final RMSE is 0.79, which is unreasonably high (with Spark ML, it is only 0.11).

Most likely an incorrect matrix shape is the culprit behind these issues. I'm hunting it down...


fleapapa commented on June 1, 2024

Victoriya,

After making the following two changes to your example code, I finally got it working:

Use dataTables[] instead of transposedDataTables[] in testModelQuality()
Modify mergePredictions() according to the shapes of predictedRatingsMaster[][]

The RMSE only dropped to 0.32 (from 0.79) and is still higher than the one obtained with Spark ML, but I am very happy because my app now works without crashing. And it's much faster than PySpark!

I will port the 'model selection' Python code I used with Spark to my DAAL ALS C++ app and see whether tuning some hyper-parameters gets me an RMSE close to 0.11 :)

I have attached my changed code, FYI.
my-daal-als-changes.txt

I'm closing this issue.

Many thanks again!
