A supervised model needs labeled training data. Because PaySim does not come with labeled clients, we adopt a working hypothesis: clients who have participated in more than 10 fraudulent transactions are very likely to be fraudsters, so we label them as fraudsters. We also label the suspects identified in the previous chapter as fraudsters.
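The labeling rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `(client_id, is_fraud)` input shape, the function name `label_fraudsters`, and the toy data are hypothetical, and the sketch counts each transaction once per client rather than handling both sides of a transfer.

```python
from collections import Counter

FRAUD_TXN_THRESHOLD = 10  # from the text: "more than 10 fraudulent transactions"

def label_fraudsters(transactions, known_suspects=()):
    """Return the set of client ids labeled as fraudsters.

    transactions: iterable of (client_id, is_fraud) pairs (hypothetical shape).
    known_suspects: client ids already flagged, e.g. in the previous chapter.
    """
    # Count only the fraudulent transactions each client took part in.
    fraud_counts = Counter(client for client, is_fraud in transactions if is_fraud)
    # Strictly more than the threshold, matching the hypothesis in the text.
    labeled = {client for client, n in fraud_counts.items() if n > FRAUD_TXN_THRESHOLD}
    return labeled | set(known_suspects)

# Toy data: "A" has 11 fraudulent transactions, "B" exactly 10, "C" none.
transactions = [("A", True)] * 11 + [("B", True)] * 10 + [("C", False)] * 5
fraudsters = label_fraudsters(transactions, known_suspects={"D"})
print(sorted(fraudsters))  # ['A', 'D']
```

Client "B" stays unlabeled because the rule is "more than 10", not "at least 10"; "D" is included only because it was passed in as a known suspect.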
On the other hand, we use the Louvain algorithm to group clients into communities. Clients whose fraudster probability is below 0.065 are labeled as non-fraudsters.
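The text does not say how the fraudster probability is computed. The sketch below assumes, purely for illustration, that it is the share of already-labeled fraudsters within a client's Louvain community. The community assignment is taken as given (it would come from running Louvain in Neo4j), and the names `label_non_frauds` and `community_of` are hypothetical.

```python
from collections import defaultdict

NON_FRAUD_THRESHOLD = 0.065  # from the text

def label_non_frauds(community_of, fraudsters):
    """Return clients in communities whose fraudster share is below the threshold.

    community_of: dict client_id -> community id (assumed output of Louvain).
    fraudsters: set of client ids already labeled as fraudsters.
    """
    members = defaultdict(list)
    for client, community in community_of.items():
        members[community].append(client)

    non_frauds = set()
    for clients in members.values():
        # Assumption: "fraudster probability" = fraction of labeled fraudsters.
        share = sum(c in fraudsters for c in clients) / len(clients)
        if share < NON_FRAUD_THRESHOLD:
            non_frauds.update(c for c in clients if c not in fraudsters)
    return non_frauds

# Toy data: community 0 has 1 fraudster in 20 clients (share 0.05),
# community 1 has 1 fraudster in 4 clients (share 0.25).
community_of = {f"c{i}": 0 for i in range(20)}
community_of.update({f"d{i}": 1 for i in range(4)})
fraudsters = {"c0", "d0"}
non_frauds = label_non_frauds(community_of, fraudsters)
print(len(non_frauds))  # 19
```

Clients in the second community are left unlabeled: they are neither known fraudsters nor confidently clean, so they do not enter the training set under this assumption.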
A node embedding algorithm computes a low-dimensional vector for each node in the graph. These vectors, also called embeddings, can be used as input features for machine learning.
Fast Random Projection (FastRP) is a node embedding algorithm from the family of random projection algorithms. Random projection can map n vectors of any dimension into O(log(n)) dimensions while approximately preserving the pairwise distances between them.
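To make the distance-preservation claim concrete, here is a minimal sketch of plain Gaussian random projection in pure Python. It is not FastRP itself, only the underlying idea; the sizes `n`, `d`, `k`, the seeds, and the helper names are arbitrary choices for illustration.

```python
import math
import random

def random_projection(points, k, seed=0):
    """Project points (lists of floats) to k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    # Entries drawn from N(0, 1/k) so squared lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [
        [sum(p[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for p in points
    ]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(42)
n, d, k = 15, 300, 100  # illustrative sizes, not tuned values
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
projected = random_projection(points, k)

# Ratio of projected to original distance for every pair of points;
# values close to 1 mean the projection distorted that distance very little.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(n)
    for j in range(i + 1, n)
]
print(f"min ratio {min(ratios):.2f}, max ratio {max(ratios):.2f}")
```

The ratios should cluster around 1, and the spread shrinks as `k` grows. The preservation is approximate, which is the trade-off for the lower dimension.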
The major difference between Neo4j ML and classic ML is that we can use Neo4j graph algorithms, such as centrality or node embedding, to add features derived from the graph. In short, we aim to improve the model's accuracy by introducing more graph-based features during feature engineering.
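As a rough sketch of what adding graph-based features means, the example below appends a centrality score and an embedding to each client's tabular feature vector. Degree centrality is used here only as a simple stand-in for the centrality algorithms available in Neo4j, and all names and values are made up for illustration.

```python
from collections import Counter

def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list: degree / (n - 1)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    denom = max(len(degree) - 1, 1)  # guard against a single-node graph
    return {node: deg / denom for node, deg in degree.items()}

def add_graph_features(tabular, centrality, embeddings):
    """Concatenate tabular features, a centrality score, and an embedding per client."""
    return {
        client: row + [centrality.get(client, 0.0)] + embeddings.get(client, [])
        for client, row in tabular.items()
    }

# Toy inputs; every value here is hypothetical.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
tabular = {"A": [120.0, 3.0], "B": [40.0, 1.0]}    # e.g. total amount, transaction count
embeddings = {"A": [0.1, -0.2], "B": [0.3, 0.0]}   # stand-ins for node embeddings
features = add_graph_features(tabular, degree_centrality(edges), embeddings)
print(features["A"])  # [120.0, 3.0, 1.0, 0.1, -0.2]
```

A classifier trained on `features` sees each client's position in the graph alongside its transaction statistics, whereas a purely tabular pipeline sees only the first two columns.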