Backfill Historical Data

Sending historical data is a free and fast way to improve the accuracy of your fraud predictions.

Send Historical User Data

Send 6-12 months of key events for all users (good and bad). Most customers send, when relevant, $create_order, $transaction, $create_account, and $create_content.

Include:

  • $time the UNIX timestamp at which the event occurred (in milliseconds as an integer) - REQUIRED.
  • $ip the IP of the user at the time of the event (if available).
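
As a rough illustration of the two fields above, the sketch below maps a stored record to event properties. The `record` layout and the `build_backfill_properties` helper are hypothetical and not part of any Sift library; only the `$`-prefixed field names come from this page.

```python
def build_backfill_properties(record):
    """Map a hypothetical stored record to Sift event properties."""
    time_ms = record["occurred_at_ms"]
    # $time is required for backfilling and must be an integer in milliseconds.
    if isinstance(time_ms, bool) or not isinstance(time_ms, int):
        raise ValueError("$time must be an integer number of milliseconds")

    properties = {
        "$user_id": record["user_id"],
        "$time": time_ms,
    }
    # $ip is optional: include it only if it was recorded at the time.
    if record.get("ip"):
        properties["$ip"] = record["ip"]
    return properties


example = build_backfill_properties(
    {"user_id": "billy_jones_301", "occurred_at_ms": 1456274104243, "ip": "54.208.214.78"}
)
```

The resulting dictionary could then be passed as the `properties` argument of a `track` call for the matching event type.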

NOTE:

  1. Backfill your production account after testing in the sandbox.
  2. Be aware of rate limits and build in a retry.
  3. While we use up to 12 months of historical data, users with activity older than 30 days will not show up in the console.
  4. The purpose of sending historical data is to improve the accuracy of your live predictions. To this end, we process historical data differently than live data. If you want to do analysis of Sift Scores, you can do this against live data.
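
One way to build in the retry mentioned in note 2 is exponential backoff with jitter. This is a minimal sketch under stated assumptions: `send` is a hypothetical zero-argument callable that you supply, returning True on success and False when the request should be retried (for example, after a rate-limit response). How a rate-limited response is detected depends on the client library you use and is not shown here.

```python
import random
import time


def send_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns True or the attempts run out.

    `send` is a hypothetical callable: True means success, False means
    "try again later".
    """
    for attempt in range(max_attempts):
        if send():
            return True
        if attempt < max_attempts - 1:
            # The base delay doubles on each attempt; random jitter of up
            # to half the delay keeps parallel workers from retrying in sync.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return False
```

A caller might wrap each event as `send_with_retry(lambda: client.track("$create_account", properties).is_ok())`, assuming the response object exposes a success check of that kind.
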
// Sample $create_account event
{
  // Required for backfilling
  // UNIX timestamp in milliseconds as an integer
  "$time" : 1456274104243, // Feb 24 2016 00:35:04 UTC

  "$type"             : "$create_account",
  "$api_key"          : "YOUR_API_KEY",
  "$user_id"          : "billy_jones_301",
  "$user_email"       : "bill@gmail.com",
  "$name"             : "Bill Jones",
  "$phone"            : "1-415-555-6040",
  "$ip"               : "54.208.214.78"
}
import sift

client = sift.Client(api_key='{apiKey}', account_id='{accountId}')

# Sample $create_account event
properties = {
  # Required for backfilling
  # UNIX timestamp in milliseconds as an integer
  "$time" : 1456274104243, # Feb 24 2016 00:35:04 UTC

  "$user_id"          : "billy_jones_301",
  "$user_email"       : "bill@gmail.com",
  "$name"             : "Bill Jones",
  "$phone"            : "1-415-555-6040",
  "$ip"               : "54.208.214.78"
}

response = client.track("$create_account", properties)
require "sift"

client = Sift::Client.new(:api_key => "YOUR_API_KEY")

# Sample $create_account event
properties = {
  # Required for backfilling
  # UNIX timestamp in milliseconds as an integer
  "$time" => 1456274104243, # Feb 24 2016 00:35:04 UTC

  "$user_id"          => "billy_jones_301",
  "$user_email"       => "bill@gmail.com",
  "$name"             => "Bill Jones",
  "$phone"            => "1-415-555-6040",
  "$ip"               => "54.208.214.78"
}

response = client.track("$create_account", properties)
require 'sift-php/lib/Services_JSON-1.0.3/JSON.php';
require 'sift-php/lib/SiftRequest.php';
require 'sift-php/lib/SiftResponse.php';
require 'sift-php/lib/SiftClient.php';
require 'sift-php/lib/Sift.php';

$client = new SiftClient(array('api_key' => 'YOUR_API_KEY'));

// Sample $create_account event
$properties = array(
  // Required for backfilling
  // UNIX timestamp in milliseconds as an integer
  '$time' => 1456274104243, // Feb 24 2016 00:35:04 UTC

  '$user_id'          => 'billy_jones_301',
  '$user_email'       => 'bill@gmail.com',
  '$name'             => 'Bill Jones',
  '$phone'            => '1-415-555-6040',  
  '$ip'               => '54.208.214.78'
);

$response = $client->track('$create_account', $properties);
import com.siftscience.SiftClient;
import com.siftscience.EventRequest;
import com.siftscience.EventResponse;
import com.siftscience.model.CreateAccountFieldSet;

SiftClient client = new SiftClient("YOUR_API_KEY");
EventRequest request = client.buildRequest(new CreateAccountFieldSet()
        .setUserId("billy_jones_301")
        .setSessionId("gigtleqddo84l8cm15qe4il")
        .setUserEmail("bill@gmail.com")
        .setName("Bill Jones")
        .setIP("54.208.214.78")
        // UNIX timestamp in milliseconds; the L suffix marks a long literal
        .setTime(1456274104243L)); // Feb 24 2016 00:35:04 UTC

EventResponse response;
try {
    response = request.send();
} catch (SiftException e) {
    System.out.println(e.getApiErrorMessage());
    return;
}
response.isOk(); // true

Flag Known Bad Users

After you’ve sent all event data for your users, tell Sift which of these users are known to be bad for the fraud type you’re fighting. Send this information to Sift with the Decisions API. Flagging past bad users helps Sift learn your fraud patterns from the start.

  • To mark known bad users, send Sift a BLOCK decision category for each user. A BLOCK indicates that you’ve taken negative action against the user (e.g., banned them from the site).
  • For users that an analyst has identified as bad, set MANUAL_REVIEW as the Decision Source for your BLOCK Decision.
  • For users that you have not manually verified as bad, but whose accounts you have banned or blocked because of a chargeback, set CHARGEBACK as the Decision Source for your BLOCK Decision.

NOTE:

  • If you did not record why the users on this list are known to be bad (i.e., banned for a chargeback or verified as bad by an analyst), you can backfill all bad Decisions with a source of MANUAL_REVIEW.
  • Send Decisions for all the known fraudulent users that you backfilled. Don't send 'accept' or other positive Decisions for your good users during this backfill; only send Decisions for bad users.
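
The source-selection rules above can be sketched as a small helper. This is an illustration only: `build_block_decision` and its `reason` argument are hypothetical names for data in your own records, and the payload keys mirror the sample Decision event on this page.

```python
def build_block_decision(decision_id, time_ms, reason=None, analyst=None):
    """Build a decision payload for one known bad user.

    `reason` is a hypothetical value from your own records:
    "chargeback", "analyst", or None when the reason was not recorded.
    """
    # Chargeback-driven bans use CHARGEBACK; analyst-verified bans and
    # bans with no recorded reason fall back to MANUAL_REVIEW.
    source = "CHARGEBACK" if reason == "chargeback" else "MANUAL_REVIEW"
    decision = {
        "decision_id": decision_id,
        "source": source,
        "description": "backfill known bad users",
        "time": time_ms,
    }
    # Attach the analyst only for manual reviews, and only if known.
    if source == "MANUAL_REVIEW" and analyst:
        decision["analyst"] = analyst
    return decision
```

For example, `build_block_decision("ban_user_payment_abuse", 1456274104243, reason="chargeback")` yields a payload with a `CHARGEBACK` source and no analyst field.
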
// Sample Decision Event
// Requires that you configure a Decision with this ID first
// Decisions are configured in the Sift Console
// Decisions should be named based on your real business actions
{
  "decision_id"   : "ban_user_payment_abuse",
  "source"        : "MANUAL_REVIEW",
  "analyst"       : "analyst@example.com",
  "description"   : "backfill known bad users",
  "time"          : 1456274104243 // Feb 24 2016 00:35:04 UTC
}
# Sample Decision Event
# Requires that you configure a Decision with this ID first
# Decisions are configured in the Sift Console
# Decisions should be named based on your real business actions

import sift

client = sift.Client(api_key='{apiKey}', account_id='{accountId}')

applyDecisionRequest = {
    'decision_id'   : 'ban_user_payment_abuse',
    'source'        : 'MANUAL_REVIEW',
    'analyst'       : 'analyst@example.com',
    'description'   : 'backfill known bad users',
    'time'          : 1456274104243, # Feb 24 2016 00:35:04 UTC
}

user_id = 'userId'  # ID of the known bad user
response = client.apply_user_decision(user_id, applyDecisionRequest)
# Sample Decision Event
# Requires that you configure a Decision with this ID first
# Decisions are configured in the Sift Console
# Decisions should be named based on your real business actions

require "sift"

client = Sift::Client.new(api_key: "{YOUR_API_KEY}", account_id: "accountId")

response = client.apply_decision({
  "decision_id"       => "ban_user_payment_abuse",
  "description"       => "backfill known bad users",
  "source"            => "MANUAL_REVIEW",
  "analyst"           => "analyst@example.com",
  "user_id"           => "userId",
  "time"              => 1456274104243, # Feb 24 2016 00:35:04 UTC
})

if (!response.ok?)
  puts "Unable to apply decision: " + response.api_error_message
end
// Sample Decision Event
// Requires that you configure a Decision with this ID first
// Decisions are configured in the Sift Console
// Decisions should be named based on your real business actions

require 'sift-php/lib/Services_JSON-1.0.3/JSON.php';
require 'sift-php/lib/SiftRequest.php';
require 'sift-php/lib/SiftResponse.php';
require 'sift-php/lib/SiftClient.php';
require 'sift-php/lib/Sift.php';

$client = new SiftClient(array('api_key' => 'YOUR_API_KEY'));

$options = array(
    'analyst'       => 'analyst@example.com',
    'description'   => 'backfill known bad users',
    'time'          =>  1456274104243
);

$response = $client->applyDecisionToUser('userId',
    'ban_user_payment_abuse',
    'MANUAL_REVIEW',
    $options);
// Sample Decision Event
// Requires that you configure a Decision with this ID first
// Decisions are configured in the Sift Console
// Decisions should be named based on your real business actions

import com.siftscience.SiftClient;
import com.siftscience.ApplyDecisionRequest;
import com.siftscience.ApplyDecisionResponse;
import com.siftscience.model.ApplyDecisionFieldSet;

SiftClient client = new SiftClient("{YOUR_API_KEY}");
ApplyDecisionRequest request = client.buildRequest(
    new ApplyDecisionFieldSet()
        .setAccountId("accountId")
        .setUserId("userId")
        .setDecisionId("ban_user_payment_abuse")
        .setSource(DecisionSource.MANUAL_REVIEW)
        .setDescription("backfill known bad users")
        .setAnalyst("analyst@example.com")
        .setTime(1456274104243L)); // Feb 24 2016 00:35:04 UTC

ApplyDecisionResponse response;
try {
    response = request.send();
} catch (SiftException e) {
    System.out.println(e.getApiErrorMessage());
    return;
}

DecisionLog decisionLog = response.getDecisionLog();

What is Unix Time?

Unix time is the amount of time elapsed since 00:00:00 UTC, Thursday, 1 January 1970. Sift expects it in milliseconds as an integer. There are many Unix timestamp converters available online. The sample code below generates the current time in milliseconds.

echo $(($(gdate +'%s * 1000 + %-N / 1000000')))
import time
time_millis = int(round(time.time() * 1000))
time_millis = (Time.now.to_f * 1000).round
$timeMillis = round(microtime(true) * 1000);
long timeMillis = System.currentTimeMillis();
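
For a backfill you usually need the time of a past event, not the current time. The sketch below converts a timezone-aware datetime to integer milliseconds using only the Python standard library. The `to_unix_millis` helper name is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(dt):
    """Convert a timezone-aware datetime to integer UNIX milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division by a 1 ms timedelta avoids float rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


event_time = datetime(2016, 2, 24, 0, 35, 4, 243000, tzinfo=timezone.utc)
print(to_unix_millis(event_time))  # 1456274104243
```

Rejecting naive datetimes is a deliberate choice here: a timestamp without a timezone would be interpreted differently depending on the machine running the backfill.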