influence.influence

References

[1] Pang Wei Koh and Percy Liang, “Understanding Black-box Predictions via Influence Functions”, ICML 2017
class Influence(workspace, feeder, loss_op_train, loss_op_test, x_placeholder, y_placeholder, test_feed_options=None, train_feed_options=None, trainable_variables=None)

Influence Class

Parameters:
  • workspace (str) – Path for workspace directory
  • feeder (InfluenceFeeder) – Dataset feeder
  • loss_op_train (tf.Operation) – Tensor for the loss function used for training. It may include regularization.
  • loss_op_test (tf.Operation) – Tensor for loss function for inference.
  • x_placeholder (tf.Tensor) – Data placeholder tensor created by tf.placeholder()
  • y_placeholder (tf.Tensor) – Target placeholder tensor created by tf.placeholder()
  • test_feed_options (dict) – Optional parameters to run loss operation in testset
  • train_feed_options (dict) – Optional parameters to run loss operation in trainset
  • trainable_variables (tuple or list) – Trainable variables to be used. If None, all variables are treated as trainable. Default: None
upweighting_influence(*args, **kwargs)
Calculate the influence scores of the given training samples on the test samples.
A negative value indicates a bad effect on the test loss.
Parameters:
  • sess (tf.Session) – Tensorflow session
  • test_indices (list) – Indices of the test samples to be used. The influence on these samples is calculated.
  • test_batch_size (int) – batch size for test samples
  • approx_params (dict) –

    Parameters for the inverse Hessian-vector product approximation. Default:

    {'scale': 1e4, 'damping': 0.01, 'num_repeats': 1, 'recursion_batch_size': 10, 'recursion_depth': 10000}
  • train_indices (list) – Training samples indices to be calculated.
  • num_total_train_example (int) – Number of total training samples used for training, which might be different from the size of train_indices
  • force_refresh (bool) – If False, results are recomputed only when the test samples or parameters have changed. Default: False
Returns:

numpy.ndarray
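
To make the sign convention concrete, here is a minimal NumPy sketch of an upweighting-style influence score on a small ridge regression, following the formulation in [1]. It does not use this class or TensorFlow. The helper names fit_ridge and influence_scores are hypothetical, the Hessian is inverted exactly rather than approximated, and the assumption that the library's scores follow the same scaling is not confirmed by this page. The score is the predicted change in test loss if a training sample were removed, so a negative score marks a sample whose presence hurts the test loss.

```python
import numpy as np

def fit_ridge(X, y, l2, n_total):
    """Minimize (1/n_total) * sum(0.5 * (x.w - y)^2) + 0.5 * l2 * |w|^2.

    Returns the fitted weights and the Hessian of the objective.
    """
    d = X.shape[1]
    hessian = X.T @ X / n_total + l2 * np.eye(d)
    weights = np.linalg.solve(hessian, X.T @ y / n_total)
    return weights, hessian

def influence_scores(X_train, y_train, x_test, y_test, l2=0.01):
    """Hypothetical helper: one score per training sample for a single test sample.

    score_i = grad_test . H^-1 . grad_train_i / n, i.e. the first-order
    estimate of (test loss without sample i) - (test loss with sample i).
    """
    n = X_train.shape[0]
    w, hessian = fit_ridge(X_train, y_train, l2, n)
    # Gradient of the test loss 0.5 * (x_test.w - y_test)^2 with respect to w.
    grad_test = x_test * (x_test @ w - y_test)
    # Exact inverse Hessian-vector product (the Hessian is symmetric).
    inverse_hvp = np.linalg.solve(hessian, grad_test)
    # Per-sample training gradients, shape (n, d); the regularizer is excluded
    # because removing a sample does not remove the regularization term.
    residuals = X_train @ w - y_train
    grad_train = X_train * residuals[:, None]
    return grad_train @ inverse_hvp / n
```

Sorting the returned array in ascending order lists the most harmful training samples first under this convention.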

upweighting_influence_batch(*args, **kwargs)

Iteratively calculate influence scores for training data sampled by the batch sampler. A negative value indicates a bad effect on the test loss.

Parameters:
  • sess (tf.Session) – Tensorflow session
  • test_indices (list) – Indices of the test samples to be used. The influence on these samples is calculated.
  • test_batch_size (int) – batch size for test samples
  • approx_params (dict) –

    Parameters for the inverse Hessian-vector product approximation. Default:

    {'scale': 1e4, 'damping': 0.01, 'num_repeats': 1, 'recursion_batch_size': 10, 'recursion_depth': 10000}
  • train_batch_size (int) – Batch size of training samples
  • train_iterations (int) – Number of iterations
  • subsamples (int) – Number of training samples in a batch to be calculated. If -1, all samples are calculated (no subsampling). Default: -1
  • force_refresh (bool) – If False, results are recomputed only when the test samples or parameters have changed. Default: False
Returns:

numpy.ndarray
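
The approx_params keys accepted by both methods (scale, damping, num_repeats, recursion_depth) match the recursive inverse Hessian-vector product estimate used with the approach in [1]. The sketch below shows one plausible form of that recursion in plain NumPy. The function name inverse_hvp_recursive is hypothetical, and whether the library implements exactly this update is an assumption. For simplicity it uses an explicit Hessian matrix instead of Hessian-vector products on sampled mini-batches, so recursion_batch_size plays no role here and every repeat yields the same estimate.

```python
import numpy as np

def inverse_hvp_recursive(hessian, v, scale=1e4, damping=0.01,
                          recursion_depth=10000, num_repeats=1):
    """Hypothetical sketch: approximate H^-1 v without inverting H.

    Each step applies  e <- v + (1 - damping) * e - (H e) / scale.
    The fixed point, divided by scale, equals
    (H + damping * scale * I)^-1 v, so damping regularizes the inverse.
    The recursion only converges if scale is large enough relative to the
    largest eigenvalue of H.
    """
    v = np.asarray(v, dtype=float)
    estimates = []
    for _ in range(num_repeats):
        cur = v.copy()
        for _ in range(recursion_depth):
            cur = v + (1.0 - damping) * cur - (hessian @ cur) / scale
        estimates.append(cur / scale)
    # With sampled mini-batches, averaging over repeats would reduce variance;
    # with an explicit Hessian all repeats are identical.
    return np.mean(estimates, axis=0)
```

A larger scale makes the recursion more stable but slower to converge, which is why a large recursion_depth accompanies the default scale of 1e4.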