influence.influence
References
[1] Pang Wei Koh and Percy Liang. “Understanding Black-box Predictions via Influence Functions.” ICML 2017.
class Influence(workspace, feeder, loss_op_train, loss_op_test, x_placeholder, y_placeholder, test_feed_options=None, train_feed_options=None, trainable_variables=None)
Influence class.
Parameters: - workspace (str) – Path to the workspace directory
- feeder (InfluenceFeeder) – Dataset feeder
- loss_op_train (tf.Operation) – Tensor for the loss function used for training. It may include regularization.
- loss_op_test (tf.Operation) – Tensor for the loss function used for inference.
- x_placeholder (tf.Tensor) – Data placeholder tensor from tf.placeholder()
- y_placeholder (tf.Tensor) – Target placeholder tensor from tf.placeholder()
- test_feed_options (dict) – Optional parameters for running the loss operation on the test set
- train_feed_options (dict) – Optional parameters for running the loss operation on the training set
- trainable_variables (tuple or list) – Trainable variables to be used. If None, all variables are trainable. Default: None
upweighting_influence(*args, **kwargs)
Calculate the influence scores of the given training samples on the test samples. A negative value indicates a bad effect on the test loss.
Parameters: - sess (tf.Session) – TensorFlow session
- test_indices (list) – Indices of the test samples to be used. Influence on these samples is calculated.
- test_batch_size (int) – Batch size for test samples
- approx_params (dict) – Parameters for the inverse Hessian-vector product approximation. Default: {'scale': 1e4, 'damping': 0.01, 'num_repeats': 1, 'recursion_batch_size': 10, 'recursion_depth': 10000}
- train_indices (list) – Indices of the training samples whose influence is calculated.
- num_total_train_example (int) – Total number of training samples used for training, which may differ from the size of train_indices
- force_refresh (bool) – If False, it recalculates only when the test samples or parameters have changed. Default: False
Returns: numpy.ndarray
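The keys of approx_params suggest a recursive approximation of the inverse Hessian-vector product in the style described in [1]. The following is a minimal sketch of such a recursion, not the library's implementation: the function name inverse_hvp_recursion and the toy Hessian are hypothetical, and a fixed, explicit Hessian matrix stands in for the mini-batch Hessian-vector products that recursion_batch_size and num_repeats would presumably control. Under these assumptions the recursion converges to the damped solve (H + damping * scale * I)^-1 v, provided the eigenvalues of H / scale stay below 2 - damping.

```python
import numpy as np


def inverse_hvp_recursion(hessian, v, scale=1e4, damping=0.01, recursion_depth=10000):
    """Approximate (hessian + damping * scale * I)^-1 @ v by a fixed-point recursion.

    Assumption: an explicit, fixed Hessian replaces the stochastic
    mini-batch Hessian-vector products a real implementation would use.
    """
    estimate = v.copy()
    for _ in range(recursion_depth):
        # h <- v + (1 - damping) * h - (H @ h) / scale
        estimate = v + (1.0 - damping) * estimate - hessian @ estimate / scale
    # Undo the scaling that was applied to the Hessian inside the loop.
    return estimate / scale


# Hypothetical symmetric positive definite Hessian (eigenvalues between 0.7 and 2.5).
hessian = np.array([[2.0, 0.5, 0.0],
                    [0.5, 1.5, 0.3],
                    [0.0, 0.3, 1.0]])
v = np.array([1.0, -2.0, 0.5])

scale, damping = 10.0, 0.01
approx = inverse_hvp_recursion(hessian, v, scale=scale, damping=damping,
                               recursion_depth=2000)
# Reference value from a direct linear solve of the damped system.
exact = np.linalg.solve(hessian + damping * scale * np.eye(3), v)
```

In this sketch a larger scale makes the recursion safer for Hessians with large eigenvalues but slows convergence, which is one reason a large recursion_depth may be needed alongside the default scale of 1e4.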
upweighting_influence_batch(*args, **kwargs)
Iteratively calculate influence scores for training data sampled by the batch sampler. A negative value indicates a bad effect on the test loss.
Parameters: - sess (tf.Session) – TensorFlow session
- test_indices (list) – Indices of the test samples to be used. Influence on these samples is calculated.
- test_batch_size (int) – Batch size for test samples
- approx_params (dict) – Parameters for the inverse Hessian-vector product approximation. Default: {'scale': 1e4, 'damping': 0.01, 'num_repeats': 1, 'recursion_batch_size': 10, 'recursion_depth': 10000}
- train_batch_size (int) – Batch size of training samples
- train_iterations (int) – Number of iterations
- subsamples (int) – Number of training samples in each batch for which influence is calculated. If -1, all samples are calculated (no subsampling). Default: -1
- force_refresh (bool) – If False, it recalculates only when the test samples or parameters have changed. Default: False
Returns: numpy.ndarray
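The sign convention and the batch-wise iteration can be illustrated on a toy ridge regression whose Hessian is small enough to solve exactly. This is a sketch under stated assumptions rather than the library's code: the data, the names (batch_scores, inverse_hvp, l2), the sequential batch order, and the division by the total number of training samples are all hypothetical choices made to match the documented convention that a negative score marks a training sample that hurts the test loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, dim, l2 = 40, 5, 3, 0.1  # hypothetical toy sizes
true_w = np.array([1.0, -2.0, 0.5])
x_train = rng.normal(size=(n_train, dim))
y_train = x_train @ true_w + 0.1 * rng.normal(size=n_train)
x_test = rng.normal(size=(n_test, dim))
y_test = x_test @ true_w

# Per-sample training loss: 0.5 * (x @ w - y)**2 + 0.5 * l2 * ||w||^2,
# so the training loss includes regularization while the test loss does not.
hessian = x_train.T @ x_train / n_train + l2 * np.eye(dim)
w = np.linalg.solve(hessian, x_train.T @ y_train / n_train)

# Gradient of the mean test loss at the trained weights.
test_grad = ((x_test @ w - y_test)[:, None] * x_test).mean(axis=0)
# Exact solve here; a large model would use an approximation instead.
inverse_hvp = np.linalg.solve(hessian, test_grad)


def batch_scores(indices):
    """Score the training samples in `indices`; negative means harmful."""
    residual = x_train[indices] @ w - y_train[indices]
    train_grads = residual[:, None] * x_train[indices] + l2 * w
    # Estimated rise in test loss if the sample were removed from training.
    return train_grads @ inverse_hvp / n_train


train_batch_size, train_iterations = 8, 5
scores = np.concatenate([
    batch_scores(np.arange(i * train_batch_size, (i + 1) * train_batch_size))
    for i in range(train_iterations)
])
```

Because the inverse Hessian-vector product depends only on the test samples, it is computed once and reused for every training batch in this sketch.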